Skip to content

Conversation

@FransUrbo
Copy link
Contributor

Motivation and Context

A previous commit (e865e78), the local keyword was removed because of bashism.
Removing bashisms are correct, however this caused multiple functions using the same variable to overwrite each other.
So this commit make function variables unique in the (now) global namespace.

The intention was to only allow such file systems that are under the root of the root file system to mount - a nested file system hierarchy. This was so that multiple operating systems should be allowed to co-exist in the same pool.

However, this was broken and caused ALL file systems in the pool with the canmount=noauto to be mounted, because the ${fs} variable was used in many places, and got overwritten.

Second commit is to have good shell scripting practices - variable names should be wrapped in {} and be enclosed by ".

Fixes: #17963

Description

Prevent the previously local variables, now global, from being overwritten by another function by renaming them to something unique.

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@FransUrbo FransUrbo force-pushed the fix_global_variables branch from 5295b17 to 6f50e21 Compare December 1, 2025 18:44
Copy link
Contributor Author

@FransUrbo FransUrbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm perfectly happy to drop this commit.

I can't remember where I saw it, or when, but it's more than ten years ago, that "good shell script practices" was to write variables as ${variable}, not $variable.

Weather that's true or not, it IS annoying to me to see both side by side - PICK ONE!! :D :D.

Copy link
Member

@robn robn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good grief, what a mess. Good work tracking all this down.

One low-key suggestion: I wonder if it would be useful to prefix the "local" variables with an underscore, just to make them easier to see?

I have not tested and don't have this problem, but based on the description and changes I can't obviously see how it could do anything worse, assuming the original logic was fine. Approving on that basis.

Thanks!

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, this script clearly needed some attention. Thanks for the cleanup.

I haven't manually tested this either, but reading through it makes sense and is definitely an improvement.

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Dec 3, 2025
@FransUrbo
Copy link
Contributor Author

Thanx guys, but for complete transparency, I haven't tested it myself!

Now, I could setup a new initrd (which I've done in the past obviously, but it was quite a few years ago :) and do the test on say a Debian GNU/Linux box, but if this is used in many other places - it have at least at some point been used by Debian/kFreeBSD (which is no more unfortunately), Ubuntu and Gentoo and others.. I can't test them all!

However, I do (vaguely!) remember that we had a talk about this being in the ZFS repo itself, instead of letting each dist/os do their own. Our compromise was contrib - "no support, but if you want to use it, here it is". ?

Not meaning we should abandon it completely, but my thinking here is that we do the best "ocular inspection" we can, and if we all think it's ok, then we merge it and we fix any problems users/dists report.. ?

I know for a fact that the ONLY bug that's been reported (or at least the only one I've seen!) about this is that Debian GNU/Linux one, and it was for the testing dist - which is known to be a bit .. "fritzzy" at times :).

@FransUrbo
Copy link
Contributor Author

I'm perfectly happy to test this on at least one system if you want me to, I don't really have much else to do now for the next two weeks :).

One low-key suggestion: I wonder if it would be useful to prefix the "local" variables with an underscore, just to make them easier to see?

I think that would be very sensible thing to do, I'll update the commits.

@FransUrbo FransUrbo force-pushed the fix_global_variables branch from 6f50e21 to b38eec5 Compare December 3, 2025 10:40
@github-actions github-actions bot removed the Status: Accepted Ready to integrate (reviewed, tested) label Dec 3, 2025
@FransUrbo FransUrbo force-pushed the fix_global_variables branch 3 times, most recently from 251022f to 4dc3052 Compare December 3, 2025 11:35
@FransUrbo
Copy link
Contributor Author

I'm starting to wonder if this is actually the problem!

I can't reproduce the problem on Trixie (which, from what I can see, uses the same version as Testing).

@FransUrbo
Copy link
Contributor Author

I've asked the original Debian GNU/Linux submitter for more information, since I can't reproduce the problem (on Trixie, which is almost identical to the version in Testing).

The original bug report doesn't actually state that it is a problem with the initrd, I guess they only assumed it was?

We can hold off on this PR, since it doesn't actually solve the problem. Although, while trying to reproduce it, I noticed another edge case, and some code cleanups that may or may not be appropriate.

Let's wait and see what happens, if this can be reproduced in a better way.

@behlendorf
Copy link
Contributor

Let's wait and see what happens, if this can be reproduced in a better way.

That sounds good. I'd also be okay with applying all of the mechanical cleanup you did in 9873d41 as a starting point. That won't fix the bug, but if you can test it on at least one platform and it passes visual inspection it's pretty low risk.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Dec 4, 2025
@FransUrbo
Copy link
Contributor Author

Yeah, I've tested the new version, with all the commits applied in the PR.

It at least didn't break anything :). I do think it's prudent to apply the PR (eventually), just to make things easier to maintain.
But then we/I need to reword things here and there.

I'll see if I can get a VM installed with Debian GNU/Linux Testing in the same way just to make sure. I'm starting to believe that it's a problem "elsewhere". As in, not in the initrd.

In a previous commit (e865e78), the
`local` keyword was removed because of bashism.

Removing bashisms are correct, however this caused multiple functions
using the same variable to overwrite each other.

So this commit make function variables unique in the (now) global name
space.

Fixes: openzfs#17963
Signed-off-by: Turbo Fredriksson <[email protected]>
It's considered good practice to:
1) Wrap the variable name in `{}`.
   As in `${variable}` instead of `$variable`.
2) Put variables in `"`.

Also some minor error message tuning.

Signed-off-by: Turbo Fredriksson <[email protected]>
This just to make them easier to see.

Signed-off-by: Turbo Fredriksson <[email protected]>
When a pool is degraded, or needs special action, the `zpool import`
(without pool to import) line will report:
```
  pool: rpool
    id: 01234567890123456789
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
   [..]
```
If the import with the pool name fails, it is supposed to try importing
using the pool ID.

However, the script is also getting the `action` line (and probably `scrub:`
if/when that's available):
  pool; The pool can be imported using its name or numeric identifier.;config:;
which causes issues on consequent import attempts.

Cleanup the information by rewriting the `sed` command line.

Signed-off-by: Turbo Fredriksson <[email protected]>
The file `/etc/default/zfs` is already sourced by the `/etc/zfs/zfs-functions`,
so no need to source it again.

Signed-off-by: Turbo Fredriksson <[email protected]>
The `ZFS_POOL_EXCEPTIONS` variable is used in the initrd script to
boot additional OS file systems besides the root file system.

But it wasn't included as an example in the config files.

Signed-off-by: Turbo Fredriksson <[email protected]>
More code standard changes, where if/then is on different lines.
To have it on the same, or on different lines, can be argued, but
we need to pick one, and try not to mix how to do things.

Signed-off-by: Turbo Fredriksson <[email protected]>
There's no real documenation (which should probably be written!),
so instead document the code the best we can on what's going and
with the mounting of file systems to make future updates easier.

Signed-off-by: Turbo Fredriksson <[email protected]>
@FransUrbo FransUrbo force-pushed the fix_global_variables branch 3 times, most recently from b64c07d to 18e102a Compare December 4, 2025 23:05
Signed-off-by: Turbo Fredriksson <[email protected]>
@FransUrbo FransUrbo force-pushed the fix_global_variables branch from 18e102a to 3836534 Compare December 4, 2025 23:30
@FransUrbo
Copy link
Contributor Author

Can't reproduce the original problem on Testing either..

But the latest version of the script works fine, on both Trixie and Testing/Forky/Sid. So there's that.

Now we need to wait for the original bug reporter to come back with more information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Code Review Needed Ready for review and testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

initramfs-scripts: more elegant canmount property handling?

4 participants