Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: include nsenter when building util-linux binaries #390

Conversation

the-wondersmith
Copy link
Contributor

@the-wondersmith the-wondersmith commented May 21, 2024

PR just includes nsenter in the subset of built binaries when compiling util-linux. Specifically, having nsenter allows several common tools written in golang to run on Talos.

@smira
Copy link
Member

smira commented May 21, 2024

This seems to be part of linkerd story, what exactly calls nsenter in this case?

@the-wondersmith
Copy link
Contributor Author

the-wondersmith commented May 21, 2024

This seems to be part of linkerd story, what exactly calls nsenter in this case?

linkerd-proxy-init ref. It's not strictly a linkerd-only issue though. It actually stems from a golang quirk, which in turn affects anything that would do cni-similar things on talos.

@smira
Copy link
Member

smira commented May 22, 2024

That is not true for sure, as Talos itself is using CNI and netns without any issues (not inside Talos Linux, but talosctl cluster create in QEMU mode). So calling out to nsenter/iptables is more like solution from 5 years ago, right now goroutine can easily enter netns, do nftables calls as needed (no iptables!), and so on.

This PR will put nsenter to /usr/local/bin, not /usr/bin, so not sure if Linkerd would be able to call it this way.

@the-wondersmith
Copy link
Contributor Author

...

This PR will put nsenter to /usr/local/bin, not /usr/bin, so not sure if Linkerd would be able to call it this way

Is /use/bin not in PATH by default? Assuming it is, it shouldn't be an issue 🙂

@smira
Copy link
Member

smira commented May 22, 2024

I don't know what is the $PATH in that case.

@the-wondersmith
Copy link
Contributor Author

I don't know what is the $PATH in that case.

Far easier to do PATH resolution than to shadow box the go runtime 😅. Are there any objections to merging the change?

@smira
Copy link
Member

smira commented May 22, 2024

Are there any objections to merging the change?

Can you actually get it tested?

@the-wondersmith
Copy link
Contributor Author

...

Can you actually get it tested?

Happy to 🙂

Given that the change only enables one extra output from a build process that is itself already proven and tested, could you please define what additional checks you'd like to see to constitute "tested" in this instance?

@smira
Copy link
Member

smira commented May 22, 2024

Given that the change only enables one extra output from a build process that is itself already proven and tested, could you please define what additional checks you'd like to see to constitute "tested" in this instance?

I mean build & install this extension to Talos, and verify that Linkerd works now.

@the-wondersmith
Copy link
Contributor Author

I mean build & install this extension to Talos, and verify that Linkerd works now.

On it 🫡

...
Can you actually get it tested?

FWIW, here's the diff from dive in visual form:

Current Build Output

00-default-build

Proposed Build

01-proposed-build


For anyone who's curious why the current image appears almost completely empty despite the inclusion of --enable-libmount --enable-libblkid and --enable-fstrim in the current manifest, it's because the install step runs rm -rf /rootfs/usr/local/{include,lib,share} after it completes. Here's what the images look like without those removals.

Current Build Output (without post-install scrubbing)

02-default-without-removal

Proposed Build Output (without post-install scrubbing)

03-proposed-without-removal

@the-wondersmith
Copy link
Contributor Author

@smira Well... that was 8 shades of Less Than Fun™, but I got it done in the end 😅 🎉

Steps:

  1. create a cluster normally (as per the docs): talosctl cluster create
  2. deploy linkerd
  3. immediately get the expected error:

cni-error

  1. destroy ^ that cluster
  2. create new cluster w/ customized images: talosctl cluster create --image='docker.io/thewondersmith/talos:v1.7.2-nsenter' --install-image='docker.io/thewondersmith/talos-installer:v1.7.2-nsenter'
  3. deploy linkerd
  4. success (as expected)

success


All of the images used in testing are public and available for your inspection and independent verification if you'd like:

@smira
Copy link
Member

smira commented May 23, 2024

For anyone who's curious why the current image appears almost completely empty despite the inclusion of --enable-libmount --enable-libblkid and --enable-fstrim in the current manifest, it's because the install step runs rm -rf /rootfs/usr/local/{include,lib,share} after it completes. Here's what the images look like without those removals.

This is expected, as it uses those libraries from the host (they're already there), to make extension size smaller.

@smira
Copy link
Member

smira commented May 23, 2024

success (as expected)

thanks for testing it, I can get this change to the queue of changes to Talos 1.7, so it pops up in 1.7.3

@smira smira self-assigned this May 23, 2024
@smira smira force-pushed the feat-tools-util-linux-include-nsenter branch from 3c95b09 to 9cb7167 Compare May 23, 2024 10:39
@smira
Copy link
Member

smira commented May 23, 2024

/ok-to-test

This should provide enough compatibility for Linkerd CNI.

Signed-off-by: Mark S <[email protected]>
Signed-off-by: Andrey Smirnov <[email protected]>
@smira smira force-pushed the feat-tools-util-linux-include-nsenter branch from 9cb7167 to d1a0ce8 Compare May 23, 2024 11:01
@smira
Copy link
Member

smira commented May 23, 2024

/m

@talos-bot talos-bot merged commit d1a0ce8 into siderolabs:main May 23, 2024
14 checks passed
@smira smira mentioned this pull request May 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Backported
Development

Successfully merging this pull request may close these issues.

4 participants