feat/lower squashfs size #211

Open
wants to merge 10 commits into
base: main


@blaggacao blaggacao commented May 10, 2024

  • refactor: bash+jq instead of python closure
  • chore: anticipate expected delay in tests
  • feat: free caches and log free mem before kexec
  • remove some extra weight

Context

  • Some edge environments have less than the required 1.5 GB RAM
  • One can disable file system support, however that alone is not enough
  • The base system also needs to be trimmed down
  • This PR is contributing to trimming the base system down, regardless of the fs support

Results

  • The potential savings were analyzed with, for example:
nix-tree /nix/store/0szqjjq5r98063cvmnhq7yrwb6dmxsvy-nixos-system-nixos-installer-23.11pre-git
  • This change reduces the overall size of the unpacked squashfs store by roughly the sum of the savings noted in the code comments
  • I'm currently testing how much further this can go (removing perl in addition to fs support) and have arrived at around 500MB unpacked versus roughly 1GB initially (ballpark numbers, and probably not yet the end)
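
As a complement to browsing with nix-tree, the total closure size of a store path can be printed non-interactively. The following is a minimal sketch, not part of this PR: the helper names are hypothetical, and it assumes the `nix` CLI with the experimental nix-command feature enabled, whose `path-info --closure-size` output ends each line with the size in bytes.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and not part of this PR.

# Read `nix path-info --closure-size` output on stdin and print the
# size column (assumed to be the last field, in bytes) as MiB.
closure_size_mib() {
  awk '{ printf "%.1f MiB\n", $NF / 1048576 }'
}

# Print the closure size of a store path. Needs the `nix` CLI with the
# experimental nix-command feature enabled.
report_closure() {
  nix path-info --closure-size "$1" | closure_size_mib
}
```

Calling `report_closure` on the system store path before and after a change gives a quick number to compare, while nix-tree remains the better tool for finding which dependency is responsible.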

Unstable Patch Required

From 0c34c3c822d82c60f06783e834a9c602223ee6a8 Mon Sep 17 00:00:00 2001
From: David <[email protected]>
Date: Fri, 10 May 2024 16:57:00 +0200
Subject: [PATCH] fix: systemd build flag combinations

---
 pkgs/os-specific/linux/systemd/default.nix | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pkgs/os-specific/linux/systemd/default.nix b/pkgs/os-specific/linux/systemd/default.nix
index 9cdc5dcd9d44..8f469714db92 100644
--- a/pkgs/os-specific/linux/systemd/default.nix
+++ b/pkgs/os-specific/linux/systemd/default.nix
@@ -562,7 +562,7 @@ stdenv.mkDerivation (finalAttrs: {
     (lib.mesonEnable "zlib" withCompression)
 
     # NSS
-    (lib.mesonEnable "nss-mymachines" withNss)
+    (lib.mesonEnable "nss-mymachines" (withNss && withMachined))
     (lib.mesonEnable "nss-resolve" withNss)
     (lib.mesonBool "nss-myhostname" withNss)
     (lib.mesonBool "nss-systemd" withNss)
@@ -574,7 +574,7 @@ stdenv.mkDerivation (finalAttrs: {
 
     # FIDO2
     (lib.mesonEnable "libfido2" withFido2)
-    (lib.mesonEnable "openssl" withFido2)
+    (lib.mesonEnable "openssl" (withHomed || withFido2 || withSysupdate))
 
     # Password Quality
     (lib.mesonEnable "pwquality" withPasswordQuality)
-- 
2.42.0

@blaggacao blaggacao force-pushed the feat/lower-squashfs-size branch 2 times, most recently from e7b659f to 9bd468e Compare May 10, 2024 16:31
@blaggacao
Author

Green depends on: NixOS/nixpkgs#311675

@@ -74,6 +74,9 @@ if ! "$SCRIPT_DIR/kexec" --load "$SCRIPT_DIR/bzImage" \
exit 1
fi

sync; echo 3 > /proc/sys/vm/drop_caches
echo "current available memory: $(free -h | awk '/^Mem/ {print $7}')"
Member

Doesn't the Linux kernel do this automatically anyway if it needs memory?

Author
@blaggacao blaggacao May 19, 2024

I could very well imagine that yes, although I didn't verify and don't intend to stake that claim either.

The main reason I put this here is for reporting so that (especially during tests) we can see how much effective free memory was available prior to switching the kernel.

Not sure if it is utterly useful otherwise.

Member

Ok. But you are already printing the available memory, which automatically subtracts any memory claimed by page cache or dirty pages.
So there is no need to flush the caches, which might even contain the initrd that we are about to kexec into.

Member

Here we could even print a warning if we are below X MB available memory.

Author
@blaggacao blaggacao May 19, 2024

> Here we could even print a warning if we are below X MB available memory.

Yeah, I had that idea, too, today: outright refuse to proceed because it's potentially not recoverable, if you have no IPMI access.
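
The check discussed here could be sketched roughly as follows. This is an illustration under stated assumptions, not code from this PR: the function names and the 1024 MiB default threshold are hypothetical (the thread only says "X MB"), and the value is read from the MemAvailable field of /proc/meminfo, which is what free reports in its "available" column.

```bash
#!/usr/bin/env bash
# Hypothetical threshold in MiB; the review thread only says "X MB".
MIN_AVAILABLE_MIB="${MIN_AVAILABLE_MIB:-1024}"

# Print MemAvailable in MiB from a meminfo-formatted file
# (defaults to /proc/meminfo, where the value is given in kB).
available_mib() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Log the available memory and return 1 if it is below the threshold.
check_memory() {
  local avail
  avail="$(available_mib "${1:-/proc/meminfo}")"
  echo "current available memory: ${avail} MiB"
  if [ "${avail:-0}" -lt "$MIN_AVAILABLE_MIB" ]; then
    echo "warning: less than ${MIN_AVAILABLE_MIB} MiB available, kexec may fail" >&2
    return 1
  fi
}
```

A caller could either ignore the return value (warning only) or run `check_memory || exit 1` before `kexec -e` to refuse outright. No cache flush is involved, since MemAvailable already accounts for reclaimable page cache.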

Author

> Ok. But you are already printing the available memory, which will automatically subtract any memory claimed by page cache or dirty pages.
> So no need to flush the caches, which might even contain the initrd that we are about to kexec into.

Iirc, I did some A/B testing and saw a small ~30MB difference, but yeah, I agree that it shouldn't be necessary.

I did notice, however, on a slightly tangential note, that at the RAM limit, kexec --load kept working while kexec -e then failed. But I guess that's just due to what stage 1 or the squashfs ultimately loads into RAM.

nix/installer.nix (outdated)
-  restore-network = pkgs.writers.writePython3 "restore-network" { flakeIgnore = [ "E501" ]; }
-    ./restore_routes.py;
+  restore-network = pkgs.writers.writeBash "restore-network" ./restore_routes.sh;
Member

Out of interest, what's the saving from removing Python?

Author

I don't remember exactly, but it was relatively significant; I think it may even have been in the ballpark of 150MB or so. It was, indeed, the lowest-hanging fruit. I could have gone with python minimal, but since I was trying to size this down as much as possible, I thought I'd also save the additional dozens of MB that python minimal would have left us with.

Author
@blaggacao blaggacao May 19, 2024

However, the effect on RAM was rather small (maybe 30-60MB or so?), ostensibly due to somewhat efficient caching and on-demand decompression of the squashfs, with a (much!) smaller impact on the compressed size.

@blaggacao
Author

blaggacao commented May 19, 2024

Closures

Baseline:

      util-linux = prev.util-linux.override {
        nlsSupport = false;
        ncursesSupport = false;
        systemdSupport = false;
        translateManpages = false;
      };

initrd 635.4 MiB (635.4 MiB)

w translateManpages

      util-linux = prev.util-linux.override {
        nlsSupport = false;
        ncursesSupport = false;
        systemdSupport = false;
        # translateManpages = false;
      };

initrd 635.4 MiB (635.4 MiB)

w systemdSupport

      util-linux = prev.util-linux.override {
        nlsSupport = false;
        ncursesSupport = false;
        # systemdSupport = false;
        # translateManpages = false;
      };

initrd 649.33 MiB (649.33 MiB)

Interestingly, bringing back systemd support seems to have a negative effect for reasons other than the pure dependency, given that we are using systemd anyway.

w ncursesSupport

      util-linux = prev.util-linux.override {
        nlsSupport = false;
        # ncursesSupport = false;
        # systemdSupport = false;
        # translateManpages = false;
      };

initrd 639.23 MiB (639.23 MiB)

original

      # util-linux = prev.util-linux.override {
        # nlsSupport = false;
        # ncursesSupport = false;
        # systemdSupport = false;
        # translateManpages = false;
      # };

initrd 651.07 MiB (651.07 MiB)
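
To make the measurements above easier to compare, the difference of each variant against the fully trimmed baseline can be computed directly. This is a small sketch using only the initrd sizes quoted above; the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Initrd sizes in MiB, copied from the measurements above.
baseline=635.4

# Print how many MiB a variant adds on top of the fully trimmed baseline.
delta_mib() {
  awk -v size="$1" -v base="$baseline" 'BEGIN { printf "%.2f\n", size - base }'
}

# Print one labelled row of the comparison.
report() {
  printf '%-22s +%s MiB\n' "$1" "$(delta_mib "$2")"
}

report "w translateManpages" 635.4
report "w systemdSupport"    649.33
report "w ncursesSupport"    639.23
report "original"            651.07
```

By these numbers the full override saves 15.67 MiB over the original util-linux, and translateManpages makes no measurable difference.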

@blaggacao
Author

NixOS/nixpkgs@e5b250b is still on staging and thus CI isn't yet green. It will probably take another short while before we can pull this in with the latest unstable.

@Mic92
Member

Mic92 commented May 19, 2024

We are now down to 1GB: https://github.com/nix-community/nixos-images/pull/218/files

Comment on lines +34 to +35
# save ~12MB by not bundling manpages
coreutils-full = prev.coreutils;
Member

Also question here: Do we actually care about this? We are not shipping manpages afaik and they are hopefully in a separate output?

Author

> they are hopefully in a separate output?

Unfortunately not, at least in nixos-23.11:

  postInstall = optionalString (isCross && !minimal) ''
    rm $out/share/man/man1/*
    cp ${buildPackages.coreutils-full}/share/man/man1/* $out/share/man/man1
  ''
  # du: 8.7 M locale + 0.4 M man pages
  + optionalString minimal ''
    rm -r "$out/share"
  '';

minimal is false for coreutils-full.

@@ -0,0 +1,121 @@
#!/usr/bin/env bash
Member

cc @Lassulus for review of this.
