Skip to content

Conversation

@renner0e
Copy link

No description provided.

The legacy rechunker is basically unmaintained, this is a much safer
option to reduce our (delta) image size.

https://coreos.github.io/rpm-ostree/build-chunked-oci/

https://gitlab.com/fedora/bootc/base-images/-/blob/main/bootc-base-imagectl
We need a newer podman version than ubuntu 24.04 provides,
homebrew is the most elegant solution.

We need the layer annotations for the rpm-ostree rechunker so that
updates are more efficient.

workaround for:
containers/podman#27796
The metadata from a prior built image is used to construct the new
one so that the difference between them is minimized if possible.
Theoretically way more layers possible (I tested with 2000+) but you
can't podman run the image anymore with 500+

I think the limit used to be 255, no longer the case with Kernel 6.15+.

We want as much layers as we can get away with because it makes
rechunking much more efficient.

See: https://github.com/containers/storage/blob/83cf57466529353aced8f1803f2302698e0b5cb7/drivers/overlay/overlay.go#L56-L81
This reverts commit e9dd3f8.

Seemingly random failures I saw on nvidia iamges, let's see if this is
the root cause
@gemini-code-assist
Copy link

Summary of Changes

Hello @renner0e, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the image rechunking mechanism by transitioning from a custom, multi-stage container-based process to using the integrated rpm-ostree bootc-base-imagectl rechunk command. This change streamlines the build system and aligns it with upstream rpm-ostree practices. Additionally, it introduces a crucial system-level fix to prevent boot issues for users migrating from older images that relied on a legacy rechunker and nss-altfiles for group management.

Highlights

  • Rechunking Process Migration: The image rechunking process has been entirely migrated from a custom container-based approach to leveraging rpm-ostree's native bootc-base-imagectl rechunk command, simplifying the build system.
  • Legacy Rechunker Removal: The 'rechunker_image' variable and the complex multi-step rechunking logic in the Justfile have been removed, streamlining the build process significantly.
  • Critical Group Fix for Rebasing: A new 'rechunker-group-fix' script and systemd service have been introduced to address critical group management issues that could arise when users rebase from images built with the legacy rechunker using nss-altfiles, preventing potential system boot failures.
  • Layer Definition Script: A new script '20-layer-definitions.sh' was added to prepare for defining custom layer properties for rpm-ostree rechunking, although its primary functionality is currently commented out.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/reusable-build.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the image rechunking process by transitioning from a custom legacy rechunker to the official bootc-base-imagectl tool. Key changes include removing the custom rechunker image, simplifying the rechunk Justfile recipe to use bootc-base-imagectl, and deleting the load-rechunk recipe. Additionally, a new 20-layer-definitions.sh script was added (though mostly commented out) to prepare filesystem properties for rechunking, and a rechunker-group-fix script with an accompanying systemd service was introduced to resolve issues with group files when rebasing images that previously utilized nss-altfiles with the legacy rechunker. Review feedback points out that the podman image scp command in the Justfile is missing a destination and SUDOIF prefix, criticizes the rechunker-group-fix script for its complex and potentially flawed logic regarding group file manipulation, and notes a superfluous echo "::endgroup::" in the 20-layer-definitions.sh script that should be removed.

ID=$(${PODMAN} inspect --format={{ '{{.Digest}}' }} localhost/"${image_name}":"${tag}")
ID_ROOT=$(sudo ${PODMAN} inspect --format={{ '{{.Digest}}' }} localhost/"${image_name}":"${tag}")
if [[ ! "${PODMAN}" =~ "docker" ]] && [[ -n "$ID" ]] && [[ "$ID" != "$ID_ROOT" ]]; then
${PODMAN} image scp $(whoami)@localhost::localhost/"${image_name}":"${tag}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This podman image scp command has two issues:

  1. It's missing the destination to copy the image to. It should target the rootful store, for example root@localhost::.
  2. It's missing ${SUDOIF}. Copying to the rootful store requires root privileges.

Without these changes, the image won't be available in the rootful context for the next rechunking step.

        ${SUDOIF} ${PODMAN} image scp localhost/"${image_name}":"${tag}" root@localhost::

Comment on lines +8 to +21
for manually_created_user in $(grep -E -e ".*:[1-3][[:digit:]]{3}:.*" "$1") ; do
# `grep` matched on a group with GID [1-3]000 or so
if [ ! -z "$(cut -f4 -d: <<< "${manually_created_user}")" ] ; then
continue
fi
IGNORE_MANUAL="$(cut -f1,3 -d: --output-delimiter="|" <<< "${manually_created_user}")|${IGNORE_MANUAL:-}"
for related_group in $(grep "$(cut -f1 -d: <<< ${manually_created_user})" "$1"); do
# Deduplicates matches for the same group/user
if [ "$(cut -f1 <<< "${related_group}")" == "$(cut -f1 <<< "${manually_created_user}")" ] ; then
continue
fi
IGNORE_MANUAL="$(cut -f1 -d: <<< "${related_group}")|${IGNORE_MANUAL:-}"
done
done

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This block of code is critical for system stability but is very difficult to understand, which increases the risk of bugs. I have a few concerns:

  1. Misleading Variable Name: The variable manually_created_user actually holds a line from the group file. Renaming it to group_line would improve clarity.
  2. Questionable Logic: The condition on line 10 (if [ ! -z "$(cut -f4...)" ]) causes the loop to skip any group that has members. This seems overly restrictive and might miss groups that should be preserved.
  3. Flawed Inner Loop: The inner loop starting on line 14 appears to have flawed logic. It greps for the group name as a substring, which is unreliable, and then explicitly skips exact matches. This means it will only act on partial/substring matches, which is probably not the intended behavior.

Given the high impact of this script, I strongly recommend refactoring this section to be more straightforward and robust. Adding comments to explain each step of the logic would also be very helpful for future maintenance.

#echo "listing $COUNT biggest ones"
#jq -r --arg n "$COUNT" ' sort_by(.size) | reverse | .[0:($n | tonumber)] | .[] | "\(.size)\t\(.path)" ' "$OUTPUT" | numfmt --to=iec --field=1

echo "::endgroup::"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This echo "::endgroup::" is the only executable line in the script and appears to be a leftover from a copy-paste. It will prematurely close a log group in the GitHub Actions output, which can make debugging harder. Since the rest of the script is commented out, this line should be removed.

@renner0e renner0e merged commit 4162926 into main Dec 28, 2025
18 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants