Conversation

@ezopezo

ezopezo commented Dec 1, 2025

This adds support for preserving and labeling intermediate stage images in multi-stage builds. In contrast to the --layers flag, --cache-stages preserves only the final image from each named stage (FROM ... AS name), not every instruction layer. This also keeps the final image's layer count unchanged compared to a regular build.

New flags:

  • --cache-stages: preserve intermediate stage images instead of removing them
  • --stage-labels: add metadata labels to intermediate stage images (stage name, base image, build ID, parent stage name). Requires --cache-stages.
  • --build-id-file: write unique build ID (UUID) to file for easier identification and grouping of intermediate images from a single build. Requires --stage-labels.
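The dependency chain between the three flags can be sketched as below. This is only an illustration of the rule stated above, not buildah's actual code; the function name `check_flag_dependencies` and the error strings are assumptions made for the example.

```python
from typing import List, Optional


def check_flag_dependencies(
    cache_stages: bool,
    stage_labels: bool,
    build_id_file: Optional[str],
) -> List[str]:
    """Return error messages for invalid flag combinations (empty list = valid).

    Hypothetical helper for illustration only; not taken from buildah.
    """
    errors = []
    # --stage-labels only makes sense when stage images are preserved.
    if stage_labels and not cache_stages:
        errors.append("--stage-labels requires --cache-stages")
    # The build ID is one of the stage labels, so the file needs --stage-labels.
    if build_id_file and not stage_labels:
        errors.append("--build-id-file requires --stage-labels")
    return errors
```

Under this sketch, the combination `--cache-stages --stage-labels --build-id-file ./file.txt` produces no errors, while `--stage-labels` on its own is rejected.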

The implementation also includes:

  • Detection of transitive alias patterns (stage using another intermediate stage as base)
  • Validation that --stage-labels requires --cache-stages
  • Validation that --build-id-file requires --stage-labels
  • Test coverage (15 tests) and documentation updates
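To make the "transitive alias" case concrete, here is a rough sketch of how a stage that uses another named stage as its base could be spotted from the Containerfile text. It is not the PR's implementation: `stage_parents` and the regular expression are assumptions for illustration, and the sketch ignores ARG substitution, line continuations, and case-insensitive stage-name matching.

```python
import re
from typing import Dict, Optional

# Matches "FROM [--flag=value ...] <base> [AS <name>]", case-insensitively.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(?P<base>\S+)(?:\s+AS\s+(?P<name>\S+))?\s*$",
    re.IGNORECASE,
)


def stage_parents(containerfile: str) -> Dict[str, Optional[str]]:
    """Map each named stage to the earlier named stage it builds on, or None.

    Hypothetical helper for illustration only.
    """
    parents: Dict[str, Optional[str]] = {}
    for line in containerfile.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        base, name = match.group("base"), match.group("name")
        if name is None:
            continue  # unnamed stages are not tracked in this sketch
        # If the base is an already-seen stage name, this stage aliases it.
        parents[name] = base if base in parents else None
    return parents
```

For a Containerfile with `FROM golang:1.22 AS builder` followed by `FROM builder AS tester`, the sketch reports `tester` as having parent stage `builder`, and `builder` as having none.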

What type of PR is this?

/kind feature

What this PR does / why we need it:

General use: This functionality is useful for identification, debugging, and reusing intermediate stage images in multi-stage builds.

Specific need: Identifying the content copied from intermediate stages in multi-stage builds into the final image is a hard requirement for supporting Contextual SBOM - an SBOM that understands the origin of each component.
While intermediate images can be extracted using the --layers option, this approach has several issues for our use case:

  • Intermediate stage images are unlabeled, making it difficult to determine which image corresponds to which build stage - especially when the Containerfile reuses the same pullspec across multiple stages.
  • All instructions from all intermediate stages appear in the cache (visible via buildah images --all), which introduces unnecessary noise for our purposes.
  • rootfs.diff_ids are not squashed in the final stage: with --layers, the final image ends up containing a diff ID for every instruction in the final stage. However, we need the final image to resemble a regular build (without --layers), meaning:
    • it should contain the diff IDs inherited from the base image, and
    • exactly one diff ID representing the squashed final-stage instructions.
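The requirement on the final image's diff IDs can be written as a small check. This is a sketch of the condition described above, assuming the diff ID lists have already been read from each image's config (rootfs.diff_ids); the function name `looks_like_regular_build` is hypothetical.

```python
from typing import List


def looks_like_regular_build(base_diff_ids: List[str], final_diff_ids: List[str]) -> bool:
    """True if final_diff_ids is base_diff_ids plus exactly one extra diff ID.

    Hypothetical helper for illustration only.
    """
    n = len(base_diff_ids)
    # The inherited diff IDs must come first and be unchanged,
    # followed by a single diff ID for the squashed final-stage instructions.
    return len(final_diff_ids) == n + 1 and final_diff_ids[:n] == base_diff_ids
```

An image built with --layers would typically fail this check, because each final-stage instruction contributes its own diff ID.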

Related repositories:

  • konflux (uses mobster for SBOM generation)
  • mobster (implements the contextual SBOM functionality that requires this change)
  • capo (wraps builder content identification functionality for mobster)

Contact person: emravec (Red Hat) / @ezopezo (GitHub)

How to verify it

Run any multi-stage build that has at least one named intermediate stage, passing the new flags. The resulting intermediate images should be preserved and correctly labeled. Example:
buildah build --cache-stages --stage-labels --build-id-file ./file.txt -t test:0.1 .
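After a build like the one above, the build ID file can be sanity-checked. The sketch below assumes the file contains only the UUID text, possibly followed by a newline; the PR description does not spell out the exact file format, and `read_build_id` is a hypothetical helper.

```python
import uuid
from pathlib import Path


def read_build_id(path: str) -> uuid.UUID:
    """Parse the file's contents as a UUID; raises ValueError if it is not one.

    Assumes the file holds just the UUID string (an assumption, see above).
    """
    text = Path(path).read_text(encoding="utf-8").strip()
    return uuid.UUID(text)
```

Calling `read_build_id("./file.txt")` would then give a value that can be compared against the build ID label on the preserved intermediate images.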

Which issue(s) this PR fixes:

Fixes: #6257
Internal Jira: https://issues.redhat.com/browse/ISV-6122

Does this PR introduce a user-facing change?

Add `--cache-stages`, `--stage-labels`, and `--build-id-file` flags for preserving and labeling intermediate stage images in multi-stage builds.

openshift-ci bot added the kind/feature label (categorizes issue or PR as related to a new feature) Dec 1, 2025
@openshift-ci
Contributor

openshift-ci bot commented Dec 1, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ezopezo
Once this PR has been reviewed and has the lgtm label, please assign luap99 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-merge-robot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Dec 1, 2025
@packit-as-a-service

Ephemeral COPR build failed. @containers/packit-build please check.

ezopezo force-pushed the emravec/preserve-intermediate-images branch from 59cd9ae to 6bd3187 on December 1, 2025 14:24
openshift-merge-robot removed the needs-rebase label Dec 1, 2025
ezopezo force-pushed the emravec/preserve-intermediate-images branch 2 times, most recently from b7e81df to 3314051 on December 2, 2025 16:30
@ezopezo
Author

ezopezo commented Dec 2, 2025

/retest

@openshift-ci
Contributor

openshift-ci bot commented Dec 2, 2025

@ezopezo: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ezopezo
Author

ezopezo commented Dec 2, 2025

@nalind can you please take a look and add the ok-to-test label? It seems the tests are most likely failing on timeouts, so I would like to re-run them (or please tell me what I just broke :) ).

@nalind
Member

nalind commented Dec 2, 2025

/ok-to-test

@ezopezo
Author

ezopezo commented Dec 3, 2025

/test

@openshift-ci
Contributor

openshift-ci bot commented Dec 3, 2025

@ezopezo: No presubmit jobs available for containers/buildah@main

In response to this:

/test


This adds support for preserving and labeling intermediate stage images
in multi-stage builds. In contrast to the --layers flag, --cache-stages
preserves only the final image from each named stage (FROM ... AS name),
not every instruction layer. This also keeps the final image's layer count
unchanged compared to a regular build.

New flags:
 - --cache-stages: preserve intermediate stage images instead of removing them
 - --stage-labels: add metadata labels to intermediate stage images (stage name,
   base image, build ID, parent stage name). Requires --cache-stages.
 - --build-id-file: write unique build ID (UUID) to file for easier
   identification and grouping of intermediate images from a single build.
   Requires --stage-labels.

The implementation also includes:
 - Detection of transitive alias patterns (stage using another intermediate
   stage as base)
 - Validation that --stage-labels requires --cache-stages
 - Validation that --build-id-file requires --stage-labels
 - Test coverage (15 tests) and documentation updates

This functionality is useful for debugging, exploring, and reusing
intermediate stage images in multi-stage builds.

Signed-off-by: Erik Mravec <[email protected]>
ezopezo force-pushed the emravec/preserve-intermediate-images branch from 3314051 to 3955b20 on December 3, 2025 09:25
@ezopezo
Author

ezopezo commented Dec 4, 2025

@nalind @mtrmac @TomSweeneyRedHat can you please take a look at this (or suggest appropriate reviewers)? Thanks in advance!

Labels

kind/feature, ok-to-test

Development

Successfully merging this pull request may close these issues.

Buildah supports selective layer (rootfs.diff_ids) squashing with intermediate image retention in multistage builds

3 participants