-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
many: deal with incorrect status from osbuild better #827
base: main
Are you sure you want to change the base?
Conversation
This commit changes the progress parser (again) to deal with errors from the osbuild json progress scanner. On errors we will now exit right away and potentially kill osbuild but provide an error message that hints how to workaround the issue. The original code assumed we get transient errors like json decode issues. However it turns out that this is not always the case, some errors from a bufio.Scanner are unrecoverable (like ErrTooBig) and trying again just leads to an endless loop. We can also not "break" wait for the build to finish because that would appear like the progress is broken forever and we would still have to report an error (just much later).
This commit ensures that we do restrict the memory of the bib test conatiner to catch excessive memory usage. This is prompted by a memory leak when dealing with unrecoverable status messages that lead to failures in konflux.
defer func() { | ||
// ensure osbuild is stopped even if we exit early | ||
if cmd.Process != nil { | ||
cmd.Process.Kill() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may need a bit more care, maybe only do it in hte error case (as the happy case does a cmd.Wait already so no killing needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine as is but I would strongly recommend using PR_SET_PDEATHSIG if it's not already in use. That's a kernel-enforced way to lifecycle bind the child to the parent, such that if e.g. the parent dies due to SIGSEGV or whatever, the child gets killed too.
This commit changes the centos-9 test to run with `podman -t` so that we have a test-case that uses the `terminal` progress. This is prompted by: a) we have no integration test currently that uses the terminal progress for the full build b) a konflux failure/memory leak that showed because there the test is run with `-t`
// can also not (in the general case) recover as | ||
// the underlying osbuildStatus.scanner maybe in | ||
// an unrecoverable state (like ErrTooBig). | ||
return fmt.Errorf(`error parsing osbuild status, please eport a bug and try with "--progress=verbose": %w`, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just eporting a typo
s/eport/report/
This commit updates the images library to pull in the fix [0] for the overly long messages from osbuild. This should test the (previously) failing test that runs the centos-9 ISO with an attached terminal. [0] osbuild/images#1200
go.mod: upadate to latest
images
to get PR#1200This commit updates the images library to pull in the fix [0] for
the overly long messages from osbuild. This should test the
(previously) failing test that runs the centos-9 ISO with an
attached terminal.
[0] osbuild/images#1200
test: run the centos-9 test with an attached terminal
This commit changes the centos-9 test to run with
podman -t
sothat we have a test-case that uses the
terminal
progress. Thisis prompted by:
a) we have no integration test currently that uses the terminal
progress for the full build
b) a konflux failure/memory leak that showed because there the
test is run with
-t
This commit ensures that we do restrict the memory of the bib
test conatiner to catch excessive memory usage. This is prompted
by a memory leak when dealing with unrecoverable status messages
that lead to failures in konflux.
This commit changes the progress parser (again) to deal with
errors from the osbuild json progress scanner. On errors we
will now exit right away and potentially kill osbuild but
provide an error message that hints how to workaround the
issue.
The original code assumed we get transient errors like json
decode issues. However it turns out that this is not always
the case, some errors from a bufio.Scanner are unrecoverable
(like ErrTooBig) and trying again just leads to an endless
loop. We can also not "break" wait for the build to finish
because that would appear like the progress is broken
forever and we would still have to report an error (just
much later).