Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: introducing a nerdctl error package #3393

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

apostasie
Copy link
Contributor

IMHO, we do have a problem with errors management in the codebase.

Quite often, they are confusing, hard to decipher.
See for example #3302 or #3197.

This is a hindrance for users, as some of these messages are frankly misleading, or generally not informative enough on what is the actual problem by flashing a symptom instead (example: "tcp connection failure" is a symptom, the actual problem may be that "your DNS server is unreachable").

This is also a PITA for us, as we are quite often left reading the tea leaves in bug reports, squinting at some cryptic error which origin is hard to pinpoint.

And finally, this prevents us from (integration) testing error reporting to the user, as the only thing we could do is duplicating the string in tests if we want to validate the output.

This is not just circumstantial - it is a systemic issue:

  1. most of the time we fmt.Errorf, which makes it impossible for consuming code to test the actual condition with errors.Is.
  2. we do not wrap errors in a very mindful way - "invalid argument" is fine - but where is it coming from? a filestore? cobra argument validation?

Ability to test errors (errors.Is(X)) would allow consuming code:

  • to specialize behavior as certain errors (or errors categories) may be recoverable, or retry-able
  • decide what to tell the user - which is not necessarily the same thing as what the error text says
  • provide localization

And wrapping errors into meaningful categories would still keep things simple enough for consuming code.

While some of my recent PRs have introduced better erroring in places, I think it is time to propose the introduction of a nerdctl "errors (categories)" package, and encourage the use of var errors and errors joining.

This is this PR, which defines a handful of high-level errors that code can leverage to wrap errors.
Maybe there is more of these categories, or maybe we want to massage the ones proposed here. Feedback welcome as usual, and this may of course evolve as we strengthen the codebase.

Note that I have been wondering if we should just leverage containerd/errdefs instead of creating a new subpackage.
I am now leaning towards "no", for a few reasons:

  • it is confusing to return containerd errors, when what we want to express is a nerdctl error condition - we should handle containerd errors
  • containerd has its own set of concepts and expectations, which likely does not match what we need
  • containerd/errdefs is likely missing some of the error categories we need, so, we might end-up with two packages, leading to more confusion
  • we will not have flexibility to change containerd/errdefs as we see fit

This new nerdctl subpackage is NOT meant to collect and hold every and all errors we are going to define. Each package should still define their own errors. These here are meant to provide categories that will wrap the errors introduced by independent packages.

Sorry for the long explanation here, but I thought this required some extended explanation.

Let me know your thoughts.

Cheers.

pkg/errs/errors.go Outdated Show resolved Hide resolved
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to keep errors.Is(err, containerderrdefs.ErrNotFound) functional (including cases where a function in nerdctl/pkg is called by a thirdparty project)

Signed-off-by: apostasie <[email protected]>
@AkihiroSuda AkihiroSuda requested review from dmcgowan and a team September 2, 2024 16:02
@apostasie
Copy link
Contributor Author

We have to keep errors.Is(err, containerderrdefs.ErrNotFound) functional (including cases where a function in nerdctl/pkg is called by a thirdparty project)

Done.

"github.com/containerd/errdefs"
)

var (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of these are already defined in https://github.com/containerd/errdefs/blob/02b65bcc1b50358f7c1cca419c76f538349af784/errors.go#L37

ErrInvalidArgument, ErrFailedPrecondition.

And ErrSystemIsBroken, ErrNetworkCondition, ErrServerIsMisbehaving are similar to ErrInternal and ErrUnavailable.

Can we reuse the definition in containerd/errdefs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can alias them if you think it is helpful, but I think it is a bad idea to just use them directly as outlined above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think it's a bad idea, maybe just wrap it with %w so it works for both nerdctl's and containerd's definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think it's a bad idea, maybe just wrap it with %w so it works for both nerdctl's and containerd's definition.

Just to clarify: containerd returning errors from errdefs are fine - and may of course be handled and wrapped, and should NOT be removed, or ignored - and none of this proposal will "break" existing code testing for errdefs.ErrNotFound.

example:

if errors.Is(containerdErr, errdefs.ErrNotFound) {
    return errors.Join(ErrVolumeRemovalFail, containerdErr)
}

Same applies to any other error - from any third-party lib, or from golang.

@apostasie
Copy link
Contributor Author

Clarifying intent with a clear example (untested, pseudo-code):

Current situation - imagine some nerdctl code package doing:

if err := os.Stat("somefile"); err != nil {
	return fmt.Errorf("something not right with filesystem: %w", err)
}

Returned error can only be tested by consuming code by testing against low-level golang errors (eg: os.ErrNotExists) and said consuming code cannot assess whether it is fatal (broken system) or something survivable without actual knowledge of what is going on inside the module.

Suggestion:

var ErrNotFound = errors.New("resource cannot be found")
var ErrSystemBroken = errors.New("something is wrong with your system")
if err := os.Stat("somefile"); err != nil {
	if errors.Is(err, os.ErrNotExists){
		return errors.Join(errs.ErrNotFound, err)
	}
	return errors.Join(errs.ErrSystemBroken, err)
}

Now, consuming code can test against a small set of meaningful nerdctl errors that do explain what is the issue instead of flashing a low-level symptom.

As to why we should not use the existing errdefs IMHO - see reasons above, but chiefly, errdefs does not have all the errors we need (as obviously some of them do not make sense), so, consuming code would have to be instructed to tests against errdefs in some conditions, and ALSO to tests against our own errs package in other conditions.
This is confusing and not a really good API...

As I said, aliasing would be fine of course (I do not quite care, as long as consuming code can just test against nerdclt/errs/ErrXXX) consistently.

@dmcgowan
Copy link
Member

dmcgowan commented Sep 2, 2024

containerd/errdefs is likely missing some of the error categories we need, so, we might end-up with two packages, leading to more confusion

The categories defined by errdefs are intentionally constrained and based on grpc error handling, best summarized in https://cloud.google.com/apis/design/errors. We choose this for consistency across packages which may have implementations on either side of grpc. While the error handling does not need to exactly map to containerd or be what gets presented to end users, there is value in having a consistent way to represent errors in the code, whether those errors come from internal packages or from containerd. I would say what you are describing is not missing categories, but the ability to attach error useful details around those categories. In errdefs we are also trying to figure out ways to make sure we can attach more useful details to errors and make sure those get transferred to the client, such as ensuring that parse errors and network errors are more informative and actionable to end users. This will likely be done more by defining rich error types or wrapping existing types rather than creating new base types with errors.New.

@apostasie
Copy link
Contributor Author

apostasie commented Sep 2, 2024

Thanks @dmcgowan .

So, coming back to the original problem then:

What we have right now is code that just rewrap system level or third party errors withfmt.Errorf, making it impossible for embedders to actually "test" error conditions, or provide meaningful feedback to users.

What is the suggested, concrete solution for that problem, here in nerdctl, that we could implement now?

(I appreciate that containerd itself may come up with something generic in the future, but it is not there right now I guess?)

@apostasie
Copy link
Contributor Author

Bump

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants