proposal for OCI v1.1 content source #87

Draft
wants to merge 1 commit into
base: main

@devigned devigned commented Apr 4, 2023

This PR introduces a proposal to support OCI v1.1 as a content source in Warg. I'm opening this in draft status to start the discussion and iterate on feedback.

Todos:

  • Describe the code implementation in the Warg code base
  • Fill in sections on conclusions and test plan

For your viewing convenience: https://github.com/devigned/registry/blob/oci-source/proposals/20230316-oci-content-source.md

Author

devigned commented Apr 4, 2023

/cc @Kylebrown9 @lann @radu-matei @peterhuene

Member

@radu-matei radu-matei left a comment

Thanks for the proposal, @devigned!

I left a few comments, but it's great to see OCI registries as content source!

{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"artifactType": "application/vnd.wasm.component.bundled.v1",
Member

Any particular reason for taking a hard dependency on OCI v1.1?
The alternative would be to only rely on the config.mediaType for now, and not mandate the top-level artifactType, which would make it possible to push to OCI v1.0-compatible registries.

Author

I was hoping someone would bring this up. I don't have a strong opinion. On one hand, I believe we should push for the most portable specification that fits our needs. On the other hand, I've received feedback that image manifests are not extremely specific and relying on the config.mediaType would be less expressive.

Member

Side note: looking at the manifest example again, it is not a valid manifest for either v1 or v1.1. There are two options:

  • either mediaType: application/vnd.oci.artifact.manifest.v1+json and artifactType: application/vnd.wasm.component.bundled.v1 , and a list of blobs — OCI v1.1, not supported everywhere yet, but the better long term solution
    OR
  • setting config.mediaType and a list of layers (effectively ORAS) — which still has a top-level media type of a container image, but is accepted by most registries.
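
To make the two shapes above concrete, here is a rough Python sketch that builds each manifest as a plain dictionary. The media type strings are the ones quoted elsewhere in this thread; the helper names (`descriptor`, `artifact_manifest`, `image_manifest`) and the payload bytes are made up for illustration, and nothing here is the proposal's or any registry client's actual implementation.

```python
import hashlib
import json


def descriptor(media_type: str, payload: bytes) -> dict:
    """Describe a blob by media type, sha256 digest, and size in bytes."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(payload).hexdigest(),
        "size": len(payload),
    }


# Media types quoted in this thread; the payloads passed in are placeholders.
CONFIG_TYPE = "application/vnd.wasm.component.config.v1+json"
LAYER_TYPE = "application/vnd.wasm.content.layer.v1+wasm"
ARTIFACT_TYPE = "application/vnd.wasm.component.bundled.v1"


def artifact_manifest(blobs: list) -> dict:
    """Option 1: top-level artifactType plus a list of blobs."""
    return {
        "mediaType": "application/vnd.oci.artifact.manifest.v1+json",
        "artifactType": ARTIFACT_TYPE,
        "blobs": [descriptor(LAYER_TYPE, blob) for blob in blobs],
    }


def image_manifest(config: bytes, layers: list) -> dict:
    """Option 2: image manifest typed only by config.mediaType, with layers."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": descriptor(CONFIG_TYPE, config),
        "layers": [descriptor(LAYER_TYPE, layer) for layer in layers],
    }


if __name__ == "__main__":
    wasm = b"\x00asm placeholder bytes"
    print(json.dumps(image_manifest(b"{}", [wasm]), indent=2))
```

The second shape carries no top-level `artifactType`; a client would have to read `config.mediaType` to tell a component apart from a container image.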

Author

Perhaps I'm misunderstanding the current work to remove artifact.md (opencontainers/image-spec#999) and the guidance for artifacts in opencontainers/image-spec#1043, but it seems like artifact manifests will not be a thing in v1.1.

I tried to model this on the "Guidelines for Artifact Usage" section of opencontainers/image-spec#1043.

Member

Correct — if artifact manifests are not a thing in 1.1, then option 2 above is the only way to make this work for now (i.e. artifactType will not exist).

@AaronFriel AaronFriel Apr 5, 2023

n.b.: Setting artifactType is supported in OCI 1.0 and registries are out of conformance if they error on unknown fields. You can use both, and you can also set a scratch media type for config.mediaType if it's irrelevant to you.

I think Option 2 makes the most sense, using a scratch media type for config.mediaType. Edit: I see you have defined your own media type for a config blob. That works as well!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's not a significant value add to having an artifactType when there is a dedicated config.mediaType. You'll find some registries have compatibility issues. And a lot of tooling hasn't been updated to read this yet.

"env1": "first",
"env2": "second"
},
"files": [
Member

Does this assume WASI in the runtime environment?

Author

It does. Perhaps, it shouldn't. What do you suggest?

Member

If we are able to reference static files as data sections in the component (see component below), then there is no need for WASI to be required to access files that are part of the component.

Now, short term, we might have to, but I want to make sure this does not mandate that all environments that run components always allow WASI.

{
"mediaType": "application/vnd.wasm.content.layer.v1+wasm",
"digest": "sha256:2e94e0582fb925e89515435513496819dc8f364f2da400059a64d6d1412ca2ad",
"size": 2087464,
Member

Is the assumption here that the bundled component is a statically linked component that includes all composing components and core modules?

This means there is no way to do layer de-duplication when pushing, and the registry would have to store the bytes for the same components for every component that takes a certain component as a dependency.

I'd like to propose a transparent way for clients to split a bundled component at distribution time, and reassemble it when pulling the artifact.
This is described in this proposal — https://hackmd.io/50rfwV6BTJWN8VZBhdAN_g

And a prototype for such a tool can be found here — https://github.com/fermyon/wasm-splice

We could take this approach for nested components, core modules, and data sections.
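
As a rough illustration of why splitting helps with de-duplication, the Python sketch below models a registry as a toy content-addressed store. `BlobStore`, `push_split`, and `pull` are hypothetical names, and the byte strings stand in for real components; this is not how wasm-splice or any registry is implemented, and real reassembly would have to rewrite the component binary rather than just return the parts.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store: identical bytes are kept once, keyed by digest."""

    def __init__(self) -> None:
        self.blobs = {}

    def push(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        # setdefault leaves an existing blob untouched, so a repeat push stores nothing new.
        self.blobs.setdefault(digest, data)
        return digest


def push_split(store: BlobStore, parts: list) -> list:
    """Push each part of a split bundle as its own layer; return the layer digests."""
    return [store.push(part) for part in parts]


def pull(store: BlobStore, layer_digests: list) -> list:
    """Fetch the parts back in manifest order (reassembly itself is out of scope here)."""
    return [store.blobs[digest] for digest in layer_digests]


if __name__ == "__main__":
    store = BlobStore()
    shared = b"shared dependency component"
    app_a = push_split(store, [b"app A root component", shared])
    app_b = push_split(store, [b"app B root component", shared])
    print(len(store.blobs), "blobs stored for", len(app_a) + len(app_b), "layers")
```

If each bundle were pushed as one statically linked blob instead, the shared dependency's bytes would be stored once per bundle.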

## Proposal
### User Stories
#### Story 1 - Publishing a Component
Alex is an engineer working in a large organization which is building applications using Wasm components. Alex would like to share the component that their team has built with others within their company. The company Alex works for has a lot of folks with experience running containers, and they have OCI registries already provisioned to store container images. Alex would like to publish the component their team has built into one of their company's OCI registries.
Member

I am unclear about what tools are supported for publishing and fetching components from the OCI registry. Can Alex and Erin use their familiar commands like oras push and oras pull to achieve this?

#### Story 3 - Publishing a Bundled Component
Alex is an engineer working in a large, security-focused organization which runs a lot of Linux containers in production. The security team at Alex's company requires that Linux containers be signed and provide a software bill of materials. Alex has recently built a new application that targets Wasm instead of being packaged as a Linux container image. In fact, Alex built the application using many Wasm components. Alex and their team have finalized the feature set for their first release, tested the application, and locked the version for all the dependencies. Alex would like to publish this version of their application with all the application dependencies bundled together. Alex would also like to sign the bundled application and include a software bill of materials for the bundled components.

#### Story 4 - Fetching a Component

Is there a story for how these components are eventually run? How are they to be consumed after they are pulled down?

Is there a chance that there are platform-specific bits that are required to run the modules? (I was looking at the image for slight, which seems to need certs installed alongside the module.)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created an issue here: deislabs/containerd-wasm-shims#89

@codefromthecrypt

Summary comment: I think we should be very precise in our language and focus on wasm modules, not wasm components.

Remember, the component model is phase 1 in W3C. This is literally the first phase and by definition isn't stable for implementation.

Meanwhile, compilers build wasm modules, and where WASI is mentioned, it should be about wasip1 specifically at this point in history and implementation.

If this was a naming glitch, I would replace the word "component" with "module" and try to align language with the WebAssembly spec, no later than the 2.0 draft. If this wasn't a glitch, and this is really supposed to be about "component" as in the component model, I feel the cart is sincerely in front of the horse.

My 2p

@tschneidereit
Member

@codefromthecrypt the entire registry architecture is designed for components specifically, not core wasm modules. Components give us a lot of properties that core modules don't have—most importantly a language- and runtime-agnostic way to pass high-level data with Interface Types, shared-nothing linking, and generally full isolation guarantees.

The charter for the BA's SIG-Registries spells this out explicitly, too.

@codefromthecrypt

OK, I suppose maybe there should be an issue like this in the WebAssembly repo, as there's a lack of coherence in core wasm modules, and they certainly will be around a while. Meanwhile, I agree this is the bytecodealliance (not W3C) repository and can certainly decide to only solve for the component model!

```

#### Image Manifest for Signing and SBOMs
The following example illustrates signing a component using Notary V2. Use of Notary V2 could be replaced with any other signing implementation.

I don't see the value of picking a specific signing solution for this spec. Is there something that makes this content need a specific signing solution? If not, I'd cut the section and just say image signing is recommended, or say nothing at all since signing is separate from the content being signed.

Author

I was just using it as an example. It was not intended to imply any favor to one signing solution vs another.

{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"artifactType": "application/vnd.wasm.component.bundled.v1",

There's not a significant value add to having an artifactType when there is a dedicated config.mediaType. You'll find some registries have compatibility issues. And a lot of tooling hasn't been updated to read this yet.

@lukewagner lukewagner left a comment

Nice work writing all this up! A few comments:

Comment on lines +128 to +135
The following is an example of the configuration structure referenced in the preceding image manifest.
```json
{
"mediaType": "application/vnd.wasm.component.config.v1+json",
"architecture": "wasm32",
"os": "wasi"
}
```

I wonder if this configuration file is necessary; it might be redundant:

  • While wasm32 shows up in the compiler target-triple, that is just to configure code generation. Given a .wasm binary, there is no distinction to be made, it has one fixed meaning defined by the spec and whether it contains 32- or 64-bit memories and instructions operating thereupon is just an internal impl detail.
  • I think the way we should think about "os" is: what are the interfaces imported by this .wasm. If you see wasi_snapshot_preview1, well, that's Preview 1. Starting with Preview 2, there isn't one monolithic "WASI" at all; you'll see imports of wasi:http/outgoing-handler or wasi:filesystem/types etc and so your runtime either supports or doesn't support those individual interfaces (and perhaps we want to have some sort of OCI Runtime spec enumerating the set of interface ids guaranteed to be present?). However, there should be no added information saying that the os is wasi.
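
The second point can be sketched as a simple set check: a runtime is compatible with a `.wasm` if it provides every interface the binary imports, with no separate "os" field involved. The function name `missing_interfaces` and the example runtime set below are assumptions for illustration; extracting the import list from a real binary is left out.

```python
def missing_interfaces(imports, supported):
    """Return the imported interface ids the runtime does not provide, sorted."""
    return sorted(set(imports) - set(supported))


if __name__ == "__main__":
    # Interface ids taken from the comment above; the runtime's set is hypothetical.
    component_imports = ["wasi:http/outgoing-handler", "wasi:filesystem/types"]
    runtime_supports = {"wasi:http/outgoing-handler", "wasi:cli/environment"}

    missing = missing_interfaces(component_imports, runtime_supports)
    if missing:
        print("cannot run; missing:", ", ".join(missing))
    else:
        print("all imports satisfied")
```

An "OCI Runtime spec" style guarantee, as floated above, would amount to publishing the `supported` set for a given runtime profile.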

Comment on lines +137 to +138
#### Image Manifest for Interface Components
The following is an example image manifest for a component containing a configuration structure and a layer containing the `my-component-interface.wasm` binary.

FWIW, in our most-recent discussions, we were realizing that there's not really a need to distinguish between "component packages" and "interface packages". Both are represented as components, and an "interface package" is just a component that exports types (that represent Wit interfaces and worlds). But from a registry perspective, I don't have to care: when I see an interface identifier foo:bar/baz, I find the package foo:bar which must resolve to a component, and then I look for an export named baz which must resolve to a type representing a Wit interface and, if either of those aren't the case, foo:bar/baz is not a valid interface identifier. Thus, I don't think we'll actually need a separate artifactType here; we can simply publish "components" and client tooling can do what it needs to.
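
The lookup described above might look roughly like the following Python sketch. It assumes a hypothetical in-memory `packages` mapping from package name to that component's exports, where each export is tagged with a made-up kind string; a real client would inspect the component binary's export section instead.

```python
def resolve_interface(identifier: str, packages: dict) -> tuple:
    """Resolve an id like 'foo:bar/baz' to (package, export), or raise if invalid."""
    package_name, sep, export_name = identifier.partition("/")
    if not sep or not export_name or ":" not in package_name:
        raise ValueError(f"malformed interface identifier: {identifier!r}")

    # Step 1: the package must resolve to a component.
    component_exports = packages.get(package_name)
    if component_exports is None:
        raise LookupError(f"package {package_name!r} not found")

    # Step 2: the named export must be a type representing a Wit interface.
    if component_exports.get(export_name) != "interface":
        raise LookupError(f"{identifier!r} is not a valid interface identifier")

    return package_name, export_name


if __name__ == "__main__":
    # Hypothetical registry contents: export name -> kind.
    packages = {"foo:bar": {"baz": "interface", "run": "func"}}
    print(resolve_interface("foo:bar/baz", packages))
```

Nothing in this lookup needs to know whether `foo:bar` was published as an "interface package" or a "component package", which is the point being made above.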

}
```

#### Image Manifest for Bundled Components

If we take "bundled component" to mean "a component with its component-dependencies replaced with inline components" (as defined above), then this artifact type is not a bundled component -- it's a component combined with environment variable configuration and static assets: that's not a component at all, it's something bigger that contains a component, so I think we should give it a different name. Because this bigger thing is no longer composable the way components are composable, I'd suggest calling it a "wasm app" (or something that indicates that it's the final product that you can deploy), and not including "component" in the name at all (that's an impl detail of the app, and the app could just as well contain a core wasm module as, I believe, it's doing today).

The following is an example of the configuration structure referenced in the preceding image manifest.
```json
{
"mediaType": "application/vnd.wasm.component.config.v1+json",

Expanding on what I was saying above, I don't think we should consider this file "component configuration", but rather some sort of "wasm app" configuration. Once wasi-virt is built and working, I think there shouldn't even need to be a concept of "component configuration": the contents of this configuration file should go into the component: files go into data segments in a component generated by wasi-virt that bundles the original component (so that the new outer component does not import wasi:filesystem/types at all). Similarly, environment variables can be set by having wasi-virt virtualize wasi:cli/environment. The large win from doing it this way is that the output of wasi-virt is still a component that can be further composed and manipulated by downstream component tooling (enabling virtual platform layering). In the meantime, it makes total sense to use this config file as a stopgap (or maybe in perpetuity if folks are wanting to do other app-level configuration stuff, which I've heard); I just want to name it appropriately so that "component" really does mean "component" (and nothing else).
