Regarding the more comprehensive component model, there are still a few things I'm not sure are possible (e.g., how to handle component identification, as I mentioned in 0xPolygonMiden/compiler#77 (comment)), but I wanted to think through a more narrow problem: static data segments. I'm assuming we want to achieve the following properties for these:
The above functionality is not something that Miden VM (or Miden assembly) supports out of the box, but we may be able to enable it via a combination of conventions, new instructions, and maybe even new overall concepts. Below, I'll start with the simplest case, and then expand from there.

Single-context executable program

The simplest case is an executable program which does not use any
This approach should be relatively efficient: we can initialize the data at a rate of ~16 bytes per cycle (so, initializing 16KB should take about 1,000 cycles). We also do this only once - and this is really the best we can hope for.

The main question, though, is where to put this static data. One approach is to store it starting at

We could introduce some new syntax to MASM to improve the ergonomics of handling such static data.
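As a sanity check on the arithmetic, using only the throughput figure quoted above (~16 bytes per cycle):

```python
# Back-of-the-envelope check of the initialization cost estimate above.
BYTES_PER_CYCLE = 16          # stated throughput: ~16 bytes per cycle
segment_bytes = 16 * 1024     # a 16KB static data segment

cycles = segment_bytes // BYTES_PER_CYCLE
assert cycles == 1024         # i.e., about 1,000 cycles, paid only once
```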
In the above, the assembler would inject a program right after

Library modules

The above approach works very well for executable files (in fact, I'm not sure we can do much better in terms of efficiency), but it doesn't quite work for library modules. Consider the following module:
There are two challenges with this:
One option is to write the data to memory any time either

One way to improve on this is to make the initializer program a bit more sophisticated and put some metadata into the static memory section. For example, we could say that memory address
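The check-then-initialize convention described above can be modeled in a few lines. The flag address and data layout here are illustrative choices of mine, not what the VM or assembler would actually use:

```python
# Toy model of an idempotent initializer guarded by a metadata flag.
# Address 0 (a hypothetical choice) holds a flag word: nonzero means
# the static segment has already been written in this context.

def make_initializer(segment):
    """Return an initializer that writes `segment` into a context's
    memory (modeled as a dict) at most once."""
    def init(memory):
        if memory.get(0, 0) != 0:          # step 1: check the flag
            return False                   # already initialized; skip
        for offset, word in enumerate(segment):
            memory[1 + offset] = word      # step 2: copy the static data
        memory[0] = 1                      # step 3: set the flag
        return True
    return init

memory = {}
init = make_initializer([10, 20, 30])
assert init(memory) is True    # first call performs the copy
assert init(memory) is False   # later calls see the flag and do nothing
assert memory == {0: 1, 1: 10, 2: 20, 3: 30}
```

The point of the flag is that every exported procedure can unconditionally run the initializer, and only the first one to execute in a given context pays the copy cost.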
This initializer would be run at the beginning of both

The main question here is how we perform the check in step 1 above. There are a couple of options, but I still need to think through their pros and cons.

Multiple execution contexts

The above approach should work well as long as we don't use

One potential way to get around this is to always put the static memory segment into the root context. That is, the initializer would always be invoked via a

Another downside is that "unhashing" segments generically (which we'd need to do if we want to write static segments into the root context) would reduce the efficiency of initialization (e.g., it would probably at least double the number of cycles required to initialize a given amount of memory). So, while this approach is probably acceptable, it's not without its tradeoffs, and I'll try to think about alternative approaches.
---
From 0xPolygonMiden/compiler#120 (comment)
This was just a "guesstimate" based on how much work I think a verifier may be able to do without impacting recursive proof verification too much. Assuming each element contains 4 bytes of data, 4KB means we can initialize memory with about 1K elements. For each element, we'll need 2 additions, 1 multiplication (in the extension field), and probably hashing as well. Overall (and this is a guess at this point), I hope that using some of the specialized instructions we have in the VM we can do all of this in under 5K cycles, which may be on the border of what is acceptable (our overall target for recursive proof verification is ~65K cycles).

One thing to clarify: this 4KB limit is what we can do for the entire program (not for a given component).

I also think that trying to support both approaches (at least from the start) will probably add too much complexity. We can assume that for now we go with the non-deterministic unhashing approach, and we can come back and evaluate this approach later on as an optimization.

One thing we should probably get a better handle on sooner rather than later is how much rodata we think may be needed for a "typical" smart contract. Here, defining what is typical will be tricky - but maybe we can make some assumptions (e.g., that use of static strings will be minimal or absent altogether). I think knowing this will help us prune out some solution approaches.
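For reference, the figures in this estimate fit together as follows. All numbers are the guesses from the comment above, not measurements:

```python
# How the pieces of the guesstimate relate (all figures are guesses
# quoted from the discussion, not benchmarks).
DATA_LIMIT_BYTES = 4 * 1024        # proposed whole-program rodata budget
BYTES_PER_ELEMENT = 4              # assumed data per field element

elements = DATA_LIMIT_BYTES // BYTES_PER_ELEMENT
assert elements == 1024            # ~1K elements to initialize

CYCLE_BUDGET = 5_000               # hoped-for total initialization cost
RECURSION_TARGET = 65_000          # overall recursive-verifier target

# Initialization would consume under ~8% of the recursion budget.
assert CYCLE_BUDGET / RECURSION_TARGET < 0.08
```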
Yep, understood that this is not how most compilers work, and we probably don't need per-function rodata right now. This is, however, where we probably want to end up, because otherwise there will be a natural limit on how complex smart contracts on Miden could get (i.e., a single function which requires lots of rodata would slow down every other function - even if they are not related). So, probably at the HIR and assembly levels, but especially at the VM level, we should go with an implementation which supports per-function rodata.

One thing I'd like to understand is how this is handled in the component model. Wouldn't each component get its own separate rodata? And if so, could we make each function of the account interface its own component?
---
Comments for 0xPolygonMiden/compiler#119 (review)

First, I think the overall assessment in the above note is correct: we don't know yet if we can support the component model fully (or rather, how much work it will take to get there, and whether the tradeoffs make sense). My understanding was that for now we were using the component model primarily as a way of describing the ABI (instead of coming up with our own ABI format), rather than going all the way to assuming full functionality of the component model. But maybe these are more inter-related than I thought?

Second, before getting into the details of supporting the component model within the VM, I think it may be worth aligning on priorities. In my mind, the main thing we need to achieve sooner rather than later is to be able to compile programs written in Rust targeted specifically for the Miden rollup. We, of course, want to support more general things - but these can come later. Concretely, the above means:
This is not to say that we shouldn't work toward supporting the full component model in the VM, but I also don't want to solve a bigger problem than what we currently have to. Onto more specific comments:
This is not technically correct. All contexts are persistent - it is just that there are no instructions available to access them. We could very easily add instructions which, given a context ID, could read or write memory of any of the previously created contexts (or even, using non-determinism, we may be able to write into future contexts as well).
This may require a pretty fundamental change to the VM. Currently, a context is defined solely by its memory space (i.e., the only thing that is different between two contexts is the memory they can access); however, it seems like here a context is defined by its resource table - right?

Another fundamental change is how we authenticate which procedures are allowed to execute in a given context. As you noted, the VM currently does not do any authentication at all. In the context of the Miden rollup, it is the kernel who is responsible for authenticating procedures. Basically, in the current model the VM + the kernel work together to define resources and who is allowed to access them. Or, said another way, the resources live in the kernel memory, and it is the kernel that creates a semblance of a component model for the Miden rollup. In the proposed model, we'd need to shift this responsibility to the VM.

Again, I'm fully onboard that this is the right way to do it long term (and in fact, when I was designing the VM I wanted to do something like this - but ran into some limitations on the ZK side; these limitations may be lifted by the GKR work discussed in #1182), but I also would like to avoid solving a bigger problem than we currently have to.
For what we need now, the only time we need to do a context switch is when a note calls an account interface procedure - everything else could be abstracted away from the compiler. For example, the compiler would not need to emit any
I'd like to understand this point better (and maybe we can discuss it live) because, in the context of note-to-account calls, I think the solution could be relatively simple. But it is possible that I'm missing something.
There are no limitations on how deep in the call stack a call to a kernel procedure can happen. But also, for what we need now, the compiler may not need to emit explicit calls to the kernel at all (though, again, this may be something to discuss).
---
Basically, I think the initial outcome I arrived at was that we could rely on the Wasm tooling to solve the following:
What seems increasingly apparent to me is that we have been trying to make local/isolated decisions about various concerns (e.g. what the function call ABI should look like, how memory is initialized, how modules are linked and dependencies managed, how they are packaged, what is in a package, how it is published, etc.), when in fact a decision on any one of these matters impacts everything else in subtle ways. Finding the right set of primitives could provide solutions (and/or a clear path) for all of these, and likely for things we haven't yet considered.

I keep coming back to the Wasm CM for two reasons: 1.) it provides a unifying framework for virtually all of the topics I mentioned above, and elegant solutions to them; and 2.) the degree of overlap it has with our domain and the set of problems we're working to solve (and will need to solve at some point). To recap the main benefits:
I agree, though I would caveat my agreement on what constitutes "general" here - I'm definitely a fan of hacking things together to make something work when I know replacing it later won't be a huge deal. That said, a lot of what we're talking about here is so foundational that trying to change it later will not be a small effort, whereas (depending on the effort involved) solving it properly now will let us move faster across the board. If we decide to go with a temporary solution, it may not be possible to avoid breaking users of Miden later when we need to replace it with a proper solution - is that going to be an acceptable tradeoff? I guess what I'm getting at is: in the spectrum of tradeoffs between speed and technical debt, we really want to make sure that the "interest" on that debt doesn't drown us.
I think we essentially get this "for free" just by using Wasm and the CM on the frontend of the compiler, to a certain extent. One thing to note though, is that interop between Rust and Miden Assembly is also part of the cross-language compatibility picture, so we can't fully kick that can down the road.
I believe we can probably do this, and in fact this was the "temporary" solution @greenhat and I discussed today in our call. That said, there are still some aspects of this that could become a problem. I'm in particular thinking of how

Supporting context switches between arbitrary contexts is the only obstacle to being able to implement CM resources, which would provide a richer primitive in Miden for such things. There would be additional work related to resource management itself, but in comparison to context switching it is much simpler and more straightforward. With both context switches and resources, we have everything we actually need from the CM; everything else is completely optional from my perspective.
For sure. I do want to be more clear that the "full" CM is really not what we're after; it really is just some of the more basic primitives/semantics. That unlocks everything I think we'd need in practice. But yeah, if we can get away with less, I'm cool with that. I just want to make sure that we're actually getting away with it.

If you'll permit me a metaphor: we're playing a game of Jenga with all of these pieces, trying to find which ones we can remove and put on top of the tower to deal with later, without bringing everything down. Later, when we have to rewrite some part of the system "properly", it's basically pulling that piece back out and trying to put it back in its original place. My worry is that if we get too eager to remove things at this stage of the game, we may find ourselves in a position where we can't pull any more pieces without bringing the whole tower down.

There's no way to know for sure whether that'll happen, so that's why I'm hoping to quantify the actual choices we're making (in terms of actual effort, time to implement/deliver, knock-on effects, etc.). It is also possible we may have to actually do some portion of the work just to be able to do that much; e.g., I don't know how else we can quantify the problem of arbitrary context switches without you (or someone with your knowledge) actually figuring out (at least in theory) what the possibilities are for representing that in the constraint system. From there I think we can work things out more easily, but that's still (to me) the biggest unknown, and it makes evaluating the different paths here hard to do.
I'm going to follow up on these in a separate comment to avoid this one getting too long. I shouldn't be too long, maybe an hour or two.
---
Makes sense, though I think in terms of semantics, the notion that a context is transient is effectively true, even though technically contexts are never cleaned up until after the program terminates. Once a call returns, the context it executed in is inaccessible. The distinction is only important in terms of support - since contexts aren't in practice persistent today, that gives us flexibility in how we implement changes to context behavior. In any case, I think ideally we would keep contexts an internal implementation detail of the VM, corresponding essentially to a CM component "instance", and manage the context switching implicitly based on some semantics we define (e.g.
To be clear, a context would become something more than just an index/identifier for the linear memory in use; but in the CM, there are two bits of "context":
So I think it's probably more accurate to say that there are really two bits of state here with slightly different lifetimes. However, there is flexibility in how this state is managed and what the limits are (maximum resources, borrows, etc.) to accommodate implementors. The only thing that actually matters is that the semantics are enforced (e.g. around borrowing); how it's actually done is completely up to implementors.

Just to elaborate a bit on what constitutes a "resource": in Wasm CM semantics, the raw representation of a resource handle is an "opaque address" in the form of a u32 (though that representation is up to the implementor); that representation is not actually accessible from Wasm, but for the Wasm runtime, it is presumed to be an index into the resource table of the owning component instance. An entry in the resource table consists of the following bits of information: the unique identifier of the resource handle, the type id of the resource, whether the resource handle is owned or borrowed, the number of "lends" (if owned), the number of "borrows" (if borrowed), and the index of the destructor function for the resource (if one is defined). There are two synthetic functions implemented by the runtime, and exposed to the component instance defining the resource
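The entry layout enumerated above could be sketched roughly like this. The field and function names are mine, since the CM deliberately leaves the concrete representation to implementors:

```python
# Illustrative model of a resource table and one runtime-provided
# constructor. Names are hypothetical; only the *fields* mirror the
# list in the text above.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Ownership(Enum):
    OWNED = 0
    BORROWED = 1

@dataclass
class ResourceEntry:
    handle_id: int                    # unique identifier of the handle
    type_id: int                      # type of the resource
    ownership: Ownership              # owned vs. borrowed handle
    lends: int = 0                    # outstanding lends (if owned)
    borrows: int = 0                  # outstanding borrows (if borrowed)
    dtor_index: Optional[int] = None  # destructor index, if defined

table: list[ResourceEntry] = []       # per-instance resource table

def new_resource(type_id: int, dtor: Optional[int] = None) -> int:
    """Model of a runtime-provided constructor: the returned handle is
    just an index into the owning instance's resource table."""
    entry = ResourceEntry(handle_id=len(table), type_id=type_id,
                          ownership=Ownership.OWNED, dtor_index=dtor)
    table.append(entry)
    return entry.handle_id

h = new_resource(type_id=7)
assert table[h].ownership is Ownership.OWNED
```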
My understanding though is that the current authentication is pretty brittle due to the limitations of what we can do with the tools we have, and as a result doesn't allow for future growth to support more subtle composition of notes/accounts - e.g., delegating some action to another note/account/etc. on your behalf using a restricted capability that permits just that action and nothing else. That might not even be something we care about, though. I suppose the relevant question right now is really how brittle the current mechanism is, and whether it will present an issue given the limits on how much control we have over compiled programs. As I seem to recall, the current mechanism assumes that the call stack is of a particular depth (or something to that effect; it's been quite a while since we discussed it, and I can't remember if it was call stack depth, MAST depth, or something else). As I mentioned in my previous comment, confirming those details and whether or not they will be a problem will be useful for weighing our options here.
Makes sense. I agree that if we can punt on resources initially, that would be preferable. Even though resources would give us more options in terms of defining stuff in Rust, I don't think that's nearly as important right now as context switching, to be honest. Having context switches available would at least make resources an option in the future.
If we punt on cross-language compatibility and resources, then agreed, the only other "calls" we have are from notes to accounts, and potentially syscalls (there is no difference between

Having the compiler handle rote mechanical details (such as canonical ABI lowering/lifting) is what compilers are good at, and we should lean on it for those sorts of things. What we want to "abstract away" from it, though, are details which are best left up to the VM (for example, how memory is initialized). If we expose those internals so the compiler can coordinate with the VM on something, then any change to the VM implementation (or vice versa) is a breaking change; whereas if the VM handles the details of, e.g., initializing memory (with a higher-level "public" interface for the compiler), then we can make changes much more freely later on. Some brittleness at this stage is fine, though, so I'm not saying we can't have the compiler and VM know each other's implementation details, but in terms of deciding which things we want to provide "primitives" for, that's the criteria I would use. The main problem, of course, is memory initialization at call boundaries; once we figure out our solution there, we at least have a path forward.
To be clear, I was referring to the fact that if we didn't use the Wasm CM tooling/ABI/etc. at all, we'd find ourselves having to reimplement a bunch of it anyway, simply because Rust as a language assumes a shared-everything model, where all code being executed is in the same address space. This means that Rust might make codegen decisions based on that assumption which would produce unsoundness if those assumptions are violated. One of the things the CM gives us is Rust support for a shared-nothing model when calling across components. To reiterate from earlier, we can restrict our use of the CM to just note scripts/accounts to provide us that useful property, but then manage everything else using the standard Rust dependency model. It restricts us somewhat in what we can express, but we can ship something, which is the critical bit.
Ah ok, this is one of the answers to the questions I had. I could've sworn there was some other hardcoded thing in there, but I must be misremembering. I think the main thing for me that I would like to quantify is:
After discussing everything tomorrow, and reviewing the data for the above (to the extent we can come up with some), it'll be a lot clearer what our options and tradeoffs are. I know I'll certainly feel a lot better about where we land if we have some data backing us up, even if where we land is "not implementing any of this".
---
A quick note on how we can use the advice provider to move large amounts of data across

To set the context: when we execute the

Similarly, when the callee returns, it is also able to use only the top 16 elements of the stack (in fact, the callee must ensure that it doesn't have more than 16 elements on the stack on return). And so here, too, we need to use the advice provider to transfer large amounts of data. The process for doing this consists of the following steps:
To compute the hash of some region of memory, we can rely on the `mem_stream` instruction. This instruction reads 2 words (8 elements) from memory starting at the specified address (the address is in the 12th slot from the top of the stack), and also increments the address by 2 (so that it points to the next 2 words) - all of this is done in 1 cycle. Another instruction we need is `hperm`, which executes a single permutation of the RPO hash function. Here is an example of how these would come together to compute the hash of 16 words starting at some memory address
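Since the MASM example itself isn't reproduced here, the following is a rough simulation of the loop's semantics. The permutation is a placeholder rather than the real RPO function, and the state layout (a 12-element state whose last 8 elements form the rate that `mem_stream` overwrites) is my reading of the description above:

```python
# Toy model of the mem_stream + hperm hashing loop described above.
# All names and the fake permutation are illustrative.

def permute(state):
    # Stand-in for hperm: any fixed permutation of the 12-element state.
    return state[1:] + state[:1]

def hash_memory(memory, addr, num_words):
    """Absorb `num_words` words (4 elements each), two words per step,
    mirroring mem_stream's read-and-advance behavior."""
    state = [0] * 12                          # 12-element hasher state
    for _ in range(num_words // 2):
        two_words = memory[addr] + memory[addr + 1]  # 8 elements read
        state = state[:4] + two_words         # overwrite the rate portion
        state = permute(state)                # hperm: one permutation
        addr += 2                             # mem_stream bumps the pointer
    return state[4:8], addr                   # digest word + final address

# 16 words of data at addresses 100..115 (one word per address).
memory = {a: [a, a + 1, a + 2, a + 3] for a in range(100, 116)}
digest, end = hash_memory(memory, 100, 16)
assert end == 116          # pointer advanced past all 16 words
assert len(digest) == 4    # a single word-sized digest
```

The real loop would run `mem_stream hperm` eight times to cover 16 words, which is where the ~16-bytes-per-cycle throughput figure comes from.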
Though, we probably don't want to drop the pointer at the end, because for the next step we'd use the `adv.insert_mem` instruction. This instruction creates an entry in the advice map using the word at the top of the stack as the key. The memory region is defined by the elements following the key. For example:
After this, we can execute the
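Putting the steps together, here is a toy model of the round trip: the caller publishing a memory region to the advice map under a key word, and the callee copying it back into its own context. The function names and map structure are illustrative, not the VM's actual interface:

```python
# Toy model of moving data across a call boundary via the advice map.

advice_map = {}

def insert_mem(memory, key, start, end):
    """adv.insert_mem-style step: map `key` to the words stored at
    addresses start..end-1 of the caller's memory."""
    advice_map[tuple(key)] = [memory[a] for a in range(start, end)]

def load_from_advice(memory, key, start):
    """Callee-side step: copy the region out of the advice map into
    the callee's own (separate) memory."""
    for i, word in enumerate(advice_map[tuple(key)]):
        memory[start + i] = word

caller_mem = {a: [a, 0, 0, 0] for a in range(100, 104)}
key = [1, 2, 3, 4]                 # in practice, the region's hash
insert_mem(caller_mem, key, 100, 104)

callee_mem = {}                    # a fresh context's memory
load_from_advice(callee_mem, key, 0)
assert callee_mem[0] == [100, 0, 0, 0]
```

Using the region's hash as the key is what lets the callee verify it received exactly the data the caller committed to.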
---
This document describes an initial draft of a plan for bringing the key features of the WebAssembly (Wasm) Component Model to the Miden toolchain, and by doing so, solving a number of complex problems that we are either currently facing or have not even considered yet. Current plans (regardless of this proposal) are for the Rust toolchain, currently in the initial stages of development, as well as the Wasm frontend of the compiler, to produce/consume the "real" Wasm component model, for a variety of reasons. However, the backend of the compiler, and the rest of the Miden toolchain, would, if this proposal is accepted (in whole or in part), adopt a Miden component model. This would be based on the Wasm component model, but modified to fit the needs and unique constraints of the Miden VM.
NOTE: This is more of an umbrella proposal; it's not all-or-nothing, and it may be that some elements of this aren't a good fit while others are. I'd like to keep the overall ideas intact, and am less concerned about specific implementation details, but definitely raise any questions/issues you have. I'm only passingly familiar with some of the VM internals, and even though I've tried to at least ensure the various parts of this proposal are in theory possible to implement, there are probably many things I haven't thought of, and some which I may have considered but missed something key about when digging around.
Lastly, the proposal here is intended to be implemented in phases, and without causing any significant disruption to our roadmap. My intent would be to have the compiler team take on the bulk of the work, since the majority of it would be in the compiler and assembler, but a few of the proposed changes would require working closely with the VM team to implement.
Background
This proposal came about over the last week as Denys and I have been working through the details of what is currently called the "Miden SDK", i.e. the Rust frontend for Miden. This SDK consists of two parts:
Compiling a Rust program to Miden Assembly involves both `rustc` and `midenc`, by first generating a Wasm module via `rustc`, and then compiling that to Miden Assembly via `midenc`.

That all sounds relatively straightforward, but there are a number of rather sticky challenges hiding here:
- …`call` instruction into the caller. We have to come up with clever ways to protect against that sort of thing ourselves with proc macros and such, which quickly becomes a mess.
- …`extern` declarations and such, but this is easy to get wrong, especially for end users, and the only reasonable solution is, again, proc macros.
- …`i32` or `i64`. This prevents us from being able to handle calling conventions for these modules in `midenc`, and also results in worse code generation, because we are unable to elide code required for type assertions in many cases.
- …`midenc`. Hand-written MASM functions currently use an ad-hoc ABI, at best, and other compiler projects are likely to use their own in lieu of something canonical. This makes interop painful, if not outright impossible in some cases.
- …`A-Za-z0-9_`, and there are plenty of languages that use kebab-case symbols, or allow symbols like `!` or `?`. This is obviously fixable, but is an example of an area where assumptions were made in MASM that really only hold for hand-written MASM. Aside from removing the limit on length, we could allow a much broader set of characters in symbol names by using quoted identifiers for those that don't adhere to the limited set used by hand-written MASM, e.g. `exec."some-module::is-valid?"`. But that's a separate discussion.

There are likely a few other things that I'm forgetting to mention at the moment, but I think in general the problems as I see them can be categorized in three major ways:
Towards the end of the week last week, Denys and I started investigating how we might solve some of the frontend-specific issues, as well as some of the future roadmap issues, via the Wasm Component Model. It quickly became apparent that not only would it address some of our immediate blockers, but could also be applied to a number of these other issues.
Component Model?
I'm sure by now, you've either checked out the link I dropped at the top, or been going crazy waiting for me to explain what the hell a component model is, and what it has to do with anything I've been talking about so far. This section is for you! Hopefully this will provide enough context to orient you, without drowning you in details, but I'm happy to answer questions directly too.
NOTE: I'm going to describe this in terms of WebAssembly, because I want to avoid conflating the current state of Miden Assembly (and the VM) with the terminology here. However, I will describe how all of this maps to Miden Assembly in later sections, so consider this section as providing essential context: what exactly is the Component Model, what problems it solves, etc. Our own implementation will likely differ in various ways, but by understanding how the Component Model was introduced to the WebAssembly ecosystem, it will better illustrate how we do so for Miden.
WebAssembly Refresher
WebAssembly (wasm32) is a Harvard-architecture machine (i.e., like Miden, code and data are in separate address spaces, unlike a von Neumann architecture in which they share the same address space). Also like Miden, it is a stack machine with a relatively tiny instruction set. It is a 32-bit machine (though a wasm64 variant is in the works), but with a native 64-bit integer type. Memory is byte-addressable, and represented as a flat, linear address space limited to 4GB. It was originally designed for use in web browsers (hence the name), but has since evolved for use cases outside the web. It is used for edge computing, sandboxing, as an orchestration runtime (e.g. Kubernetes can run Wasm programs natively as if they were containers), and has seen use in blockchain contexts as well. For example, NEAR uses Wasm to run its smart contracts, and has even been exploring the Component Model to take advantage of the benefits I'll be talking about.
NOTE: Below, I'll refer to "core" WebAssembly, which is to say the parts of WebAssembly specified in the MVP. The Component Model proposal extends core Wasm in various ways.
Wasm, aside from the abstract machine itself, also specifies the concept of a module, which can be expressed in text form (`.wat`), or in binary form (`.wasm`). A module contains the following items:

- …(a `funcref` table, which holds typed function references, is used by the `call.indirect` instruction - a function "pointer" is simply an index into a `funcref` table, and if the index is invalid, or the call doesn't match the signature of the `funcref`, execution traps)
- …`main`, but is instead used for things like global constructors which must be run before a program starts.

So, when you compile a Rust program to Wasm, what you get is a Wasm core module as described above, in binary form.
The Component Model
Unfortunately, core Wasm only has a very limited set of types to express things with (integers, floats, and opaque reference types like `funcref`). When compiling a high-level language to Wasm, you take high-level types like records (structs), strings, etc., and translate their representation into this limited set of primitive types in a language-specific way, losing the high-level type information in the process.

In order to interoperate between languages, both sides must agree on this low-level representation, and this contract is called an application binary interface, or ABI. For example, a Rust string and a JavaScript string will likely have completely different representations, so to allow JS code to call a function implemented in Rust, with a JS string as argument, either the Rust code must know the precise details of the JS string layout; the JS code must convert to the low-level representation expected by Rust; or they must agree on some common representation.
This is obviously not ideal. The Rust-compiled module must explicitly export its functionality using some lowest-common-denominator representation in order to maximize portability; or, alternatively, commit to a less portable but more optimal representation that is only supported by specific languages/compilers. Not only that, but in both cases this is a really fragile arrangement - changes (intentional or not) to the low-level representation could break things in unexpected ways, with no way to know that the contract has been broken.
One of the key goals of the Component Model is to solve this problem. It does so by first expressing type definitions in an interface description language (IDL) called WebAssembly Interface Types, or WIT. A component author describes the interfaces they wish to make public using WIT, using rich types (including things like strings and structs). Both the component author and the downstream users of the component can generate bindings from the WIT description. The specification which describes how the rich WIT types are translated to core Wasm primitive types is called the Canonical ABI. This ABI defines not only the layout of the WIT types in terms of core Wasm types, but also the calling convention for component functions. Thus, a Rust-compiled module can provide a Rust-native API, express it in terms of WIT, and then generate Canonical ABI bindings for it that can be called from any language that also compiles to the Component Model.
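To make the lowering concrete with one example type: under the Canonical ABI, a string crosses the component boundary as a (pointer, length) pair into linear memory. Below is a toy model of that shape, with a bump allocator standing in for the real reallocation machinery:

```python
# Toy model of Canonical ABI string lowering/lifting: a rich `string`
# becomes a (ptr, len) pair of core-Wasm integers. The allocator here
# is a deliberately naive bump allocator, not the real mechanism.

memory = bytearray(1024)   # stand-in for a component's linear memory
heap_top = 0

def lower_string(s: str) -> tuple[int, int]:
    """Copy a UTF-8 string into linear memory; return (ptr, len)."""
    global heap_top
    data = s.encode("utf-8")
    ptr = heap_top
    memory[ptr:ptr + len(data)] = data
    heap_top += len(data)
    return ptr, len(data)

def lift_string(ptr: int, length: int) -> str:
    """Reconstruct the rich type from its core representation."""
    return memory[ptr:ptr + length].decode("utf-8")

ptr, length = lower_string("hello")
assert lift_string(ptr, length) == "hello"
```

The value of the Canonical ABI is that both sides of a cross-language call agree on this encoding without either needing to know the other's native string layout.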
Fundamentally, a component in the Component Model is a wrapper around a core module, that specifies its imports and exports in terms of WIT interfaces. Components are linked and "instantiated" at runtime. These component instances are a bit like OS processes spawned from an executable; the executable is stateless, and simply describes the contents of the program. Once spawned however, the process is stateful, it has specific resources allocated to it, and adheres to the normal process lifecycle. Component instances are very similar - the host runtime acts a bit like an OS kernel, and instantiating a component is equivalent to spawning a process; the host runtime will allocate whatever resources are needed to run the component, and then orchestrate its lifecycle, along with all of the other component instances which are active. Inter-component calls involve context switches between the instance of the caller and the instance of the callee, while intra-component calls all share the same context.
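The process analogy above can be sketched with a deliberately tiny model: stateless components, stateful instances with private memory, and context switches only on inter-instance calls (all names here are illustrative):

```python
# Toy model of component instantiation and shared-nothing calls.

class Component:                  # the "executable": stateless blueprint
    def __init__(self, exports):
        self.exports = exports    # name -> function(instance, *args)

class Instance:                   # the "process": stateful
    def __init__(self, component):
        self.component = component
        self.memory = {}          # private, fully encapsulated memory

context_switches = 0

def call(caller, callee, name, *args):
    """Inter-instance calls switch context; intra-instance calls don't."""
    global context_switches
    if callee is not caller:
        context_switches += 1
    return callee.component.exports[name](callee, *args)

store = Component({
    "put": lambda inst, k, v: inst.memory.__setitem__(k, v),
    "get": lambda inst, k: inst.memory[k],
})
a, b = Instance(store), Instance(store)   # two instances, one component

call(a, b, "put", "x", 42)        # a calls into b: context switch
assert call(a, b, "get", "x") == 42
assert a.memory == {}             # shared-nothing: a's memory untouched
assert context_switches == 2
```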
WIT interfaces are the means by which components can be easily reused across languages. You can call a Go function directly from Rust, and vice versa, without either side having to know anything about what language the component was written in. All that is required is the component interface, expressed in WIT.
Furthermore, components are designed to be linked together dynamically. If a component imports the `wasi:http/handler` interface, any component that implements that interface can fulfill that dependency, providing an incredible degree of modularity at runtime. This can be done ahead-of-time as well, as multiple components can be linked together into a larger component with fewer dependencies, and shipped as a unit.

A key design goal of the Component Model is to build on Wasm's sandboxing model, which is much like Miden's. Components are designed for a shared-nothing architecture - component instances fully encapsulate their linear memories, tables, globals, and other resources. Component interfaces only allow value types (no passing arguments by reference, as there is no pointer type), opaque typed handles (i.e. resources), and immutable uninstantiated modules/components. While handles and imports can be used as an indirect form of sharing, the dependency use cases documentation describes how this sharing can be finely controlled. Components are even more restrictive than core Wasm modules in some ways; for example, components may not export Wasm memory. This not only provides stronger sandboxing guarantees, but ensures that different components with different assumptions about memory can interoperate without issue.
One such example of how sharing can be precisely controlled is link-time virtualization. In short, this allows an instance of a component to be created, at link time, which replaces one or more interfaces imported by its children with a "virtualized" instance of that interface. This does two things: 1.) it prevents the children from having direct access to the "real" interface, and 2.) it allows the parent to intercept calls to the interface and modify its behavior. This can be used for sandboxing, instrumentation, and similar use cases.
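The idea can be sketched with a Rust trait standing in for a WIT interface (all names here are hypothetical): the child only ever sees the interface its parent linked in, so the parent can substitute a virtualized implementation that intercepts and instruments every call:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// The imported interface, analogous to e.g. wasi:http/handler.
trait Handler {
    fn handle(&self, request: &str) -> String;
}

/// The "real" implementation of the interface.
struct RealHandler;
impl Handler for RealHandler {
    fn handle(&self, request: &str) -> String {
        format!("real response to {request}")
    }
}

/// A virtualized wrapper the parent substitutes at link time: the child
/// never gets direct access to the real interface, and the parent can
/// observe (or alter) every call that passes through.
struct Virtualized {
    inner: Rc<dyn Handler>,
    log: RefCell<Vec<String>>, // instrumentation added by the parent
}
impl Handler for Virtualized {
    fn handle(&self, request: &str) -> String {
        self.log.borrow_mut().push(request.to_string());
        self.inner.handle(request)
    }
}

/// The child component only knows the interface, not who implements it.
fn child(handler: &dyn Handler) -> String {
    handler.handle("GET /")
}

fn main() {
    let virtualized = Virtualized {
        inner: Rc::new(RealHandler),
        log: RefCell::new(Vec::new()),
    };
    let response = child(&virtualized);
    assert_eq!(response, "real response to GET /");
    // The parent observed the child's call without the child knowing.
    assert_eq!(virtualized.log.borrow().len(), 1);
    assert_eq!(virtualized.log.borrow()[0], "GET /");
    println!("ok");
}
```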
All of this is described in the binary form of a component module, and allows a host runtime to analyze the component graph and reason about its properties and behavior before running the component.
Goals
So, to recap, the following are the primary goals of the Component Model which are relevant to Miden:
Non-Goals
To make sure it's clear what the Component Model does not aim to solve on its own:
Proposed Changes
So we've spent a bunch of time on setup; now let's look at some concrete details about what this proposal entails! The following are the major bullet points:

- … `call`'d vs `exec`'d based on component interfaces
- … the `call` boundary
- … (`use`), at least initially. What would be new is that MASM libraries/programs would be required to describe their public interface using WIT. To produce a final binary for the VM, the assembler would perform a linking step over the set of components provided and, if successful, would produce the final MAST, as well as a component manifest that describes the assembled component tree, along with the MAST roots associated with each interface. In binary form, this would be suitable for deployment as well as for introspection by analysis tools.
- When the VM executes a `call` instruction, the hash of the caller would be stored immutably in the instance state, in order to support the `caller` instruction from code within that instance.
- When the VM executes an `exec` instruction, it would handle context switches if the callee is in a different component instance than the caller. If the callee instance does not yet exist, it would be instantiated at this time.
- When the VM executes a `call` instruction, it will always create a new component instance for the callee, and set the MAST root of the caller as the `caller` for that instance. When a `call` returns, the instance would be scheduled for destruction on another thread (to avoid the main thread of execution having to incur the overhead of freeing the resources associated with the instance).

Differences between Miden and Wasm Components
Rather than fully specify the Miden component model here, I think it is easier to specify the ways in which I expect it to differ from the Wasm model:

- … `(import "mast:0x..." ..)`. We can then special-case imports from that namespace in our linker implementation.
- … `midenc`, and some of it is not yet specified anywhere (spilling via the advice provider).
- … `call` vs `exec` interfaces in WIT. A simple approach we're exploring in the short term is using marker attributes in doc comments in the WIT file itself - these are preserved in the binary encoding of WIT, and can be used to drive custom behavior in code generators as well as in other tools such as `midenc`.

Benefits
In addition to solving a number of the problems I've laid out earlier in this document, I believe this proposal also has additional benefits that are worth considering/keeping in mind:
Drawbacks
I think any good proposal needs to test its mettle by explaining why it might be a bad idea, so here goes:
I hope you found this useful, and at the very least thought-provoking! I'd really appreciate your feedback, and in particular I'd like to hear from the assembler/VM team with your initial thoughts. I'm available to discuss anytime!