Bindings render pipeline #2333

bendk · 2024-11-27T20:56:08Z

Reviewers: I like these changes a lot, but I want to explore one more approach for this. In the new approach, the PythonIr and template code will stay essentially the same, but how we handle the pipeline that transforms the metadata into that PythonIr will be much different. If you want to review, you probably want to just focus on the Python changes, skipping convert_ir.rs.

Rework bindings rendering to work more like a compiler pipeline:

Start with the metadata
Transform the metadata into an intermediate representation. This stage currently goes through the ComponentInterface, but if we decide to adopt this approach then we can rework it to be a more straightforward transformation.
Transform the IR into a language-specific IR. Walk the tree of IR nodes using the VisitMut trait and update the names, format the docstring, add language-specific data, etc.
Render the language-specific IR into code.

The first commit documents this in some more detail.

One nice feature of this system is that you can inspect the data at any stage of the pipeline, run it through diffs, etc. I think this will really help both new and experienced devs to understand what's going on during bindings generation. I documented my plans for this tool, but haven't implemented it yet.

There's a ton of changes here, but I think it's worth it.

bendk · 2024-11-27T20:57:59Z

@lougeniaC64 flagging you for review since you said you'd take a look at the new docs. Note that these docs are for a proposed new system. I'd love your take on both the docs themselves and if the overall process makes sense.

bendk · 2024-11-27T21:01:28Z

uniffi_bindgen/src/bindings/python/gen_python/visit_mut.rs

+    pub protocols: Vec<Protocol>,
+    pub runtimes: Runtimes,
+    pub exports: Vec<String>,
+}


This module essentially replaces the CodeOracle system (as well as a couple others). Is it better to have all of the code in one place or try to split it out over multiple modules? I think I like it all in one place, but I could be convinced to try to split it up.

One thing I definitely like over the old system is that there's a lot less indirection involved: BindingsIrVisitor sets data, then the filters read it and that's basically the whole story.

bendk · 2024-11-27T21:05:06Z

uniffi_bindgen/src/bindings/python/gen_python/visit_mut.rs

+                    Some(meta) => {
+                        let module_name = self.config.module_for_namespace(&meta.namespace);
+                        format!("{module_name}._UniffiRustBuffer.default()")
+                    }


This is another thing I like about the new system, we always have access to the Config which makes referencing the external rust buffer more straightforward.

Before we didn't want to do this because it means that ffi_type_name() and any other function that depended on it would need to input the config, which was deemed too annoying.

badboy · 2024-11-28T12:41:29Z

Wuih, big change! I won't be able to review this this year though.

bendk · 2024-11-29T16:02:19Z

I got the peek and diff working. I've been having fun playing around with them by calling diff-save, making some changes to an example crate, then calling diff [metadata|ir|python-ir] to see how those changes affect the pipeline.

bendk · 2024-12-02T20:32:37Z

Wuih, big change! I won't be able to review this this year though.

Yeah it's a big one alright, there's no rush on getting it merged.

bendk · 2024-12-07T22:08:04Z

I've been thinking about what this mean for external bindings. I think they have 2 choices:

Buy in to this concept: implement VisitMut to specialize the bindings like Python does in this PR, maybe add the peek and diff commands to their CLI.
Don't buy in: switch to using BindingsIr instead of ComponentInterface, but keep everything else the same. This limits the scope to minor changes, like using &rec.name instead of rec.name() and that's about it.

So, I don't think we need consensus or anything on this, but if all the external bindings authors find this weird and not useful then we probably should reconsider.

This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js. For those, I needed to run a lot of tests while (re)implementing a bindings generator. I love how the current tests cover basically all UniFFI features, but I think there a few things that can be improved and that's what this commit does. The new tests target specific features and are named after them, rather than trying to mimic real-world examples. Before, I spent a lot of time trying to figure out what a test failure meant. I'm hoping that when one of these tests fail, it's obvious what's not working. It also means that the new test suites don't require writing so much foreign code. They also have a better external bindings story. I'm thinking we can direct external buindings authors to copy everything inside the `bindings-tests` directory and use that to test their bindings. They can use the test macro now, since it's doesn't hard code the mapping from file extension to test runner. Copying the code feels a bit weird, but I like it better than having to publish all these test fixture crates and I think it will work okay in practice. I still want to keep the current examples/fixtures. They're nice because they showcase real-world use-cases and test that those use-cases work like we expect them to. If this gets merged, we can rework those crates to focus on that. Maybe we rework them to only test one bindings language per example/fixture.

This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js. For those, I needed to run a lot of tests while (re)implementing a bindings generator. I love how the current tests cover basically all UniFFI features, but I think there a few things that can be improved. This commit adds a suite of tests specifically aimed at the bindings. The new tests target specific features rather than trying to mimic real-world examples. Before, I spent a lot of time trying to figure out what a test failure meant. I'm hoping that when one of these tests fail, it's obvious what's not working. It also means that the new test suites don't require writing so much foreign code. They also have a better external bindings story. I'm thinking we can direct external buindings authors to copy everything inside the `bindings-tests` directory and use that to test their bindings. They can use the test macro now, since it's doesn't hard code the mapping from file extension to test runner. Copying the code feels a bit weird, but I like it better than having to publish all these test fixture crates and I think it will work okay in practice. I still want to keep the current examples/fixtures. They're nice because they showcase real-world use-cases and test that those use-cases work like we expect them to. If this gets merged, we can rework those crates to focus on that. Maybe we rework them to only test one bindings language per example/fixture.

This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js. For those, I needed to run a lot of tests while (re)implementing a bindings generator. I love how the current tests cover basically all UniFFI features, but I think there a few things that can be improved. This commit adds a suite of tests specifically aimed at the bindings. The new tests target specific features rather than trying to mimic real-world examples. Before, I spent a lot of time trying to figure out what a test failure meant. I'm hoping that when one of these tests fail, it's obvious what's not working. It also means that the new test suites don't require writing so much foreign code. They also have a better external bindings story. I'm thinking we can direct external buindings authors to copy everything the `bindings-tests/fixtures` sub-directories and write their own version of `bindings-tests/tests/tests.rs`. They get to use the macro now, since it's doesn't hard code the mapping from file extension to test runner. Copying the code feels a bit weird, but I like it better than having to publish all these test fixture crates and I think it will work okay in practice. I still want to keep the current examples/fixtures. They're nice because they showcase real-world use-cases and test that those use-cases work like we expect them to. If this gets merged, we can rework those crates to focus on that. Maybe we rework them to only test one bindings language per example/fixture.

This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js. For those, I needed to run a lot of tests while (re)implementing a bindings generator. I love how the current tests cover basically all UniFFI features, but I think there a few things that can be improved. This commit adds a suite of tests specifically aimed at the bindings. The new tests target specific features rather than trying to mimic real-world examples. Before, I spent a lot of time trying to figure out what a test failure meant. I'm hoping that when one of these tests fail, it's obvious what's not working. It also means that the new test suites don't require writing so much foreign code. They also have a better external bindings story. I'm thinking we can direct external bindings authors to copy everything the `bindings-tests/fixtures` sub-directories and write their own version of `bindings-tests/tests/tests.rs`. They get to use the macro now, since it's doesn't hard code the mapping from file extension to test runner. Copying the code feels a bit weird, but I like it better than having to publish all these test fixture crates and I think it will work okay in practice. I still want to keep the current examples/fixtures. They're nice because they showcase real-world use-cases and test that those use-cases work like we expect them to. If this gets merged, we can rework those crates to focus on that. Maybe we rework them to only test one bindings language per example/fixture.

jhugman · 2024-12-19T14:42:32Z

This is a meta-comment on this PR.

I haven't had the bandwidth to dig into this.

I very much like the use of the mermaid diagrams you've already done; could I request that you add another one that shows the current state, so we can get an idea of a migration path?

On the whole, I very much like the compiler metaphor: moving more logic out of the templates, having a principled approach for imports and once per type and once per type of type would be super helpful, especially for newer bindgens.

I'd also applaud having an escape hatch: I agree that there absolutely must be at least one upgrade path that has minimal breaking changes (e.g. type aliasing BindingsIR as ComponentInterface and deprecating the name() method rather than removing it). I'm happy to switch to a new way, but don't make me do it immediately :)

Added `Enum::variant_discr_iter` and use that to implement `variant_discr`. This provides a simple and efficient way for other code to get a Vec of discriminants. Renamed `uniffi_checksum_derive` -> `uniffi_internal_macros`. This way we can use the same crate for future internal macros. This is prep work for the bindings IR patch I'm hoping to land.

I'm hoping to overhaul the way we generate foreign bindings by modeling the process as a series of data transformations, similar to a compiler pipeline. This commit documents the plan for the new system.

I'm going to be defining a bunch of types here that are very similar to the current ones in `interface`. Before doing that, I wanted copy the current code over to improve the diff for the next commit.

This is a reimangining of the ComponentInterface, modeled after a compiler pipeline. The differences are: - Data-oriented API. It has `pub` fields rather than getter methods. - Tree-like. Fields represent nodes in the tree. You can understand `BindingsIr` by following the fields. - Data is stored directly in the nodes, for example types now have an `is_used_as_error` field instead of having to lookup the type name in `ComponentInterface::errors`. - Each node contains a `LanguageData` container. Bindings generators will put language-specific data in here as part of the bindings IR specialization process. For example: `type_name`, `ffi_converter` and `ffi_type_name`. To accomplish this, `Type`, `FfiType` and `Literal` are now structs, which store a `kind` enum. - It tries to make a stronger distinction between definitions and references to those definitions. - Type and FFI types are sorted in dependency order. This replaces the the Python bindings code that does something similar to avoid forward references. Languages can opt-in to the new API by converting the `ComponentInterface` into a `BindingsIr`. At some point we start converting the metadata directly to `BindingsIr` and remove a ton of ComponentInterface code. Let's make sure to give external bindings authors time to move to the new system.

bendk · 2024-12-23T16:00:17Z

After talking with @mhammond and @gruberb about this one, I decided to try a slightly different approach. Rather than using the VisitMut trait and the LanguageData field, models this as more of a regular conversion from BindingsIr to PythonBindingsIr.

I like this approach because it seems more straight forward. Before the conversion was handled by a combination of mutating fields, filling in the lang_data field (technically also a mutation, but it felt much different than the other ones), and also collecting new items in the BindingsIrVisitor fields. It worked, but it seemed a bit strange.

The main downside of this new approach is we need to duplicate some structs. However, I don't really mind it for a couple reasons:

The Python structs are significantly different than the ones in interface::ir. The fact that some of it is duplicated doesn't seem that bad to me.
I used a macro to implement traits like AsType. Duplicating all of this boilerplate impl code was the thing I really wanted to avoid but the macros solve this well.

The second main change was making Callable a type and introducing the AsCallable trait, which works kind-of like AsType. This new system works better with the macros and also makes converting slightly easier, since the map_callable method replaces a bunch of duplicated code from map_function/map_method/map_constructor.

I'm still liking the peek and diff commands. I had a few errors when re-implementing all of the code and peek came in very handy to fix them.

I'm liking this one better, but I'd love feedback from others on it. Here's the old changes for comparison.

bendk · 2024-12-23T16:04:34Z

I very much like the use of the mermaid diagrams you've already done; could I request that you add another one that shows the current state, so we can get an idea of a migration path?

I think this is a great idea, but I'm not sure how to diagram the current state. That's one of the things I'm hoping to improve -- the new system has a more cohesive story to me than the old one.
super helpful, especially for newer bindgens.

I'd also applaud having an escape hatch: I agree that there absolutely must be at least one upgrade path that has minimal breaking changes (e.g. type aliasing BindingsIR as ComponentInterface and deprecating the name() method rather than removing it). I'm happy to switch to a new way, but don't make me do it immediately :)

I didn't touch ComponentInterface code, this new code lives on top of it. I think we can keep ComponentInterface around for a couple versions and so that external bindings authors have a while before they need to switch.

The new system is to do a `BindingsIr` specialization pass, which means implementing `VisitMut` and using it to: - Rewrite names, docstrings, etc. - Derive things like FFI converter names and store them in `lang_data` - Generate language-specific things like imports as we walk the tree Along the way I made a couple changes to the Python bindings: Changed `_uniffi_rust_call_async` to not lift the return value. This makes it fit in with the sync logic better and also enables async constructors (which store the pointer in `self` rather than lift it). Added the `CustomTypeConfig::type_name`. While updating the code I noticed the current templates handle this in a slightly buggy way where they assume the type name is the builtin type, even if there's a `from_custom`/`into_custom` that converts to a different type. Removed almost all of the macros, `define_callable` is the only one left.

We now have tons of messy/duplicated code to generate the bindings. If this PR gets merged, I'll try to clean it up as a follow-up.

lougeniaC64 · 2025-01-23T02:24:23Z

docs/manual/src/internals/bindings_render_pipeline.md

+
+```rust
+// Medata buffer for the `arithmetic::add()` function
+const UNIFFI_META_CONST_ARITHMETIC_FUNC_ADD: ::uniffi::MetadataBuffer = ::uniffi::MetadataBuffer::from_code(


nit: It might be helpful to also show the add function that generated this metadata buffer

lougeniaC64 · 2025-01-23T02:40:47Z

docs/manual/src/internals/bindings_render_pipeline.md

+```
+
+
+## Metadata from the UDL file


nit: Since there aren't details about this process (which I'm assuming is because pro-macros are favored over UDLs), you could remove this section, generalize the heading for the proc-macro metadata section, and add this as a note there.

lougeniaC64 · 2025-01-23T02:44:31Z

docs/manual/src/internals/bindings_render_pipeline.md

+
+## Specializing the BindingsIr
+
+This phase transforms the general `BindingsIr` into a language-specific one.  Each bindings generate


I think, "Each bindings generate . . . " should probably be, "Each binding generated . . . "

lougeniaC64 · 2025-01-23T02:47:18Z

docs/manual/src/internals/bindings_render_pipeline.md

+
+* Language-specific data is added, like concrete type names and the names of the `FfiConverter` classes that lift/lower types
+* New top-level items will be added to the language-specific `BindingsIr` struct, for example a list of import statements.
+* Fields like `name` and `docstring` will transformed to the preferred style for the language.


nit: " . . . will be transformed . . ."

lougeniaC64 · 2025-01-23T02:52:36Z

@lougeniaC64 flagging you for review since you said you'd take a look at the new docs. Note that these docs are for a proposed new system. I'd love your take on both the docs themselves and if the overall process makes sense.

Sorry for the delay. I left a few nits but overall the docs look good! I know you talked about maybe breaking out the bindings generation phases so processing was more transparent to uniffi devs, but wouldn't peek and diff be sufficient?

bendk requested a review from a team as a code owner November 27, 2024 20:56

bendk requested review from badboy and removed request for a team November 27, 2024 20:56

bendk commented Nov 27, 2024

View reviewed changes

bendk force-pushed the push-wsxllovuyuyv branch 3 times, most recently from b7b5e53 to df760d4 Compare November 28, 2024 00:50

badboy removed their request for review November 28, 2024 12:41

bendk force-pushed the push-wsxllovuyuyv branch 3 times, most recently from 85dea8b to 20353ac Compare November 29, 2024 16:00

bendk force-pushed the push-wsxllovuyuyv branch from 20353ac to d3ac762 Compare November 29, 2024 16:04

bendk mentioned this pull request Dec 3, 2024

CLI to print/diff generated sources #2308

Closed

bendk mentioned this pull request Dec 9, 2024

[WIP] Better bindings tests #2348

Open

bendk mentioned this pull request Dec 12, 2024

Fix JNA bug MethodTooLargeException by splitting big UniffiLib interface #2344

Merged

bendk added 2 commits December 23, 2024 10:40

Bindings IR docs/proposal

e4cc5fa

I'm hoping to overhaul the way we generate foreign bindings by modeling the process as a series of data transformations, similar to a compiler pipeline. This commit documents the plan for the new system.

Copying code into interface::ir::nodes

ffcdb23

I'm going to be defining a bunch of types here that are very similar to the current ones in `interface`. Before doing that, I wanted copy the current code over to improve the diff for the next commit.

bendk force-pushed the push-wsxllovuyuyv branch 2 times, most recently from 4340bfb to e8b6432 Compare December 23, 2024 15:45

bendk force-pushed the push-wsxllovuyuyv branch from e8b6432 to 4aeec98 Compare December 23, 2024 15:47

bendk force-pushed the push-wsxllovuyuyv branch from 4aeec98 to b3ff2b6 Compare December 23, 2024 16:24

bendk added 2 commits December 23, 2024 13:19

Adding the Peek/diff commands

f868fb0

We now have tons of messy/duplicated code to generate the bindings. If this PR gets merged, I'll try to clean it up as a follow-up.

bendk force-pushed the push-wsxllovuyuyv branch from b3ff2b6 to f868fb0 Compare December 23, 2024 18:19

lougeniaC64 reviewed Jan 23, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bindings render pipeline #2333

Bindings render pipeline #2333

bendk commented Nov 27, 2024 •

edited

Loading

bendk commented Nov 27, 2024

bendk Nov 27, 2024 •

edited

Loading

bendk Nov 27, 2024

badboy commented Nov 28, 2024

bendk commented Nov 29, 2024

bendk commented Dec 2, 2024

bendk commented Dec 7, 2024 •

edited

Loading

jhugman commented Dec 19, 2024

bendk commented Dec 23, 2024

bendk commented Dec 23, 2024

lougeniaC64 Jan 23, 2025 •

edited

Loading

lougeniaC64 Jan 23, 2025

lougeniaC64 Jan 23, 2025

lougeniaC64 Jan 23, 2025 •

edited

Loading

lougeniaC64 commented Jan 23, 2025


		## Specializing the BindingsIr

		This phase transforms the general `BindingsIr` into a language-specific one. Each bindings generate

Bindings render pipeline #2333

Are you sure you want to change the base?

Bindings render pipeline #2333

Conversation

bendk commented Nov 27, 2024 • edited Loading

bendk commented Nov 27, 2024

bendk Nov 27, 2024 • edited Loading

Choose a reason for hiding this comment

bendk Nov 27, 2024

Choose a reason for hiding this comment

badboy commented Nov 28, 2024

bendk commented Nov 29, 2024

bendk commented Dec 2, 2024

bendk commented Dec 7, 2024 • edited Loading

jhugman commented Dec 19, 2024

bendk commented Dec 23, 2024

bendk commented Dec 23, 2024

lougeniaC64 Jan 23, 2025 • edited Loading

Choose a reason for hiding this comment

lougeniaC64 Jan 23, 2025

Choose a reason for hiding this comment

lougeniaC64 Jan 23, 2025

Choose a reason for hiding this comment

lougeniaC64 Jan 23, 2025 • edited Loading

Choose a reason for hiding this comment

lougeniaC64 commented Jan 23, 2025

bendk commented Nov 27, 2024 •

edited

Loading

bendk Nov 27, 2024 •

edited

Loading

bendk commented Dec 7, 2024 •

edited

Loading

lougeniaC64 Jan 23, 2025 •

edited

Loading

lougeniaC64 Jan 23, 2025 •

edited

Loading