Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bindings render pipeline #2333

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Bindings render pipeline #2333

wants to merge 6 commits into from

Conversation

bendk
Copy link
Contributor

@bendk bendk commented Nov 27, 2024

Rework bindings rendering to work more like a compiler pipeline:

  • Start with the metadata
  • Transform the metadata into an intermediate representation. This stage currently goes through the ComponentInterface, but if we decide to adopt this approach then we can rework it to be a more straightforward transformation.
  • Transform the IR into a language-specific IR. Walk the tree of IR nodes using the VisitMut trait and update the names, format the docstring, add language-specific data, etc.
  • Render the language-specific IR into code.

The first commit documents this in some more detail.

One nice feature of this system is that you can inspect the data at any stage of the pipeline, run it through diffs, etc. I think this will really help both new and experienced devs to understand what's going on during bindings generation. I documented my plans for this tool, but haven't implemented it yet.

There's a ton of changes here, but I think it's worth it.

@bendk bendk requested a review from a team as a code owner November 27, 2024 20:56
@bendk bendk requested review from badboy and removed request for a team November 27, 2024 20:56
@bendk
Copy link
Contributor Author

bendk commented Nov 27, 2024

@lougeniaC64 flagging you for review since you said you'd take a look at the new docs. Note that these docs are for a proposed new system. I'd love your take on both the docs themselves and if the overall process makes sense.

pub protocols: Vec<Protocol>,
pub runtimes: Runtimes,
pub exports: Vec<String>,
}
Copy link
Contributor Author

@bendk bendk Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module essentially replaces the CodeOracle system (as well as a couple others). Is it better to have all of the code in one place or try to split it out over multiple modules? I think I like it all in one place, but I could be convinced to try to split it up.

One thing I definitely like over the old system is that there's a lot less indirection involved: BindingsIrVisitor sets data, then the filters read it and that's basically the whole story.

Some(meta) => {
let module_name = self.config.module_for_namespace(&meta.namespace);
format!("{module_name}._UniffiRustBuffer.default()")
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another thing I like about the new system, we always have access to the Config which makes referencing the external rust buffer more straightforward.

Before we didn't want to do this because it means that ffi_type_name() and any other function that depended on it would need to input the config, which was deemed too annoying.

@bendk bendk force-pushed the push-wsxllovuyuyv branch 3 times, most recently from b7b5e53 to df760d4 Compare November 28, 2024 00:50
@badboy
Copy link
Member

badboy commented Nov 28, 2024

Wuih, big change! I won't be able to review this this year though.

@badboy badboy removed their request for review November 28, 2024 12:41
@bendk bendk force-pushed the push-wsxllovuyuyv branch 3 times, most recently from 85dea8b to 20353ac Compare November 29, 2024 16:00
@bendk
Copy link
Contributor Author

bendk commented Nov 29, 2024

I got the peek and diff working. I've been having fun playing around with them by calling diff-save, making some changes to an example crate, then calling diff [metadata|ir|python-ir] to see how those changes affect the pipeline.

@bendk bendk force-pushed the push-wsxllovuyuyv branch from 20353ac to d3ac762 Compare November 29, 2024 16:04
@bendk
Copy link
Contributor Author

bendk commented Dec 2, 2024

Wuih, big change! I won't be able to review this this year though.

Yeah it's a big one alright, there's no rush on getting it merged.

@bendk
Copy link
Contributor Author

bendk commented Dec 7, 2024

I've been thinking about what this mean for external bindings. I think they have 2 choices:

  • Buy in to this concept: implement VisitMut to specialize the bindings like Python does in this PR, maybe add the peek and diff commands to their CLI.
  • Don't buy in: switch to using BindingsIr instead of ComponentInterface, but keep everything else the same. This limits the scope to minor changes, like using &rec.name instead of rec.name() and that's about it.

So, I don't think we need consensus or anything on this, but if all the external bindings authors find this weird and not useful then we probably should reconsider.

bendk added a commit to bendk/uniffi-rs that referenced this pull request Dec 9, 2024
This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js.
For those, I needed to run a lot of tests while (re)implementing a
bindings generator.  I love how the current tests cover basically all
UniFFI features, but I think there a few things that can be improved and
that's what this commit does.

The new tests target specific features and are named after them, rather
than trying to mimic real-world examples.  Before, I spent a lot of time
trying to figure out what a test failure meant.  I'm hoping that when
one of these tests fail, it's obvious what's not working.  It also means
that the new test suites don't require writing so much foreign code.

They also have a better external bindings story.  I'm thinking we can
direct external buindings authors to copy everything inside the
`bindings-tests` directory and use that to test their bindings.  They
can use the test macro now, since it's doesn't hard code the mapping
from file extension to test runner.  Copying the code feels a bit weird,
but I like it better than having to publish all these test fixture
crates and I think it will work okay in practice.

I still want to keep the current examples/fixtures. They're nice because
they showcase real-world use-cases and test that those use-cases work
like we expect them to. If this gets merged, we can rework those crates
to focus on that.  Maybe we rework them to only test one bindings
language per example/fixture.
bendk added a commit to bendk/uniffi-rs that referenced this pull request Dec 9, 2024
This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js.
For those, I needed to run a lot of tests while (re)implementing a
bindings generator.  I love how the current tests cover basically all
UniFFI features, but I think there a few things that can be improved and
that's what this commit does.

The new tests target specific features and are named after them, rather
than trying to mimic real-world examples.  Before, I spent a lot of time
trying to figure out what a test failure meant.  I'm hoping that when
one of these tests fail, it's obvious what's not working.  It also means
that the new test suites don't require writing so much foreign code.

They also have a better external bindings story.  I'm thinking we can
direct external buindings authors to copy everything inside the
`bindings-tests` directory and use that to test their bindings.  They
can use the test macro now, since it's doesn't hard code the mapping
from file extension to test runner.  Copying the code feels a bit weird,
but I like it better than having to publish all these test fixture
crates and I think it will work okay in practice.

I still want to keep the current examples/fixtures. They're nice because
they showcase real-world use-cases and test that those use-cases work
like we expect them to. If this gets merged, we can rework those crates
to focus on that.  Maybe we rework them to only test one bindings
language per example/fixture.
bendk added a commit to bendk/uniffi-rs that referenced this pull request Dec 9, 2024
This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js.
For those, I needed to run a lot of tests while (re)implementing a
bindings generator.  I love how the current tests cover basically all
UniFFI features, but I think there a few things that can be improved and
that's what this commit does.

The new tests target specific features and are named after them, rather
than trying to mimic real-world examples.  Before, I spent a lot of time
trying to figure out what a test failure meant.  I'm hoping that when
one of these tests fail, it's obvious what's not working.  It also means
that the new test suites don't require writing so much foreign code.

They also have a better external bindings story.  I'm thinking we can
direct external buindings authors to copy everything inside the
`bindings-tests` directory and use that to test their bindings.  They
can use the test macro now, since it's doesn't hard code the mapping
from file extension to test runner.  Copying the code feels a bit weird,
but I like it better than having to publish all these test fixture
crates and I think it will work okay in practice.

I still want to keep the current examples/fixtures. They're nice because
they showcase real-world use-cases and test that those use-cases work
like we expect them to. If this gets merged, we can rework those crates
to focus on that.  Maybe we rework them to only test one bindings
language per example/fixture.
bendk added a commit to bendk/uniffi-rs that referenced this pull request Dec 9, 2024
This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js.
For those, I needed to run a lot of tests while (re)implementing a
bindings generator.  I love how the current tests cover basically all
UniFFI features, but I think there a few things that can be improved.
This commit adds a suite of tests specifically aimed at the bindings.

The new tests target specific features rather than trying to mimic
real-world examples.  Before, I spent a lot of time trying to figure out
what a test failure meant.  I'm hoping that when one of these tests
fail, it's obvious what's not working.  It also means that the new test
suites don't require writing so much foreign code.

They also have a better external bindings story.  I'm thinking we can
direct external buindings authors to copy everything inside the
`bindings-tests` directory and use that to test their bindings.  They
can use the test macro now, since it's doesn't hard code the mapping
from file extension to test runner.  Copying the code feels a bit weird,
but I like it better than having to publish all these test fixture
crates and I think it will work okay in practice.

I still want to keep the current examples/fixtures. They're nice because
they showcase real-world use-cases and test that those use-cases work
like we expect them to. If this gets merged, we can rework those crates
to focus on that.  Maybe we rework them to only test one bindings
language per example/fixture.
bendk added a commit to bendk/uniffi-rs that referenced this pull request Dec 10, 2024
This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js.
For those, I needed to run a lot of tests while (re)implementing a
bindings generator.  I love how the current tests cover basically all
UniFFI features, but I think there a few things that can be improved.
This commit adds a suite of tests specifically aimed at the bindings.

The new tests target specific features rather than trying to mimic
real-world examples.  Before, I spent a lot of time trying to figure out
what a test failure meant.  I'm hoping that when one of these tests
fail, it's obvious what's not working.  It also means that the new test
suites don't require writing so much foreign code.

They also have a better external bindings story.  I'm thinking we can
direct external buindings authors to copy everything the
`bindings-tests/fixtures` sub-directories and write their own version of
`bindings-tests/tests/tests.rs`. They get to use the macro now, since
it's doesn't hard code the mapping from file extension to test runner.
Copying the code feels a bit weird, but I like it better than having to
publish all these test fixture crates and I think it will work okay in
practice.

I still want to keep the current examples/fixtures. They're nice because
they showcase real-world use-cases and test that those use-cases work
like we expect them to. If this gets merged, we can rework those crates
to focus on that.  Maybe we rework them to only test one bindings
language per example/fixture.
bendk added a commit to bendk/uniffi-rs that referenced this pull request Dec 10, 2024
This comes from my experiences with mozilla#2333 and uniffi-bindgen-gecko-js.
For those, I needed to run a lot of tests while (re)implementing a
bindings generator.  I love how the current tests cover basically all
UniFFI features, but I think there a few things that can be improved.
This commit adds a suite of tests specifically aimed at the bindings.

The new tests target specific features rather than trying to mimic
real-world examples.  Before, I spent a lot of time trying to figure out
what a test failure meant.  I'm hoping that when one of these tests
fail, it's obvious what's not working.  It also means that the new test
suites don't require writing so much foreign code.

They also have a better external bindings story.  I'm thinking we can
direct external bindings authors to copy everything the
`bindings-tests/fixtures` sub-directories and write their own version of
`bindings-tests/tests/tests.rs`. They get to use the macro now, since
it's doesn't hard code the mapping from file extension to test runner.
Copying the code feels a bit weird, but I like it better than having to
publish all these test fixture crates and I think it will work okay in
practice.

I still want to keep the current examples/fixtures. They're nice because
they showcase real-world use-cases and test that those use-cases work
like we expect them to. If this gets merged, we can rework those crates
to focus on that.  Maybe we rework them to only test one bindings
language per example/fixture.
@jhugman
Copy link
Contributor

jhugman commented Dec 19, 2024

This is a meta-comment on this PR.

I haven't had the bandwidth to dig into this.

I very much like the use of the mermaid diagrams you've already done; could I request that you add another one that shows the current state, so we can get an idea of a migration path?

On the whole, I very much like the compiler metaphor: moving more logic out of the templates, having a principled approach for imports and once per type and once per type of type would be super helpful, especially for newer bindgens.

I'd also applaud having an escape hatch: I agree that there absolutely must be at least one upgrade path that has minimal breaking changes (e.g. type aliasing BindingsIR as ComponentInterface and deprecating the name() method rather than removing it). I'm happy to switch to a new way, but don't make me do it immediately :)

Added `Enum::variant_discr_iter` and use that to implement `variant_discr`.
This provides a simple and efficient way for other code to get a Vec of
discriminants.

Renamed `uniffi_checksum_derive` -> `uniffi_internal_macros`.  This way
we can use the same crate for future internal macros.

This is prep work for the bindings IR patch I'm hoping to land.
I'm hoping to overhaul the way we generate foreign bindings by modeling
the process as a series of data transformations, similar to a compiler
pipeline.

This commit documents the plan for the new system.
I'm going to be defining a bunch of types here that are very similar to
the current ones in `interface`.  Before doing that, I wanted copy the
current code over to improve the diff for the next commit.
@bendk bendk force-pushed the push-wsxllovuyuyv branch 2 times, most recently from 4340bfb to e8b6432 Compare December 23, 2024 15:45
This is a reimangining of the ComponentInterface, modeled after a
compiler pipeline.  The differences are:

- Data-oriented API.  It has `pub` fields rather than getter methods.
- Tree-like.  Fields represent nodes in the tree.  You can understand
  `BindingsIr` by following the fields.
- Data is stored directly in the nodes, for example types now have an
  `is_used_as_error` field instead of having to lookup the type name in
  `ComponentInterface::errors`.
- Each node contains a `LanguageData` container.  Bindings generators
  will put language-specific data in here as part of the bindings IR
  specialization process.  For example: `type_name`, `ffi_converter` and
  `ffi_type_name`.  To accomplish this, `Type`, `FfiType` and `Literal` are
  now structs, which store a `kind` enum.
- It tries to make a stronger distinction between definitions and
  references to those definitions.
- Type and FFI types are sorted in dependency order.  This replaces the
  the Python bindings code that does something similar to avoid forward
  references.

Languages can opt-in to the new API by converting the
`ComponentInterface` into a `BindingsIr`. At some point we start
converting the metadata directly to `BindingsIr` and remove a ton of
ComponentInterface code. Let's make sure to give external bindings
authors time to move to the new system.
@bendk bendk force-pushed the push-wsxllovuyuyv branch from e8b6432 to 4aeec98 Compare December 23, 2024 15:47
@bendk
Copy link
Contributor Author

bendk commented Dec 23, 2024

After talking with @mhammond and @gruberb about this one, I decided to try a slightly different approach. Rather than using the VisitMut trait and the LanguageData field, models this as more of a regular conversion from BindingsIr to PythonBindingsIr.

I like this approach because it seems more straight forward. Before the conversion was handled by a combination of mutating fields, filling in the lang_data field (technically also a mutation, but it felt much different than the other ones), and also collecting new items in the BindingsIrVisitor fields. It worked, but it seemed a bit strange.

The main downside of this new approach is we need to duplicate some structs. However, I don't really mind it for a couple reasons:

  • The Python structs are significantly different than the ones in interface::ir. The fact that some of it is duplicated doesn't seem that bad to me.
  • I used a macro to implement traits like AsType. Duplicating all of this boilerplate impl code was the thing I really wanted to avoid but the macros solve this well.

The second main change was making Callable a type and introducing the AsCallable trait, which works kind-of like AsType. This new system works better with the macros and also makes converting slightly easier, since the map_callable method replaces a bunch of duplicated code from map_function/map_method/map_constructor.

I'm still liking the peek and diff commands. I had a few errors when re-implementing all of the code and peek came in very handy to fix them.

I'm liking this one better, but I'd love feedback from others on it. Here's the old changes for comparison.

@bendk
Copy link
Contributor Author

bendk commented Dec 23, 2024

I very much like the use of the mermaid diagrams you've already done; could I request that you add another one that shows the current state, so we can get an idea of a migration path?

I think this is a great idea, but I'm not sure how to diagram the current state. That's one of the things I'm hoping to improve -- the new system has a more cohesive story to me than the old one.
super helpful, especially for newer bindgens.

I'd also applaud having an escape hatch: I agree that there absolutely must be at least one upgrade path that has minimal breaking changes (e.g. type aliasing BindingsIR as ComponentInterface and deprecating the name() method rather than removing it). I'm happy to switch to a new way, but don't make me do it immediately :)

I didn't touch ComponentInterface code, this new code lives on top of it. I think we can keep ComponentInterface around for a couple versions and so that external bindings authors have a while before they need to switch.

@bendk bendk force-pushed the push-wsxllovuyuyv branch from 4aeec98 to b3ff2b6 Compare December 23, 2024 16:24
The new system is to do a `BindingsIr` specialization pass, which means
implementing `VisitMut` and using it to:
  - Rewrite names, docstrings, etc.
  - Derive things like FFI converter names and store them in `lang_data`
  - Generate language-specific things like imports as we walk the tree

Along the way I made a couple changes to the Python bindings:

Changed `_uniffi_rust_call_async` to not lift the return value.  This
makes it fit in with the sync logic better and also enables async
constructors (which store the pointer in `self` rather than lift it).

Added the `CustomTypeConfig::type_name`.  While updating the code I
noticed the current templates handle this in a slightly buggy way where
they assume the type name is the builtin type, even if there's a
`from_custom`/`into_custom` that converts to a different type.

Removed almost all of the macros, `define_callable` is the only one left.
We now have tons of messy/duplicated code to generate the bindings.  If
this PR gets merged, I'll try to clean it up as a follow-up.
@bendk bendk force-pushed the push-wsxllovuyuyv branch from b3ff2b6 to f868fb0 Compare December 23, 2024 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants