Skip to content

Conversation

@dangotbanned
Copy link
Collaborator

@dangotbanned dangotbanned commented Sep 16, 2025

Related issues

If you have comments or can explain your changes, please do so below

Thanks for the ping @ym-pett!

I had a few commits, but I forgot to check permissions first 😅

Show screenshot

image

I'm guess there's something in docs/permission-levels-for-a-personal-account-repository?


Anyway, here I am ...

Please feel free to fire away any comments/questions/suggestions before merging 🙂

@ym-pett
Copy link
Owner

ym-pett commented Sep 16, 2025

ohh this is brilliant @dangotbanned! I will pull to my local and have a play with it and talk myself through the changes! (I think I need to add you as a collaborator to my forked repo, will try that)

Comment on lines +97 to +108
Returns:
If the following conditions are met
- at least 1 plugin is installed
- at least 1 installed plugin supports `type(native_object)`
Then for the **first matching plugin**, the result of the call below.
This *should* be an object accepted by a Narwhals Dataframe, Lazyframe, or Series:
plugin: Plugin
plugin.__narwhals_namespace__(version).from_native(native_object)
In all other cases, `None` is returned instead.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this to describe the current behavior, but I'm not saying it should be the end goal.

(more qualifying) we don't need to change how this works any time soon 😅
But I'm curious if this description sparks any ideas on potential issues it may have?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I sense a rhetorical question, but I don't quite see far enough. I wondered about the line

compliant_object = plugins.from_native(native_object, version) in translate.py.

But it tries this and if it returns None, carries on with the rest of translate. (I tried it by uninstalling nw-daft & running some nw example with pandas). So that's not a problem..

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, I think the examples worked, I'm a little embarrassed now that I didn't see the problem first time round: we only get a call to the plugin.__narwhals_namespace with the first matching plugin, if I've got this right, it won't care about the next plugin after that, even if it had the necessary functions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to be embarrassed!

if I've got this right, it won't care about the next plugin after that, even if it had the necessary functions?

💯

)
from narwhals.utils import Version


Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also going to use this review to talk myself through the code and check my understanding with you, hope that's ok.

is __all__ below for protecting wildcard imports? i.e. stopping non-public ones getting imported too? (commenting out line doesn't break code)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also going to use this review to talk myself through the code and check my understanding with you, hope that's ok.

No problem at __all__ 😉

is __all__ below for protecting wildcard imports? i.e. stopping non-public ones getting imported too? (commenting out line doesn't break code)

Yeah that is one of the reasons

The rest is documented in: https://typing.python.org/en/latest/spec/distributing.html#library-interface-public-and-private-symbols

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dangotbanned, can I check I understand correctly: having the Plugin protocol with the two top-level functions means that if future plugin authors don't read our write-up (it's cooking..) on how to create a plugin & put different function at the interface to their plugin, typing will shout at them?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

& put different function at the interface to their plugin, typing will shout at them?

I wish! 😄

No, sadly, the only user-facing reason for it is documentation.
E.g. we say something like:

The narwhals.plugins.Plugin protocol defines what we expect to find in <plugin_package_name>.__init__.py

Internally though it is what give us typing from https://github.com/narwhals-dev/narwhals/blob/7bcec6199fe83ef2ddec9c76786c9c3c49612a1f/narwhals/plugins.py#L84

        plugin: Plugin = entry_point.load()

something_undefined doesn't exist on Plugin, so we're warned when we use it 👍

image

Without that, everything is Any, so we get no help on this 👎

image

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤯 ohhh I think you've just brought my understanding of Protocols forward by a biig jump. Thanks so much!! I read documentation/tutorials but they often remain surface.



__all__ = ["Plugin", "from_native"]

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ll.27-39: here we're setting up the typing groundwork so that we can work with objects from plugins.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh I don't know why we need both a covariant (FromNativeR_co )and invariant version (FrameT) of Frame, if that's what's going on?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh I don't know why we need both a covariant (FromNativeR_co )and invariant version (FrameT) of Frame, if that's what's going on?

Ahhhh the variance part is related, but don't worry about that for now!

I've written too much lately about this 😭

These have some breadcrumbs for now:


This is the meat of the question:

why we need both FromNativeR_co and FrameT?

  • CompliantNamespace introduces the Frame requirement, but what about Series?
  • {Eager,Lazy}Namespace have more complete definitions
    • From there, we want the return type(s) of from_native

PluginNamespace therefore says:

For however CompliantNamespace uses it's first TypeVar:

  • I accept any kind of compliant frame
  • Which defaults to both eager and lazy
FrameT = TypeVar(
    "FrameT",
    bound="CompliantFrameAny",
    default="CompliantDataFrameAny | CompliantLazyFrameAny",
)
class PluginNamespace(CompliantNamespace[FrameT, Any])

Additionally, here's what I know about the return of my from_native method:

  • It could be any compliant frame, or series
  • Which defaults to everything
CompliantAny: TypeAlias = (
    "CompliantDataFrameAny | CompliantLazyFrameAny | CompliantSeriesAny"
)
"""A statically-unknown, Compliant object originating from a plugin."""

FromNativeR_co = TypeVar(
    "FromNativeR_co", bound=CompliantAny, covariant=True, default=CompliantAny
)

class PluginNamespace(CompliantNamespace[FrameT, Any], Protocol[FrameT, FromNativeR_co]):
    def from_native(self, data: Any, /) -> FromNativeR_co: ...

Important

These TypeVars happen to be the same for lazy-only backends
But pandas, polars, pyarrow have more funky stuff

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woa.. yeah it's a lot.. I will trust the process, I think this is going to be a long-term project for me to first absorb this knowledge then learn to apply it!

yield compliant_namespace.from_native(native_object)


def from_native(native_object: Any, version: Version) -> CompliantAny | None:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from_native - this is the function we'll actually call in translate. At first I was confused by the name _iter_from_native, but now I get it 😅 as that's what actually iterates! (or rather the generator it outputs can be iterated over with next.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the return statement, does next(_iter_from_native(native_object, version), None) mean that it'll try to iterate, but if there's no plugin, it'll return None?

Copy link
Collaborator Author

@dangotbanned dangotbanned Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the return statement, does next(_iter_from_native(native_object, version), None) mean that it'll try to iterate, but if there's no plugin, it'll return None?

Almost 😅

Try experimenting/extending/fixing this

Show playground

E.g. in an interactive session, try passing any of plugins_* into the functions and see what happens 🙂

from __future__ import annotations

from typing import TYPE_CHECKING, Any, TypedDict

# ruff: noqa: T201

if TYPE_CHECKING:
    from collections.abc import Iterable, Iterator, Sequence


class PluginMock(TypedDict):
    name: str
    is_native_plugin: bool


ok_1 = PluginMock(name="ok_1", is_native_plugin=True)
ok_2 = PluginMock(name="ok_2", is_native_plugin=True)
not_ok = PluginMock(name="not_ok", is_native_plugin=False)

plugins_1 = []
plugins_2 = [ok_1]
plugins_3 = [not_ok]
plugins_4 = [not_ok, ok_1]
plugins_5 = [ok_1, not_ok]
plugins_6 = [ok_1, ok_2]
plugins_7 = [ok_2, ok_1]


def example_iterator(loaded_entrypoints: Iterable[PluginMock]) -> Iterator[PluginMock]:
    for plugin in loaded_entrypoints:
        name = plugin["name"]
        is_native_plugin = plugin["is_native_plugin"]
        print(f"Checking {name!r} ...")
        if is_native_plugin:
            print(f"  Success: {name!r}")
            yield plugin
        else:
            print(f"  Fail: {name!r}")
    print("Exhausted 🥵")


def consumers(plugins: Sequence[PluginMock]) -> Any:
    a = example_iterator(plugins)
    b = next(example_iterator(plugins))
    c = next(example_iterator(plugins), "default?")
    d = list(example_iterator(plugins))
    e_it = example_iterator(plugins)
    e_1 = next(e_it)
    e_2 = next(e_it, "another default?")
    f, *g = example_iterator(plugins)
    return a, b, c, d, e_1, e_2, f, g

Relating back to (#2 (comment)) ...

What we know for sure is that None means we didn't find a plugin that could handle native_object

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicee! will have a play now!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohhh wow this is soooo weird. It's making my eyes bleed and I needed print statements; still don't quite understand why it's happening, but at least I can see it happening:

from plugins5 onwards things get weird, for b & c we don't seem to iterate over the full list of plugins? So as long as the first plugin is ok, it stops iterating? I need to still figure out why though 😅

plugin_4 output image
plugin_5 output image

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ps: learned about extended iterable unpacking in the process :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from plugins_5 onwards things get weird, for b & c we don't seem to iterate over the full list of plugins?
So as long as the first plugin is ok, it stops iterating?
I need to still figure out why though

@ym-pett here's some docs links that may be helpful, in order of accessibility 🙂:

The key parts in (#2 (comment)) are:

  • It only yields when is_native_plugin == True
  • How many times that happens will depend on both
    • The way you consume the iterator
    • How many items pass that condition

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back at (#2 (comment))

we don't need to change how this works any time soon 😅
But I'm curious if this description sparks any ideas on potential issues it may have?

Say for example we had 2 plugins (P1, P2) which support native_object.
Currently, P2 will never be used - because we always use the first (P1).

I think that's a bit of an issue alone.
Now, consider that P1 could still later fail during the call to _translate_if_compliant. Our problem is then that we have no means to backtrack and try any other plugin - because we bet on the first working 😳.

By using an Iterator, we could instead do something like this (later, not in this PR 😅)

from __future__ import annotations

from typing import TYPE_CHECKING, Any, TypedDict

# ruff: noqa: T201

if TYPE_CHECKING:
    from collections.abc import Iterable, Iterator, Sequence


class PluginMock(TypedDict):
    name: str
    is_native_plugin: bool


ok_1 = PluginMock(name="ok_1", is_native_plugin=True)
ok_2 = PluginMock(name="ok_2", is_native_plugin=True)
not_ok = PluginMock(name="not_ok", is_native_plugin=False)


def example_iterator(loaded_entrypoints: Iterable[PluginMock]) -> Iterator[PluginMock]:
    for plugin in loaded_entrypoints:
        name = plugin["name"]
        is_native_plugin = plugin["is_native_plugin"]
        print(f"Checking {name!r} ...")
        if is_native_plugin:
            print(f"  Success: {name!r}")
            yield plugin
        else:
            print(f"  Fail: {name!r}")
    print("Exhausted 🥵")


def translate_if_compliant(thing: Any) -> Any:
    if thing is ok_1:
        return None
    return thing


def from_native() -> Any:
    """Super verbose control-flow to make the example clearer."""
    plugins_8 = [ok_1, not_ok, ok_2]
    it = example_iterator(plugins_8)
    first = next(it, None)
    if first is not None:
        first_result = translate_if_compliant(first)
        if first_result is not None:
            return first_result
        second = next(it, None)
        if second is not None:
            second_result = translate_if_compliant(second)
            if second_result is not None:
                return second_result
    return None

Copy link
Owner

@ym-pett ym-pett Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you so much for this @dangotbanned !! Sorry I seem to have forgot to hit 'comment' again! 🤦

I need to read the links you've given me & play with this!

) -> PluginNamespace[FrameT, FromNativeR_co]: ...
def is_native(self, native_object: object, /) -> bool: ...


Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ll.65-79, familiar territory here, all good

Comment on lines +97 to +108
Returns:
If the following conditions are met
- at least 1 plugin is installed
- at least 1 installed plugin supports `type(native_object)`
Then for the **first matching plugin**, the result of the call below.
This *should* be an object accepted by a Narwhals Dataframe, Lazyframe, or Series:
plugin: Plugin
plugin.__narwhals_namespace__(version).from_native(native_object)
In all other cases, `None` is returned instead.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I sense a rhetorical question, but I don't quite see far enough. I wondered about the line

compliant_object = plugins.from_native(native_object, version) in translate.py.

But it tries this and if it returns None, carries on with the rest of translate. (I tried it by uninstalling nw-daft & running some nw example with pandas). So that's not a problem..

Comment on lines +82 to +87
def _iter_from_native(native_object: Any, version: Version) -> Iterator[CompliantAny]:
for entry_point in _discover_entrypoints():
plugin: Plugin = entry_point.load()
if _is_native_plugin(native_object, plugin):
compliant_namespace = plugin.__narwhals_namespace__(version=version)
yield compliant_namespace.from_native(native_object)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ym-pett this is related to (#2 (comment)), and (#2 (comment))

Here is the existing section of _from_native_impl, but updated to use the new references.

I've added some comments which are mainly focused on the Iterator and None parts

Important

None can mean 3 quite different things here

def _from_native_impl(
    native_object: Any,
    *,
    pass_through: bool = False,
    eager_only: bool = False,
    eager_or_interchange_only: bool,
    series_only: bool,
    allow_series: bool | None,
    version: Version,
) -> Any | None:
    from narwhals.translate import _translate_if_compliant

    for entry_point in _discover_entrypoints():
        plugin = entry_point.load()
        if _is_native_plugin(native_object, plugin):
            compliant_namespace = plugin.__narwhals_namespace__(version=version)
            compliant_object = compliant_namespace.from_native(native_object)
            translate_result = _translate_if_compliant(
                compliant_object,
                pass_through=pass_through,
                eager_only=eager_only,
                eager_or_interchange_only=eager_or_interchange_only,
                series_only=series_only,
                allow_series=allow_series,
                version=version,
            )
            # `None` means *all of*:
            # - We have at least 1 plugin
            # - A plugin **said they could** handle `native_object`
            # - The plugin also had `CompliantNamespace.from_native`
            #   - But what that returned didn't implement `__narwhals_{dataframe,lazyframe,series}__`
            # - The caller of `nw.from_native` used `pass_through=True`
            return translate_result  # noqa: RET504

    # `None` means *either*:
    # (No plugins)
    #   - `_discover_entrypoints()` returned an empty iterable
    # (Has plugin(s))
    #   - But **couldn't** handle `native_object`
    #   - Every plugin returned `False` on `_is_native_plugin`
    return None

Copy link
Owner

@ym-pett ym-pett Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ouch, so that's actually not great, as we won't know why we're not managing to read our plugin!

I'll need to have a think about how the situation is with the new version 🤔

Comment on lines +97 to +108
Returns:
If the following conditions are met
- at least 1 plugin is installed
- at least 1 installed plugin supports `type(native_object)`
Then for the **first matching plugin**, the result of the call below.
This *should* be an object accepted by a Narwhals Dataframe, Lazyframe, or Series:
plugin: Plugin
plugin.__narwhals_namespace__(version).from_native(native_object)
In all other cases, `None` is returned instead.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, I think the examples worked, I'm a little embarrassed now that I didn't see the problem first time round: we only get a call to the plugin.__narwhals_namespace with the first matching plugin, if I've got this right, it won't care about the next plugin after that, even if it had the necessary functions?

Copy link
Owner

@ym-pett ym-pett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hitting 'approve' again because I think I've done my usual and not submitted some comments, uff.

@ym-pett ym-pett merged commit 9bbc0a6 into ym-pett:tidy_plugin_utils Sep 25, 2025
25 of 33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants