-
Notifications
You must be signed in to change notification settings - Fork 0
refactor: plugins uno reverse card
#2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There's only one module in the package, we can always switch back to a package if needed later
Got a few more typing/docs tweaks to follow
I used it here (#2978 (comment)) ... but forgot by the next day 🤦♂️ (#2978 (comment))
|
ohh this is brilliant @dangotbanned! I will pull to my local and have a play with it and talk myself through the changes! (I think I need to add you as a collaborator to my forked repo, will try that) |
| Returns: | ||
| If the following conditions are met | ||
| - at least 1 plugin is installed | ||
| - at least 1 installed plugin supports `type(native_object)` | ||
| Then for the **first matching plugin**, the result of the call below. | ||
| This *should* be an object accepted by a Narwhals Dataframe, Lazyframe, or Series: | ||
| plugin: Plugin | ||
| plugin.__narwhals_namespace__(version).from_native(native_object) | ||
| In all other cases, `None` is returned instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this to describe the current behavior, but I'm not saying it should be the end goal.
(more qualifying) we don't need to change how this works any time soon 😅
But I'm curious if this description sparks any ideas on potential issues it may have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I sense a rhetorical question, but I don't quite see far enough. I wondered about the line
compliant_object = plugins.from_native(native_object, version) in translate.py.
But it tries this and if it returns None, carries on with the rest of translate. (I tried it by uninstalling nw-daft & running some nw example with pandas). So that's not a problem..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, I think the examples worked, I'm a little embarrassed now that I didn't see the problem first time round: we only get a call to the plugin.__narwhals_namespace with the first matching plugin, if I've got this right, it won't care about the next plugin after that, even if it had the necessary functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to be embarrassed!
if I've got this right, it won't care about the next plugin after that, even if it had the necessary functions?
💯
| ) | ||
| from narwhals.utils import Version | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also going to use this review to talk myself through the code and check my understanding with you, hope that's ok.
is __all__ below for protecting wildcard imports? i.e. stopping non-public ones getting imported too? (commenting out line doesn't break code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also going to use this review to talk myself through the code and check my understanding with you, hope that's ok.
No problem at __all__ 😉
is
__all__below for protecting wildcard imports? i.e. stopping non-public ones getting imported too? (commenting out line doesn't break code)
Yeah that is one of the reasons
The rest is documented in: https://typing.python.org/en/latest/spec/distributing.html#library-interface-public-and-private-symbols
from_nativeis the only one we are using in another modulePluginis intended to be a guide for what extension authors should put in their module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dangotbanned, can I check I understand correctly: having the Plugin protocol with the two top-level functions means that if future plugin authors don't read our write-up (it's cooking..) on how to create a plugin & put different function at the interface to their plugin, typing will shout at them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
& put different function at the interface to their plugin, typing will shout at them?
I wish! 😄
No, sadly, the only user-facing reason for it is documentation.
E.g. we say something like:
The
narwhals.plugins.Pluginprotocol defines what we expect to find in<plugin_package_name>.__init__.py
Internally though it is what give us typing from https://github.com/narwhals-dev/narwhals/blob/7bcec6199fe83ef2ddec9c76786c9c3c49612a1f/narwhals/plugins.py#L84
plugin: Plugin = entry_point.load()something_undefined doesn't exist on Plugin, so we're warned when we use it 👍
Without that, everything is Any, so we get no help on this 👎
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤯 ohhh I think you've just brought my understanding of Protocols forward by a biig jump. Thanks so much!! I read documentation/tutorials but they often remain surface.
|
|
||
|
|
||
| __all__ = ["Plugin", "from_native"] | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ll.27-39: here we're setting up the typing groundwork so that we can work with objects from plugins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I don't know why we need both a covariant (FromNativeR_co )and invariant version (FrameT) of Frame, if that's what's going on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I don't know why we need both a covariant (FromNativeR_co )and invariant version (FrameT) of Frame, if that's what's going on?
Ahhhh the variance part is related, but don't worry about that for now!
I've written too much lately about this 😭
These have some breadcrumbs for now:
- https://github.com/narwhals-dev/narwhals/pull/3114/files#r2333856437
- "chore: simplify parsing expressions in group-by"" narwhals-dev/narwhals#3110
This is the meat of the question:
why we need both
FromNativeR_coandFrameT?
CompliantNamespaceintroduces theFramerequirement, but what aboutSeries?{Eager,Lazy}Namespacehave more complete definitions- From there, we want the return type(s) of
from_native
- From there, we want the return type(s) of
PluginNamespace therefore says:
For however CompliantNamespace uses it's first TypeVar:
- I accept any kind of compliant frame
- Which defaults to both eager and lazy
FrameT = TypeVar(
"FrameT",
bound="CompliantFrameAny",
default="CompliantDataFrameAny | CompliantLazyFrameAny",
)
class PluginNamespace(CompliantNamespace[FrameT, Any])Additionally, here's what I know about the return of my from_native method:
- It could be any compliant frame, or series
- Which defaults to everything
CompliantAny: TypeAlias = (
"CompliantDataFrameAny | CompliantLazyFrameAny | CompliantSeriesAny"
)
"""A statically-unknown, Compliant object originating from a plugin."""
FromNativeR_co = TypeVar(
"FromNativeR_co", bound=CompliantAny, covariant=True, default=CompliantAny
)
class PluginNamespace(CompliantNamespace[FrameT, Any], Protocol[FrameT, FromNativeR_co]):
def from_native(self, data: Any, /) -> FromNativeR_co: ...Important
These TypeVars happen to be the same for lazy-only backends
But pandas, polars, pyarrow have more funky stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woa.. yeah it's a lot.. I will trust the process, I think this is going to be a long-term project for me to first absorb this knowledge then learn to apply it!
| yield compliant_namespace.from_native(native_object) | ||
|
|
||
|
|
||
| def from_native(native_object: Any, version: Version) -> CompliantAny | None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from_native - this is the function we'll actually call in translate. At first I was confused by the name _iter_from_native, but now I get it 😅 as that's what actually iterates! (or rather the generator it outputs can be iterated over with next.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the return statement, does next(_iter_from_native(native_object, version), None) mean that it'll try to iterate, but if there's no plugin, it'll return None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the return statement, does
next(_iter_from_native(native_object, version), None)mean that it'll try to iterate, but if there's no plugin, it'll return None?
Almost 😅
Try experimenting/extending/fixing this
Show playground
E.g. in an interactive session, try passing any of plugins_* into the functions and see what happens 🙂
from __future__ import annotations
from typing import TYPE_CHECKING, Any, TypedDict
# ruff: noqa: T201
if TYPE_CHECKING:
from collections.abc import Iterable, Iterator, Sequence
class PluginMock(TypedDict):
name: str
is_native_plugin: bool
ok_1 = PluginMock(name="ok_1", is_native_plugin=True)
ok_2 = PluginMock(name="ok_2", is_native_plugin=True)
not_ok = PluginMock(name="not_ok", is_native_plugin=False)
plugins_1 = []
plugins_2 = [ok_1]
plugins_3 = [not_ok]
plugins_4 = [not_ok, ok_1]
plugins_5 = [ok_1, not_ok]
plugins_6 = [ok_1, ok_2]
plugins_7 = [ok_2, ok_1]
def example_iterator(loaded_entrypoints: Iterable[PluginMock]) -> Iterator[PluginMock]:
for plugin in loaded_entrypoints:
name = plugin["name"]
is_native_plugin = plugin["is_native_plugin"]
print(f"Checking {name!r} ...")
if is_native_plugin:
print(f" Success: {name!r}")
yield plugin
else:
print(f" Fail: {name!r}")
print("Exhausted 🥵")
def consumers(plugins: Sequence[PluginMock]) -> Any:
a = example_iterator(plugins)
b = next(example_iterator(plugins))
c = next(example_iterator(plugins), "default?")
d = list(example_iterator(plugins))
e_it = example_iterator(plugins)
e_1 = next(e_it)
e_2 = next(e_it, "another default?")
f, *g = example_iterator(plugins)
return a, b, c, d, e_1, e_2, f, gRelating back to (#2 (comment)) ...
What we know for sure is that None means we didn't find a plugin that could handle native_object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nicee! will have a play now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohhh wow this is soooo weird. It's making my eyes bleed and I needed print statements; still don't quite understand why it's happening, but at least I can see it happening:
from plugins5 onwards things get weird, for b & c we don't seem to iterate over the full list of plugins? So as long as the first plugin is ok, it stops iterating? I need to still figure out why though 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ps: learned about extended iterable unpacking in the process :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from
plugins_5onwards things get weird, for b & c we don't seem to iterate over the full list of plugins?
So as long as the first plugin is ok, it stops iterating?
I need to still figure out why though
@ym-pett here's some docs links that may be helpful, in order of accessibility 🙂:
- python/howto/functional/iterators
- python/tutorial/classes/iterators
- python/library/functions/next
- python/reference/expressions/generator.__next__
The key parts in (#2 (comment)) are:
- It only
yields whenis_native_plugin == True - How many times that happens will depend on both
- The way you consume the iterator
- How many items pass that condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking back at (#2 (comment))
we don't need to change how this works any time soon 😅
But I'm curious if this description sparks any ideas on potential issues it may have?
Say for example we had 2 plugins (P1, P2) which support native_object.
Currently, P2 will never be used - because we always use the first (P1).
I think that's a bit of an issue alone.
Now, consider that P1 could still later fail during the call to _translate_if_compliant. Our problem is then that we have no means to backtrack and try any other plugin - because we bet on the first working 😳.
By using an Iterator, we could instead do something like this (later, not in this PR 😅)
from __future__ import annotations
from typing import TYPE_CHECKING, Any, TypedDict
# ruff: noqa: T201
if TYPE_CHECKING:
from collections.abc import Iterable, Iterator, Sequence
class PluginMock(TypedDict):
name: str
is_native_plugin: bool
ok_1 = PluginMock(name="ok_1", is_native_plugin=True)
ok_2 = PluginMock(name="ok_2", is_native_plugin=True)
not_ok = PluginMock(name="not_ok", is_native_plugin=False)
def example_iterator(loaded_entrypoints: Iterable[PluginMock]) -> Iterator[PluginMock]:
for plugin in loaded_entrypoints:
name = plugin["name"]
is_native_plugin = plugin["is_native_plugin"]
print(f"Checking {name!r} ...")
if is_native_plugin:
print(f" Success: {name!r}")
yield plugin
else:
print(f" Fail: {name!r}")
print("Exhausted 🥵")
def translate_if_compliant(thing: Any) -> Any:
if thing is ok_1:
return None
return thing
def from_native() -> Any:
"""Super verbose control-flow to make the example clearer."""
plugins_8 = [ok_1, not_ok, ok_2]
it = example_iterator(plugins_8)
first = next(it, None)
if first is not None:
first_result = translate_if_compliant(first)
if first_result is not None:
return first_result
second = next(it, None)
if second is not None:
second_result = translate_if_compliant(second)
if second_result is not None:
return second_result
return NoneThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you so much for this @dangotbanned !! Sorry I seem to have forgot to hit 'comment' again! 🤦
I need to read the links you've given me & play with this!
| ) -> PluginNamespace[FrameT, FromNativeR_co]: ... | ||
| def is_native(self, native_object: object, /) -> bool: ... | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ll.65-79, familiar territory here, all good
| Returns: | ||
| If the following conditions are met | ||
| - at least 1 plugin is installed | ||
| - at least 1 installed plugin supports `type(native_object)` | ||
| Then for the **first matching plugin**, the result of the call below. | ||
| This *should* be an object accepted by a Narwhals Dataframe, Lazyframe, or Series: | ||
| plugin: Plugin | ||
| plugin.__narwhals_namespace__(version).from_native(native_object) | ||
| In all other cases, `None` is returned instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I sense a rhetorical question, but I don't quite see far enough. I wondered about the line
compliant_object = plugins.from_native(native_object, version) in translate.py.
But it tries this and if it returns None, carries on with the rest of translate. (I tried it by uninstalling nw-daft & running some nw example with pandas). So that's not a problem..
| def _iter_from_native(native_object: Any, version: Version) -> Iterator[CompliantAny]: | ||
| for entry_point in _discover_entrypoints(): | ||
| plugin: Plugin = entry_point.load() | ||
| if _is_native_plugin(native_object, plugin): | ||
| compliant_namespace = plugin.__narwhals_namespace__(version=version) | ||
| yield compliant_namespace.from_native(native_object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ym-pett this is related to (#2 (comment)), and (#2 (comment))
Here is the existing section of _from_native_impl, but updated to use the new references.
I've added some comments which are mainly focused on the Iterator and None parts
Important
None can mean 3 quite different things here
def _from_native_impl(
native_object: Any,
*,
pass_through: bool = False,
eager_only: bool = False,
eager_or_interchange_only: bool,
series_only: bool,
allow_series: bool | None,
version: Version,
) -> Any | None:
from narwhals.translate import _translate_if_compliant
for entry_point in _discover_entrypoints():
plugin = entry_point.load()
if _is_native_plugin(native_object, plugin):
compliant_namespace = plugin.__narwhals_namespace__(version=version)
compliant_object = compliant_namespace.from_native(native_object)
translate_result = _translate_if_compliant(
compliant_object,
pass_through=pass_through,
eager_only=eager_only,
eager_or_interchange_only=eager_or_interchange_only,
series_only=series_only,
allow_series=allow_series,
version=version,
)
# `None` means *all of*:
# - We have at least 1 plugin
# - A plugin **said they could** handle `native_object`
# - The plugin also had `CompliantNamespace.from_native`
# - But what that returned didn't implement `__narwhals_{dataframe,lazyframe,series}__`
# - The caller of `nw.from_native` used `pass_through=True`
return translate_result # noqa: RET504
# `None` means *either*:
# (No plugins)
# - `_discover_entrypoints()` returned an empty iterable
# (Has plugin(s))
# - But **couldn't** handle `native_object`
# - Every plugin returned `False` on `_is_native_plugin`
return NoneThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ouch, so that's actually not great, as we won't know why we're not managing to read our plugin!
I'll need to have a think about how the situation is with the new version 🤔
| Returns: | ||
| If the following conditions are met | ||
| - at least 1 plugin is installed | ||
| - at least 1 installed plugin supports `type(native_object)` | ||
| Then for the **first matching plugin**, the result of the call below. | ||
| This *should* be an object accepted by a Narwhals Dataframe, Lazyframe, or Series: | ||
| plugin: Plugin | ||
| plugin.__narwhals_namespace__(version).from_native(native_object) | ||
| In all other cases, `None` is returned instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, I think the examples worked, I'm a little embarrassed now that I didn't see the problem first time round: we only get a call to the plugin.__narwhals_namespace with the first matching plugin, if I've got this right, it won't care about the next plugin after that, even if it had the necessary functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hitting 'approve' again because I think I've done my usual and not submitted some comments, uff.


Related issues
If you have comments or can explain your changes, please do so below
Thanks for the ping @ym-pett!
I had a few commits, but I forgot to check permissions first 😅
Show screenshot
I'm guess there's something in docs/permission-levels-for-a-personal-account-repository?
Anyway, here I am ...
pluginsuno reverse card #2) to merge intoPlease feel free to fire away any comments/questions/suggestions before merging 🙂