-
Notifications
You must be signed in to change notification settings - Fork 39
feat: geopandas extension; explore method #860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
lonboard/geopandas.py
Outdated
| return apply_categorical_cmap(cat_codes, temp_cmap) | ||
|
|
||
|
|
||
| def _query_name(name: str) -> CartoBasemap: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer just depending on xyzservices for the geopandas integration:
I don't think we should implement our own mapping. After all, the cartodb basemap styles are defined in xyzservices too, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I mean depending on it conditionally from this module, not as a required dependency of the package.
Also, soon, we'll support passing in an xyzservices object directly. See #494
| _QUERY_NAME_TRANSLATION = str.maketrans(dict.fromkeys("., -_/", "")) | ||
| _basemap_providers = { | ||
| "CartoDB Positron": CartoBasemap.Positron, | ||
| "CartoDB Positron No Label": CartoBasemap.PositronNoLabels, | ||
| "CartoDB Darkmatter": CartoBasemap.DarkMatter, | ||
| "CartoDB Darkmatter No Label": CartoBasemap.DarkMatterNoLabels, | ||
| "CartoDB Voyager": CartoBasemap.Voyager, | ||
| "CartoDB Voyager No Label": CartoBasemap.VoyagerNoLabels, | ||
| } | ||
| # Convert keys to lower case without spaces | ||
| _BASEMAP_PROVIDERS = { | ||
| k.translate(_QUERY_NAME_TRANSLATION).lower(): v | ||
| for k, v in _basemap_providers.items() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So by depending on xyzservices here we should be able to remove all of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree that would be ideal and ive never loved this mapping but there was no roadmap to supporting xyzservices when i started this a year ago
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even without a full-on xyzservices integration in lonboard.basemap, we can depend on xyzservices here and pass the URL on as a string
lonboard/geopandas.py
Outdated
| return apply_categorical_cmap(cat_codes, temp_cmap) | ||
|
|
||
|
|
||
| def _query_name(name: str) -> CartoBasemap: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I mean depending on it conditionally from this module, not as a required dependency of the package.
Also, soon, we'll support passing in an xyzservices object directly. See #494
lonboard/geopandas.py
Outdated
|
|
||
| def explore( # noqa: C901, PLR0912, PLR0913, PLR0915 | ||
| self, | ||
| *, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is the opposite of what I said before, but I think we should remove * so that the signature is exactly the same as the upstream GeoDataFrame.explore (even though the thought of someone passing in all these parameters positionally is 😬 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this may (shall) change in geopandas itself, allowing only column to be positional. But that is yet to be discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like this to be in line with future GeoPandas, so if GeoPandas is likely to add in *, then we might as well do it here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not that easy as it is a breaking change... but it would be a sensible thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's keep the * here after column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'll still update the arguments to put them in the same order (just because) and make sure the default values are consistent. My preference would be to stick lonboard-specific stuff like wireframe and elevation at the end of the signature (assuming they'll continue to pass through to the existing explore's **kwargs), though I can remove them entirely if that's still your preference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'll still update the arguments to put them in the same order (just because)
👍 that'll also make it easier to check what functionality is the same vs different from upstream
My preference would be to stick lonboard-specific stuff like wireframe and elevation at the end of the signature
Let's remove any Lonboard-specific stuff for now and just get the minimal integration merged. That will be easier and faster to review and merge, and then we can add Lonboard-specific stuff as a follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok i updated as discussed. Two small differences are that the alpha and nan_color arguments are at the end, after all the existing kwargs in the existing explore. In the folium-based version, transparency is handled py passing the opacity option to the style kwds, so alpha isn't available, and nan_color would be passed to 'color' in missing_kwds, which isn't provided here
I can drop those if we'd like. I dont have a good solution for transparency if we drop the alpha keyword. I could add a 'missing_kwds' argument that accepts color instead of exposing nan_color directly, which would mirror the folium behavior, though it would be kinda cumbersome to work with a dict jsut for that one option
lonboard/geopandas.py
Outdated
| column: str | None = None, | ||
| cmap: str | None = None, | ||
| scheme: str | None = None, | ||
| k: int | None = 6, | ||
| categorical: bool = False, | ||
| elevation: str | ArrayLike | None = None, | ||
| elevation_scale: float | None = 1, | ||
| alpha: float | None = 1, | ||
| layer_kwargs: ScatterplotLayerKwargs | ||
| | PathLayerKwargs | ||
| | PolygonLayerKwargs | ||
| | None = None, | ||
| map_kwargs: MapKwargs | None = None, | ||
| classification_kwds: dict[str, str | IntFloat | ArrayLike | bool] | None = None, | ||
| nan_color: list[int] | NDArray[np.uint8] | None = None, | ||
| color: str | None = None, | ||
| vmin: float | None = None, | ||
| vmax: float | None = None, | ||
| wireframe: bool = False, | ||
| tiles: str | None = None, | ||
| highlight: bool = False, | ||
| m: Map | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to make this signature exactly the same as the upstream geopandas.explore signature. There are a bunch of differences:
kexists in both but has a different defaultcolorexists ingeopandas.explorebut not hereelevation,elevation_scale, etc. exist here but not ingeopandas.explore.scheme,k,highlight, etc exist in both but have a different ordering
In particular, this gets to the point of this integration and of this PR, which has been kinda unclear to me thus far.
This API is essentially "a basket of arguments to do everything and anything", which in my opinion is awful API design. It's too easy to have complex interactions between arguments. It might take a few lines instead of one to make a map with Lonboard's standard APIs, but each layer and map object is very predictable with how they interact.
We already have a function in lonboard to take generic input (including GeoDataFrames) and plot them. That's viz. It's simple, straightforward, and extensible. I don't want to have another function like viz (at least, officially part of Lonboard)
So that said, I think it's important that, if we want to merge this functionality into Lonboard, it should have exactly the same API as GeoDataFrame.explore, so that a user can change GeoDataFrame.explore to GeoDataFrame.lb.explore and it'll always work out of the box.
Another possibility is talking to geopandas about natively implementing this there.
And besides that, it's also a fine option to have this as a standalone third-party Python package, lonboard-geopandas!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can stick it somewhere else if you prefer. Martin said he'd prefer it here not in geopandas
So that said, I think it's important that, if we want to merge this functionality into Lonboard, it should have exactly the same API as GeoDataFrame.explore, so that a user can change GeoDataFrame.explore to GeoDataFrame.lb.explore and it'll always work out of the box.
this was always my intent, except that the lonboard version would have some additional args to support lonboard's additional features (e.g. elevation, wireframe and scale). Given those additional features i'd expect people could mostly swap from the vanilla explore to this version, but not always. I think the "ability to do anything and everything" really comes from the desire to match exactly the geopandas explore function. What lonboard doesnt have is the ability to use native classification schemes, which is a pain.
color exists in both (its line 76 here). Different defaults and ordering are simple fixes if you do want this here. Personally i think this would be much easier to have in lonboard itself for discoverability and less need for any additional package, especially since its only a little bit of code, but again, if you find it too burdensome i can find somewhere else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as a first implementation, I'd leave off any additional Lonboard-specific features and keep it exactly the same as the upstream GeoPandas signature, with the same ordering.
We should allow all the same parameters, and if one doesn't make sense in Lonboard, we should print a warning, but otherwise make sure something still renders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still tepid about maintenance of this.
One idea is seeing about refactoring some of the upstream GeoPandas code, so that it isn't one big monolithic function. It would be nice if we could call geopandas.explore._explore_prep to manage a lot of the data prep that isn't related to the actual plotting in folium. Then we wouldn't have to duplicate all of the internals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i get it; i'm fine with whatever you choose. I can all but guarantee geopandas has no bandwidth to look at their internals, but on the slight off-chance, i'll let @martinfleis weigh in. I can get rid of the lonboard-specific arguments if you prefer, though i do think having them available is pretty handy. One argument in favor of keeping them around is the vanilla explore already accepts catchall keywords, so it would always work to switch between functions, even with the extra params
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another thing that's tough about trying to do the perfect 1:1 mapping with the existing explore function is that folium has a lot of stuff that works differently than lonboard. While a lot of the (admittedly ugly) code in this PR tries to handle those differences appropriately and stick stuff 'where it belongs', there would still be a lot of folium-specific things that don't make sense here, e.g. highlight, popup, attr, marker_type, marker_kwds, legend_kwds, highlight_kwds, style_kwds.
That is, lonboard doesn't have legends or attribution; 'marker_kwds' and 'style_kwds' are instead passed to their layer-specific analogs (like PolygonLayerKwargs); lonboard has side_panel versus folium's popup; sometimes the nomenclature is different (tooltip vs show_tooltip). In some cases, those are fairly easy to shoehorn, but in others (like side_panel vs popup) shooting for perfect parity probably is not an ideal goal.
A small handful of related thoughts:
As both a user and an instructor, I'd find a lonboard-based explore-like function (that mostly works the same as the existing one--even if only for the classification schemes, which are indispensable for anything human geography-related) to be incredibly valuable, especially if it existed somewhere central like geopandas or lonboard. Indeed, I'd probably never bother with the vanilla explore anymore, given all the benefits of lonboard over folium. I'm confident that view would be shared widely. But if neither geopandas nor lonboard want that functionality, i'll find somewhere that's conveninent for me--but then only my students and a handful of colleagues will know where to find it and that feels like a net loss to the ecosystem. I'd be more than happy to contribute to ongoing maintenance, but the infrastructure overhead of putting together yet another package, just for this, probably isn't tenable. I doubt that changes anyone else's calculus, but just figured i'd lay it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another possibility is talking to geopandas about natively implementing this there.
We did and it is a hard no, sorry. That discussion was what kickstarted this PR in the first place.
One idea is seeing about refactoring some of the upstream GeoPandas code, so that it isn't one big monolithic function. It would be nice if we could call geopandas.explore._explore_prep to manage a lot of the data prep that isn't related to the actual plotting in folium. Then we wouldn't have to duplicate all of the internals.
We can potentially talk about this but I would need to know which parts of the code you'd like to move out to a separate, presumable public, function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code here is already dramatically simpler than explore which clocks in over 1k lines. That's mostly because this implementation uses mapclassify more efficiently and lonboard already has its own color application functions (and doesn't need to write to json, deal with branca, or generate a legend). Maybe some of the categorical or basemap handling would be useful once #494 is ready, but that would probably be the extent of it
|
I think it's probably worth having this in a A couple questions on the mechanics of how this works:
|
yes
I don't think so.
no strong feelings,
No clue. |
Good point. Let's use |
|
for some reason the precommit hook removes the |
|
I think you can put |
| """ | ||
| additional loboard-specific arguments to consider | ||
|
|
||
| # only polygons have z | ||
| if ["Polygon", "MultiPolygon"] in gdf.geometry.geom_type.unique(): | ||
| layer_kwargs["get_elevation"] = elevation | ||
| layer_kwargs["elevation_scale"] = elevation_scale | ||
| layer_kwargs["wireframe"] = wireframe | ||
| if isinstance(elevation, str): | ||
| if elevation in gdf.columns: | ||
| elevation: Series = gdf[elevation] | ||
| else: | ||
| raise ValueError( | ||
| f"the designated height column {elevation} is not in the dataframe", | ||
| ) | ||
| if not pd.api.types.is_numeric_dtype(elevation): | ||
| raise ValueError("elevation must be a numeric data type") | ||
| if elevation is not None: | ||
| layer_kwargs["extruded"] = True | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a comment...? If we don't plan to use the code we should remove it
| layer_kwargs["extruded"] = True | ||
| """ | ||
|
|
||
| new_m: Map = viz( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this type hint can be inferred
| new_m: Map = viz( | |
| new_m = viz( |
| def _query_name(name: str) -> CartoStyle: | ||
| """Return basemap URL based on the name query (mimicking behavior from xyzservices). | ||
|
|
||
| Returns a matching basemap from name contains the same letters in the same | ||
| order as the provider's name irrespective of the letter case, spaces, dashes | ||
| and other characters. See examples for details. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| name : str | ||
| Name of the tile provider. Formatting does not matter. | ||
|
|
||
| Returns | ||
| ------- | ||
| match: lonboard.basemap | ||
|
|
||
| Examples | ||
| -------- | ||
| >>> import xyzservices.providers as xyz | ||
|
|
||
| All these queries return the same ``CartoDB.Positron`` TileProvider: | ||
|
|
||
| >>> xyz._query_name("CartoDB Positron") | ||
| >>> xyz._query_name("cartodbpositron") | ||
| >>> xyz._query_name("cartodb-positron") | ||
| >>> xyz._query_name("carto db/positron") | ||
| >>> xyz._query_name("CARTO_DB_POSITRON") | ||
| >>> xyz._query_name("CartoDB.Positron") | ||
|
|
||
| """ | ||
| name_clean = name.translate(_QUERY_NAME_TRANSLATION).lower() | ||
| if name_clean in _BASEMAP_PROVIDERS: | ||
| return _BASEMAP_PROVIDERS[name_clean] | ||
|
|
||
| raise ValueError(f"No matching provider found for the query '{name}'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can remove this in favor of xyzservices
supercedes #728