Skip to content
Open
Show file tree
Hide file tree
Changes from 19 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3,741 changes: 3,741 additions & 0 deletions examples/explore.ipynb

Large diffs are not rendered by default.

310 changes: 310 additions & 0 deletions lonboard/geopandas.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,310 @@
from __future__ import annotations

from typing import TYPE_CHECKING

import geopandas as gpd
import numpy as np
import pandas as pd
from numpy import uint8

from lonboard import Map, viz
from lonboard.basemap import CartoBasemap
from lonboard.colormap import apply_categorical_cmap, apply_continuous_cmap

if TYPE_CHECKING:
from numpy.typing import ArrayLike, NDArray
from pandas.core.series import Series

from lonboard.types.layer import (
IntFloat,
PathLayerKwargs,
PolygonLayerKwargs,
ScatterplotLayerKwargs,
)
from lonboard.types.map import MapKwargs

__all__ = ["LonboardAccessor"]

_QUERY_NAME_TRANSLATION = str.maketrans(dict.fromkeys("., -_/", ""))
_basemap_providers = {
"CartoDB Positron": CartoBasemap.Positron,
"CartoDB Positron No Label": CartoBasemap.PositronNoLabels,
"CartoDB Darkmatter": CartoBasemap.DarkMatter,
"CartoDB Darkmatter No Label": CartoBasemap.DarkMatterNoLabels,
"CartoDB Voyager": CartoBasemap.Voyager,
"CartoDB Voyager No Label": CartoBasemap.VoyagerNoLabels,
}
# Convert keys to lower case without spaces
_BASEMAP_PROVIDERS = {
k.translate(_QUERY_NAME_TRANSLATION).lower(): v
for k, v in _basemap_providers.items()
}
Comment on lines 27 to 40
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So by depending on xyzservices here we should be able to remove all of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree that would be ideal and ive never loved this mapping but there was no roadmap to supporting xyzservices when i started this a year ago

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even without a full-on xyzservices integration in lonboard.basemap, we can depend on xyzservices here and pass the URL on as a string



@pd.api.extensions.register_dataframe_accessor("lb")
class LonboardAccessor:
"""Geopandas Extension class to provide the `explore` method."""

def __init__(self, pandas_obj: gpd.GeoDataFrame) -> None:
"""Initialize geopandas extension."""
self._validate(pandas_obj)
self._obj = pandas_obj

@staticmethod
def _validate(obj: gpd.GeoDataFrame) -> None:
if not isinstance(obj, gpd.GeoDataFrame):
raise TypeError("Input Must be a geodataframe")

def explore( # noqa: C901, PLR0912, PLR0913, PLR0915
self,
*,
Copy link
Member

@kylebarron kylebarron Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is the opposite of what I said before, but I think we should remove * so that the signature is exactly the same as the upstream GeoDataFrame.explore (even though the thought of someone passing in all these parameters positionally is 😬 )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this may (shall) change in geopandas itself, allowing only column to be positional. But that is yet to be discussed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like this to be in line with future GeoPandas, so if GeoPandas is likely to add in *, then we might as well do it here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not that easy as it is a breaking change... but it would be a sensible thing to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's keep the * here after column

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'll still update the arguments to put them in the same order (just because) and make sure the default values are consistent. My preference would be to stick lonboard-specific stuff like wireframe and elevation at the end of the signature (assuming they'll continue to pass through to the existing explore's **kwargs), though I can remove them entirely if that's still your preference?

Copy link
Member

@kylebarron kylebarron Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'll still update the arguments to put them in the same order (just because)

👍 that'll also make it easier to check what functionality is the same vs different from upstream

My preference would be to stick lonboard-specific stuff like wireframe and elevation at the end of the signature

Let's remove any Lonboard-specific stuff for now and just get the minimal integration merged. That will be easier and faster to review and merge, and then we can add Lonboard-specific stuff as a follow up

Copy link
Contributor Author

@knaaptime knaaptime Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok i updated as discussed. Two small differences are that the alpha and nan_color arguments are at the end, after all the existing kwargs in the existing explore. In the folium-based version, transparency is handled py passing the opacity option to the style kwds, so alpha isn't available, and nan_color would be passed to 'color' in missing_kwds, which isn't provided here

I can drop those if we'd like. I dont have a good solution for transparency if we drop the alpha keyword. I could add a 'missing_kwds' argument that accepts color instead of exposing nan_color directly, which would mirror the folium behavior, though it would be kinda cumbersome to work with a dict jsut for that one option

column: str | None = None,
cmap: str | None = None,
scheme: str | None = None,
k: int | None = 6,
categorical: bool = False,
elevation: str | ArrayLike | None = None,
elevation_scale: float | None = 1,
alpha: float | None = 1,
layer_kwargs: ScatterplotLayerKwargs
| PathLayerKwargs
| PolygonLayerKwargs
| None = None,
map_kwargs: MapKwargs | None = None,
classification_kwds: dict[str, str | IntFloat | ArrayLike | bool] | None = None,
nan_color: list[int] | NDArray[np.uint8] | None = None,
color: str | None = None,
vmin: float | None = None,
vmax: float | None = None,
wireframe: bool = False,
tiles: str | None = None,
highlight: bool = False,
m: Map | None = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to make this signature exactly the same as the upstream geopandas.explore signature. There are a bunch of differences:

  • k exists in both but has a different default
  • color exists in geopandas.explore but not here
  • elevation, elevation_scale, etc. exist here but not in geopandas.explore.
  • scheme, k, highlight, etc exist in both but have a different ordering

In particular, this gets to the point of this integration and of this PR, which has been kinda unclear to me thus far.

This API is essentially "a basket of arguments to do everything and anything", which in my opinion is awful API design. It's too easy to have complex interactions between arguments. It might take a few lines instead of one to make a map with Lonboard's standard APIs, but each layer and map object is very predictable with how they interact.

We already have a function in lonboard to take generic input (including GeoDataFrames) and plot them. That's viz. It's simple, straightforward, and extensible. I don't want to have another function like viz (at least, officially part of Lonboard)


So that said, I think it's important that, if we want to merge this functionality into Lonboard, it should have exactly the same API as GeoDataFrame.explore, so that a user can change GeoDataFrame.explore to GeoDataFrame.lb.explore and it'll always work out of the box.

Another possibility is talking to geopandas about natively implementing this there.

And besides that, it's also a fine option to have this as a standalone third-party Python package, lonboard-geopandas!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can stick it somewhere else if you prefer. Martin said he'd prefer it here not in geopandas

So that said, I think it's important that, if we want to merge this functionality into Lonboard, it should have exactly the same API as GeoDataFrame.explore, so that a user can change GeoDataFrame.explore to GeoDataFrame.lb.explore and it'll always work out of the box.

this was always my intent, except that the lonboard version would have some additional args to support lonboard's additional features (e.g. elevation, wireframe and scale). Given those additional features i'd expect people could mostly swap from the vanilla explore to this version, but not always. I think the "ability to do anything and everything" really comes from the desire to match exactly the geopandas explore function. What lonboard doesnt have is the ability to use native classification schemes, which is a pain.

color exists in both (its line 76 here). Different defaults and ordering are simple fixes if you do want this here. Personally i think this would be much easier to have in lonboard itself for discoverability and less need for any additional package, especially since its only a little bit of code, but again, if you find it too burdensome i can find somewhere else

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as a first implementation, I'd leave off any additional Lonboard-specific features and keep it exactly the same as the upstream GeoPandas signature, with the same ordering.

We should allow all the same parameters, and if one doesn't make sense in Lonboard, we should print a warning, but otherwise make sure something still renders.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still tepid about maintenance of this.

One idea is seeing about refactoring some of the upstream GeoPandas code, so that it isn't one big monolithic function. It would be nice if we could call geopandas.explore._explore_prep to manage a lot of the data prep that isn't related to the actual plotting in folium. Then we wouldn't have to duplicate all of the internals.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i get it; i'm fine with whatever you choose. I can all but guarantee geopandas has no bandwidth to look at their internals, but on the slight off-chance, i'll let @martinfleis weigh in. I can get rid of the lonboard-specific arguments if you prefer, though i do think having them available is pretty handy. One argument in favor of keeping them around is the vanilla explore already accepts catchall keywords, so it would always work to switch between functions, even with the extra params

Screenshot 2025-10-15 at 3 41 35 PM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another thing that's tough about trying to do the perfect 1:1 mapping with the existing explore function is that folium has a lot of stuff that works differently than lonboard. While a lot of the (admittedly ugly) code in this PR tries to handle those differences appropriately and stick stuff 'where it belongs', there would still be a lot of folium-specific things that don't make sense here, e.g. highlight, popup, attr, marker_type, marker_kwds, legend_kwds, highlight_kwds, style_kwds.

That is, lonboard doesn't have legends or attribution; 'marker_kwds' and 'style_kwds' are instead passed to their layer-specific analogs (like PolygonLayerKwargs); lonboard has side_panel versus folium's popup; sometimes the nomenclature is different (tooltip vs show_tooltip). In some cases, those are fairly easy to shoehorn, but in others (like side_panel vs popup) shooting for perfect parity probably is not an ideal goal.

A small handful of related thoughts:

As both a user and an instructor, I'd find a lonboard-based explore-like function (that mostly works the same as the existing one--even if only for the classification schemes, which are indispensable for anything human geography-related) to be incredibly valuable, especially if it existed somewhere central like geopandas or lonboard. Indeed, I'd probably never bother with the vanilla explore anymore, given all the benefits of lonboard over folium. I'm confident that view would be shared widely. But if neither geopandas nor lonboard want that functionality, i'll find somewhere that's conveninent for me--but then only my students and a handful of colleagues will know where to find it and that feels like a net loss to the ecosystem. I'd be more than happy to contribute to ongoing maintenance, but the infrastructure overhead of putting together yet another package, just for this, probably isn't tenable. I doubt that changes anyone else's calculus, but just figured i'd lay it out.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility is talking to geopandas about natively implementing this there.

We did and it is a hard no, sorry. That discussion was what kickstarted this PR in the first place.

One idea is seeing about refactoring some of the upstream GeoPandas code, so that it isn't one big monolithic function. It would be nice if we could call geopandas.explore._explore_prep to manage a lot of the data prep that isn't related to the actual plotting in folium. Then we wouldn't have to duplicate all of the internals.

We can potentially talk about this but I would need to know which parts of the code you'd like to move out to a separate, presumable public, function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code here is already dramatically simpler than explore which clocks in over 1k lines. That's mostly because this implementation uses mapclassify more efficiently and lonboard already has its own color application functions (and doesn't need to write to json, deal with branca, or generate a legend). Maybe some of the categorical or basemap handling would be useful once #494 is ready, but that would probably be the extent of it

) -> Map:
"""Explore a dataframe using lonboard and deckgl.

Keyword Args:
column : Name of column on dataframe to visualize on map.
cmap : Name of matplotlib colormap to use.
scheme : Name of a classification scheme defined by mapclassify.Classifier.
k : Number of classes to generate. Defaults to 6.
categorical : Whether the data should be treated as categorical or
continuous.
elevation : Name of column on the dataframe used to extrude each geometry or
an array-like in the same order as observations. Defaults to None.
elevation_scale : Constant scaler multiplied by elevation value.
alpha : Alpha (opacity) parameter in the range (0,1) passed to
mapclassify.util.get_color_array.
layer_kwargs : Additional keyword arguments passed to lonboard.viz layer
arguments (either `polygon_kwargs`, `scatterplot_kwargs`, or `path_kwargs`,
depending on input geometry type).
map_kwargs : Additional keyword arguments passed to lonboard.viz map_kwargs.
classification_kwds : Additional keyword arguments passed to
`mapclassify.classify`.
nan_color : Color used to shade NaN observations formatted as an RGBA list.
Defaults to [255, 255, 255, 255]. If no alpha channel is passed it is
assumed to be 255.
color : single or array of colors passed to Layer.get_fill_color
or a lonboard.basemap object, or a string to a maplibre style basemap.
vmin : Minimum value for color mapping.
vmax : Maximum value for color mapping.
wireframe : Whether to use wireframe styling in deckgl.
tiles : Either a known string {"CartoDB Positron",
"CartoDB Positron No Label", "CartoDB Darkmatter",
"CartoDB Darkmatter No Label", "CartoDB Voyager",
"CartoDB Voyager No Label"}
highlight : Whether to highlight each feature on mouseover (passed to
lonboard.Layer's auto_highlight). Defaults to False.
m: An existing Map object to plot onto.

Returns:
lonboard.Map
a lonboard map with geodataframe included as a Layer object.

"""
gdf = self._obj

if map_kwargs is None:
map_kwargs = {}
if classification_kwds is None:
classification_kwds = {}
if layer_kwargs is None:
layer_kwargs = {}
if isinstance(elevation, str):
if elevation in gdf.columns:
elevation: Series = gdf[elevation]
else:
raise ValueError(
f"the designated height column {elevation} is not in the dataframe",
)
if not pd.api.types.is_numeric_dtype(elevation):
raise ValueError("elevation must be a numeric data type")
if elevation is not None:
layer_kwargs["extruded"] = True
if nan_color is None:
nan_color = [255, 255, 255, 255]
if not pd.api.types.is_list_like(nan_color):
raise ValueError("nan_color must be an iterable of 3 or 4 values")
if len(nan_color) != 4:
if len(nan_color) == 3:
nan_color = np.append(nan_color, [255])
else:
raise ValueError("nan_color must be an iterable of 3 or 4 values")

# only polygons have z
if ["Polygon", "MultiPolygon"] in gdf.geometry.geom_type.unique():
layer_kwargs["get_elevation"] = elevation
layer_kwargs["elevation_scale"] = elevation_scale
layer_kwargs["wireframe"] = wireframe
layer_kwargs["auto_highlight"] = highlight

line = False # set color of lines, not fill_color
if ["LineString", "MultiLineString"] in gdf.geometry.geom_type.unique():
line = True
if color:
if line:
layer_kwargs["get_color"] = color
else:
layer_kwargs["get_fill_color"] = color
if column is not None:
try:
from matplotlib import colormaps
except ImportError as e:
raise ImportError(
"you must have matplotlib installed to style by a column",
) from e

if column not in gdf.columns:
raise ValueError(
f"the designated column {column} is not in the dataframe",
)
if gdf[column].dtype in ["O", "category"]:
categorical = True
if cmap is not None and cmap not in colormaps:
raise ValueError(
f"`cmap` must be one of {list(colormaps.keys())} but {cmap} was passed",
)
if cmap is None:
cmap = "tab20" if categorical else "viridis"
if categorical:
color_array = _get_categorical_cmap(gdf[column], cmap, nan_color, alpha)
elif scheme is None:
if vmin is None:
vmin: int | float = np.nanmin(gdf[column])
if vmax is None:
vmax: int | float = np.nanmax(gdf[column])
# minmax scale the column first, matplotlib needs 0-1
transformed = (gdf[column] - vmin) / (vmax - vmin)
color_array = apply_continuous_cmap(
values=transformed,
cmap=colormaps[cmap],
alpha=alpha,
)
else:
try:
from mapclassify._classify_API import _classifiers
from mapclassify.util import get_color_array

_klasses = list(_classifiers.keys())
_klasses.append("userdefined")
except ImportError as e:
raise ImportError(
"You must have the `mapclassify` package installed to use the `scheme` keyword",
) from e
if scheme.replace("_", "") not in _klasses:
raise ValueError(
f"The classification scheme must be a valid mapclassify classifier in {_klasses}, but {scheme} was passed instead",
)
if k is not None and "k" in classification_kwds:
# k passed directly takes precedence
classification_kwds.pop("k")
color_array = get_color_array(
gdf[column],
scheme=scheme,
k=k,
cmap=cmap,
alpha=alpha,
nan_color=nan_color,
**classification_kwds,
)

if line:
layer_kwargs["get_color"] = color_array

else:
layer_kwargs["get_fill_color"] = color_array
if tiles:
map_kwargs["basemap_style"] = _query_name(tiles)
new_m: Map = viz(
Copy link
Member

@kylebarron kylebarron Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this type hint can be inferred

Suggested change
new_m: Map = viz(
new_m = viz(

gdf,
polygon_kwargs=layer_kwargs,
scatterplot_kwargs=layer_kwargs,
path_kwargs=layer_kwargs,
map_kwargs=map_kwargs,
)
if m is not None:
new_m = m.add_layer(new_m)

return new_m


def _get_categorical_cmap(
categories: pd.Series,
cmap: str,
nan_color: str | NDArray[np.uint8] | NDArray[np.float64] | list[int],
alpha: float | None,
) -> NDArray[uint8]:
try:
from matplotlib import colormaps
except ImportError as e:
raise ImportError(
"this function requires the `matplotlib` package to be installed",
) from e

cat_codes = pd.Series(pd.Categorical(categories).codes, dtype="category")
# nans are encoded as -1 OR largest category depending on input type
# re-encode to always be last category
cat_codes = cat_codes.cat.rename_categories({-1: len(cat_codes.unique()) - 1})
unique_cats = categories.dropna().unique()
n_cats = len(unique_cats)
colors = colormaps[cmap].resampled(n_cats)(list(range(n_cats)), alpha, bytes=True)
colors = np.vstack([colors, nan_color])
temp_cmap = dict(zip(range(n_cats + 1), colors, strict=True))
return apply_categorical_cmap(cat_codes, temp_cmap)


def _query_name(name: str) -> CartoBasemap:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer just depending on xyzservices for the geopandas integration:

https://github.com/geopandas/geopandas/blob/4f9f361a2ba94db4c02e1fb5b6b2b09abf4278e6/geopandas/explore.py#L373-L386

I don't think we should implement our own mapping. After all, the cartodb basemap styles are defined in xyzservices too, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I mean depending on it conditionally from this module, not as a required dependency of the package.

Also, soon, we'll support passing in an xyzservices object directly. See #494

"""Return basemap URL based on the name query (mimicking behavior from xyzservices).

Returns a matching basemap from name contains the same letters in the same
order as the provider's name irrespective of the letter case, spaces, dashes
and other characters. See examples for details.

Parameters
----------
name : str
Name of the tile provider. Formatting does not matter.

Returns
-------
match: lonboard.basemap

Examples
--------
>>> import xyzservices.providers as xyz

All these queries return the same ``CartoDB.Positron`` TileProvider:

>>> xyz._query_name("CartoDB Positron")
>>> xyz._query_name("cartodbpositron")
>>> xyz._query_name("cartodb-positron")
>>> xyz._query_name("carto db/positron")
>>> xyz._query_name("CARTO_DB_POSITRON")
>>> xyz._query_name("CartoDB.Positron")

"""
name_clean = name.translate(_QUERY_NAME_TRANSLATION).lower()
if name_clean in _BASEMAP_PROVIDERS:
return _BASEMAP_PROVIDERS[name_clean]

raise ValueError(f"No matching provider found for the query '{name}'.")