ipyleaflet map is not rendered when running in papermill #296

Closed
spektom opened this issue Jan 28, 2019 · 20 comments

Comments

@spektom

spektom commented Jan 28, 2019

I have the following cell:

from IPython.display import display
from ipyleaflet import Map, Polyline

m = Map(center=(32.2, 34.8), zoom=12)
for r in df.itertuples():
    line = Polyline(
        locations=[[[r.start_latitude, r.start_longitude], [r.end_latitude, r.end_longitude]]],
        weight=15
    )
    m.add_layer(line)

display(m)

The map is displayed correctly when running interactively. However, when running the same notebook using papermill, it renders just this:

Map(basemap={'url': 'https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png', 'max_zoom': 19, 'attribution': 'Map …
@spektom
Author

spektom commented Jan 28, 2019

This issue seems related: jupyter/nbconvert#751

@MSeal
Member

MSeal commented Jan 28, 2019

Yes, it is that underlying issue. However, even finishing that PR won't actually render what you expect in most front-ends.

In general the widgets are only fully supported in JupyterLab. There are some specification gaps that prevent other systems from knowing what to provide to a front-end that has the appropriate js package to render the widget state.

There are some long-running discussions on the topic that I haven't been directly involved in. @jasongrout and @rgbkrk both have more context, but overall I think I can safely state that the display messages generated by executing widgets today rely on in-browser packages / state which backends like papermill (or nbconvert) won't be able to populate from within the kernel protocol.

From looking through a few generations of widgets, the information made available to the kernel manager varies quite a bit, and many circumvent the kernel's messaging protocol entirely. I believe this is the result of iterating on the design of widgets. I think we should continue that conversation in this thread to help steer the direction of widget state in the notebook format, so that we can reliably populate the JSON document with message outputs that front-ends supporting the widget can interpret correctly.

@MSeal
Member

MSeal commented Jan 28, 2019

A few areas where I see conflicts that are hard to resolve:

  1. Kernel display output for widgets not specifying the type of widget, default display information, and/or client rendering requirements. If the client needs a particular js package, it should perhaps be explicit in the output of the notebook.
  2. Widgets which rely on user input before releasing cell execution. There should be some sort of defined behavior encoded in the kernel for headless execution (an execution without a paired js client), or perhaps the kernel should fail fast and log a message that it couldn't satisfy the widget's requirements for user input.
  3. Widgets which rely on querying browser state. Either through querying specific DOM structures or earlier widget instances which hold state while the notebook is executing. In a headless execution pattern there's no way to reliably capture these expectations.
  4. Binary data being transferred between the browser and kernel / other endpoints without using the kernel message protocol or an extension of such.

There are additional execution constraints I probably missed, but these are the ones I am aware of from assisting with the Jupyter systems I've used or submitted code against.

Would love to hear suggestions from the more experienced Jupyter core folks for how the growing ecosystem of tools can support these highly useful widgets moving forward.

@MSeal
Member

MSeal commented Jan 28, 2019

One item to note is that I'd strongly prefer to avoid launching a paired JS execution context with papermill to enable widget execution. Maybe a separate papermill extension repo could support this, but it's riddled with complexity and pitfalls that would push papermill from a simple and easy to understand project to a much heavier tool that requires a lot more network, disk, dependency management, and inter-process support.

@MSeal
Member

MSeal commented Feb 9, 2019

We discussed the topic with some of the core widget maintainers this past week. There wasn't an immediate answer to solving this problem, though it was clear that the underlying issue of rendering widgets in naive clients isn't isolated to papermill or nbconvert. In particular specifying the package dependencies or default rendering would be a minimal step to supporting sane behavior in these circumstances. I'll open an issue on ipyleaflet to see if the developers there can coordinate to improve behavior for this interaction.

@jasongrout

It seems that there is confusion or misconceptions about how widgets work in the above comments. It sounds like it would be good to have some more information in the widgets docs addressing some of these points to prevent confusion.

A couple of points (and I'm happy to discuss these in more detail to try to find a good way things can work) are below.

In general the widgets are only fully supported in JupyterLab.

Widgets are fully supported in the classic notebook, and very nearly fully supported in JupyterLab (basically JLab is missing one last piece that I'm working on now - making it possible to save widget state in a notebook.)

Kernel display output for widgets not specifying the type of widget, default display information, and/or client rendering requirements. If the client needs a particular js package, it should perhaps be explicit in the output of the notebook.

In general, widgets in the client side involve two parts:

  • a model which maintains state; it is not visible, is maintained through a comm channel opened from the kernel, and exists document-wide, and
  • zero or more output display messages in cell outputs, which contain a reference to the model that should be displayed (i.e., they just contain the id of the model to display).
    Both pieces of information need to be saved to the document in order for a widget to display. This is what happens in https://jupyter.org/widgets and the ipywidgets docs - the widget state is saved to the notebook metadata, and the widget manager can use that and the output messages in specific cells to display widgets correctly. This widget model metadata at the notebook level essentially specifies what js package and version to use to interpret the data. Since a single model can be displayed multiple places, we chose not to copy all of this information into each output, rather having each output reference the notebook-level state.
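
As a rough illustration of the two pieces described above, the sketch below uses plain dictionaries to mimic a saved notebook: model state stored once under the notebook metadata, and a cell output that only carries a model id. The MIME types and key names follow the documented widget state format as far as I understand it; the model id, the version string, the shortened `text/plain` value, and the helper `resolve_widget_views` are made up for illustration.

```python
# MIME types for notebook-level widget state and for per-output widget views.
STATE_MIME = "application/vnd.jupyter.widget-state+json"
VIEW_MIME = "application/vnd.jupyter.widget-view+json"

# Hypothetical notebook: one model saved in metadata, one output referencing it.
notebook = {
    "metadata": {
        "widgets": {
            STATE_MIME: {
                "version_major": 2,
                "version_minor": 0,
                "state": {
                    "map-model-id": {  # made-up model id
                        "model_name": "LeafletMapModel",
                        "model_module": "jupyter-leaflet",
                        "model_module_version": "^0.10.0",  # illustrative version
                        "state": {"center": [32.2, 34.8], "zoom": 12},
                    }
                },
            }
        }
    },
    "cells": [
        {
            "cell_type": "code",
            "outputs": [
                {
                    "output_type": "display_data",
                    "metadata": {},
                    "data": {
                        "text/plain": "Map(center=[32.2, 34.8], zoom=12)",
                        # The output holds only a reference to the model.
                        VIEW_MIME: {
                            "model_id": "map-model-id",
                            "version_major": 2,
                            "version_minor": 0,
                        },
                    },
                }
            ],
        }
    ],
}


def resolve_widget_views(nb):
    """Return (model_id, saved model or None) for every widget view output."""
    saved = nb["metadata"].get("widgets", {}).get(STATE_MIME, {}).get("state", {})
    resolved = []
    for cell in nb["cells"]:
        for output in cell.get("outputs", []):
            view = output.get("data", {}).get(VIEW_MIME)
            if view is not None:
                resolved.append((view["model_id"], saved.get(view["model_id"])))
    return resolved


views = resolve_widget_views(notebook)
```

If the notebook-level state were missing, the second element of each pair would be `None`, which corresponds to an output that references a model nobody saved.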

CC also @SylvainCorlay, @maartenbreddels

Widgets which rely on user input before releasing cell execution. There should be some sort of defined behavior encoded in the kernel for headless execution (an execution without a paired js client), or perhaps the kernel should fail fast and log a message that it couldn't satisfy the widget's requirements for user input.

Widgets in general don't rely on user input before releasing cell execution (actually, I don't know of one that does do this - do you have an example?). The general model is to display the widget state as-is, and communicate changes in that state back to the kernel via kernel comm messages.

Widgets which rely on querying browser state. Either through querying specific DOM structures or earlier widget instances which hold state while the notebook is executing. In a headless execution pattern there's no way to reliably capture these expectations.

This is a good point - probably many widget libraries assume they are running in a browser DOM, and have not been tested in a headless context. Do you have some specific examples in mind?

Binary data being transferred between the browser and kernel / other endpoints without using the kernel message protocol or an extension of such.

Binary data, as with all widget communication that I know about, uses the standardized Jupyter comm message protocol, which is part of the kernel message protocol. Do you have an example where that is not happening?

Again, I'm happy to discuss these and other points further to help. CC also @SylvainCorlay, @maartenbreddels, @pbugnion.

@jasongrout

In the example code:

from IPython.display import display
from ipyleaflet import Map, Polyline

m = Map(center=(32.2, 34.8), zoom=12)  # (A)
for r in df.itertuples():
    line = Polyline(  # (B)
        locations=[[[r.start_latitude, r.start_longitude], [r.end_latitude, r.end_longitude]]],
        weight=15
    )
    m.add_layer(line)

display(m)  # (C)

(A) and (B) are where widget objects are created, which create comm channels, which in turn create js objects in the browser. Each of these widgets has a unique id. (C) is where an output message is sent to the browser which contains the unique id of the map widget, and the renderer takes that id, looks up the relevant state in the notebook-wide widget state, and uses the state to retrieve the relevant js library constructors and construct and display the widget.
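
One way to read the flow above against the original report: a client that never collected the model state has nothing to look the id up in, so it can only fall back to the text representation. The sketch below is a simplified model of that choice, not papermill's or nbconvert's actual code; `pick_representation`, the model id `abc123`, and the shortened `text/plain` string are hypothetical.

```python
VIEW_MIME = "application/vnd.jupyter.widget-view+json"


def pick_representation(bundle, known_models):
    """Choose what a client can render from a display_data MIME bundle.

    known_models is the set of model ids whose state the client has collected.
    """
    view = bundle.get(VIEW_MIME)
    if view is not None and view["model_id"] in known_models:
        return ("widget", view["model_id"])
    # No usable widget state: fall back to the plain-text representation.
    return ("text", bundle.get("text/plain", ""))


# A widget display message typically carries both representations.
bundle = {
    "text/plain": "Map(center=[32.2, 34.8], zoom=12)",  # made-up short repr
    VIEW_MIME: {"model_id": "abc123", "version_major": 2, "version_minor": 0},
}

headless = pick_representation(bundle, known_models=set())
with_state = pick_representation(bundle, known_models={"abc123"})
```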

@jasongrout

So to minimally support widgets, it might be enough to (at a high level):

A. run a widget manager in the headless browser, which is in charge of maintaining this document-wide widget state
B. expose the relevant widget libraries in the headless browser, so that the widget manager can instantiate the relevant widget models
C. Save the widget manager state to the notebook metadata, so that the state is in the documented format for widget state in a notebook.
D. Make the relevant js available in the context in which you are displaying the notebook, so that it can read the widget model state, and can display the appropriate output.

@MSeal
Member

MSeal commented Feb 9, 2019

Thanks for the detailed information and corrections @jasongrout !

I can't remember the widget name, but there was a widget where the user had to enter a js prompt before continuing execution and that blocked the kernel from continuing. I remember someone asking about how it should work with papermill at JupyterCon. It sounds like that's not a common or highly adopted widget pattern.

@jasongrout

It sounds like that's not a common or highly adopted widget pattern.

Indeed. It sounds kind of impossible to me, actually. The message back to the kernel would be blocked from processing until the cell executed, so I'm not sure how things would get unblocked.

We do have the notion of waiting for user input, but it is in an asynchronous context (https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20Asynchronous.html), so it wouldn't block cell execution from finishing. Really, I don't know that there is a way to synchronously block and get user input except for the kernel stdin messages (which widgets aren't using that I know of). Perhaps something interesting is possible with the new async kernel? I haven't played too much with it yet.
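
To show the shape of the asynchronous pattern mentioned above without depending on ipywidgets, here is a minimal sketch built on `asyncio` alone. `FakeSlider` is a hypothetical stand-in for a widget, and `wait_for_change` is a simplified version of the idea in the linked docs (the real example observes a named trait on an actual widget). The point is that waiting happens on a future, so nothing blocks while the value is pending.

```python
import asyncio


class FakeSlider:
    """Hypothetical stand-in for a widget: holds a value, notifies observers."""

    def __init__(self, value=0):
        self.value = value
        self._observers = []

    def observe(self, callback):
        self._observers.append(callback)

    def unobserve(self, callback):
        self._observers.remove(callback)

    def set_value(self, new_value):
        # With a real widget this would be driven by a comm message from the client.
        self.value = new_value
        for callback in list(self._observers):
            callback(new_value)


def wait_for_change(widget):
    """Return a future that resolves with the next value the widget reports."""
    future = asyncio.get_running_loop().create_future()

    def on_change(new_value):
        future.set_result(new_value)
        widget.unobserve(on_change)  # only react to the first change

    widget.observe(on_change)
    return future


async def main():
    slider = FakeSlider()
    pending = wait_for_change(slider)
    # Simulate the user moving the slider a moment later.
    asyncio.get_running_loop().call_later(0.01, slider.set_value, 7)
    return await pending


result = asyncio.run(main())
```

In a headless run there is no user to trigger the change, so a future like `pending` would simply never resolve; the cell that scheduled it would still finish.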

@MSeal
Member

MSeal commented Feb 9, 2019

I think the widget in question was circumventing blocking by repeatedly polling the widget state until it was complete somehow. I'll look for examples of the other patterns I saw in old notebooks that were trying to replicate some Zeppelin behavior via widgets. But we've tapped out my knowledge (and limited understanding) of the widget space, and there are several thousand notebooks in the repository where I saw issues in running with papermill.

It'd probably be more productive to focus on the patterns you see as within normal use of the protocols and tackle one of the proposed paths of capturing enough information for rendering purposes with current best-practices in widgets.

A. run a widget manager in the headless browser, which is in charge of maintaining this document-wide widget state

Would this widget manager require js though? I'd prefer to not make all notebook clients require a packagable js client to be active for both security and complexity reasons.

B. expose the relevant widget libraries in the headless browser, so that the widget manager can instantiate the relevant widget models

Same concerns as A)?

C. Save the widget manager state to the notebook metadata, so that the state is in the documented format for widget state in a notebook.

This sounds possible -- though you were mentioning this could be a very large payload? Maybe we could make this an easy option to opt into for widgets where this makes sense. That would allow a path for widgets to expand flexibility to headless execution paths when it makes sense but not force all widgets to conform when it doesn't make sense.

D. Make the relevant js available in the context in which you are displaying the notebook, so that it can read the widget model state, and can display the appropriate output.

Is this idea building on the option of providing more details for a UI client later being able to request needed js packages from the user/maintainer if they aren't present? That sounds helpful at a minimum even if the final rendering would not have all the state captured from the prior headless run.

E) Add an alternative txt/html output display for widgets so clients missing the required js package can render something (even an error message if nothing else productive can be done).

@jasongrout

Add an alternative txt/html output display for widgets so clients missing the required js package can render something (even an error message if nothing else productive can be done).

We currently have a text representation of the widget, as noted in the original post: Map(basemap={'url': 'https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png', 'max_zoom': 19, 'attribution': 'Map …

Would this widget manager require js though?

You get to decide the widget manager. You can have it do whatever you want. From the point of view of the kernel, it's just comm messages to/from a client.

So your widget manager could just store the state from the kernel (i.e., not compute anything, just act as a state store). Where it can get tricky is that some widgets (not standard ones, IIRC) do some state initialization on the js side, and communicate that back to the kernel. In those cases, you'd need that widget code running in the client, or at least an approximation. But that's probably not the pattern to focus on first, like you say.
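
A minimal sketch of such a "state store" manager, assuming the kernel sends widget state in `comm_open` messages on the `jupyter.widget` target and later changes as `comm_msg` messages with `method: "update"`. The class name `StateStoreWidgetManager` and the sample messages are hypothetical, binary buffers are ignored, and widgets that initialize state on the js side are out of scope, as noted above.

```python
import copy

STATE_MIME = "application/vnd.jupyter.widget-state+json"


class StateStoreWidgetManager:
    """Hypothetical manager that only records widget state sent by the kernel."""

    def __init__(self):
        self.models = {}  # comm_id -> latest known state dict

    def handle_message(self, msg):
        msg_type = msg["header"]["msg_type"]
        content = msg["content"]
        if msg_type == "comm_open" and content.get("target_name") == "jupyter.widget":
            # A new widget model: remember its initial state.
            self.models[content["comm_id"]] = copy.deepcopy(
                content["data"].get("state", {})
            )
        elif msg_type == "comm_msg" and content["comm_id"] in self.models:
            data = content["data"]
            if data.get("method") == "update":
                # Merge partial updates into the stored state.
                self.models[content["comm_id"]].update(
                    copy.deepcopy(data.get("state", {}))
                )

    def notebook_metadata(self):
        """Return the collected state in the notebook-level metadata layout."""
        state = {}
        for comm_id, model_state in self.models.items():
            state[comm_id] = {
                "model_name": model_state.get("_model_name"),
                "model_module": model_state.get("_model_module"),
                "model_module_version": model_state.get("_model_module_version"),
                "state": model_state,
            }
        return {
            "widgets": {
                STATE_MIME: {"version_major": 2, "version_minor": 0, "state": state}
            }
        }


manager = StateStoreWidgetManager()
manager.handle_message({
    "header": {"msg_type": "comm_open"},
    "content": {
        "comm_id": "abc123",  # made-up id
        "target_name": "jupyter.widget",
        "data": {"state": {
            "_model_name": "LeafletMapModel",
            "_model_module": "jupyter-leaflet",
            "_model_module_version": "^0.10.0",  # illustrative version
            "zoom": 12,
        }},
    },
})
manager.handle_message({
    "header": {"msg_type": "comm_msg"},
    "content": {"comm_id": "abc123",
                "data": {"method": "update", "state": {"zoom": 14}}},
})
metadata = manager.notebook_metadata()
```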

though you were mentioning this could be a very large payload

Possibly - depends on the widget and how it is used. For bqplot or ipyvolume, this could be the entire data set array. Or an image widget might have the entire source of the widget. Or in those cases, perhaps it's just a pointer to the data. There definitely isn't any forcing here, though.

Is this idea building on the option of providing more details for a UI client later being able to request needed js packages from the user/maintainer if they aren't present? That sounds helpful at a minimum even if the final rendering would not have all the state captured from the prior headless run.

Your widget manager displaying widgets in that final output would manage how to get packages. It would have the information in the widget state, which includes a name and version for the state, which (usually) corresponds to a package and version of code needed to display the state. For example, IIRC currently the html widget manager (used in jupyter.org/widgets, etc.) fetches the widget packages on the fly from cdn, based on the widget state stored in the page.
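
As a rough sketch of how a display-side manager could work out which packages it needs from saved state: the helpers below collect the module and version of each model and build a CDN URL from them. The function names, the choice of `https://unpkg.com/` as the CDN, and the `/dist/index.js` path are assumptions for illustration, not a description of what the html manager actually does.

```python
def required_packages(widget_state):
    """Collect the distinct (module, version) pairs referenced by saved models."""
    packages = set()
    for model in widget_state.get("state", {}).values():
        packages.add((model["model_module"], model["model_module_version"]))
    return sorted(packages)


def widget_package_url(model_module, model_module_version, cdn="https://unpkg.com/"):
    """Build a CDN URL for a widget package (URL layout is an assumption)."""
    # Version ranges such as "^0.10.0" are passed through unchanged;
    # resolving them is left to the CDN.
    return f"{cdn}{model_module}@{model_module_version}/dist/index.js"


# Hypothetical saved state with two models from the same package.
widget_state = {
    "version_major": 2,
    "version_minor": 0,
    "state": {
        "id-1": {"model_name": "LeafletMapModel", "model_module": "jupyter-leaflet",
                 "model_module_version": "^0.10.0", "state": {}},
        "id-2": {"model_name": "LeafletPolylineModel", "model_module": "jupyter-leaflet",
                 "model_module_version": "^0.10.0", "state": {}},
    },
}

urls = [widget_package_url(name, version)
        for name, version in required_packages(widget_state)]
```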

@jasongrout

The architecture is very flexible. There's a key part, the client-side widget manager, which you control, and it is responsible for (a) providing the right js packages for whatever widget state there is and managing communication with kernel widget objects, (b) displaying the widgets in whatever context you are in. You get full control over the widget manager by design.

@jasongrout

One interesting question is how do you mesh the interactive nature of widgets with the assumptions about reading a report. For example, in the case of an ipyleaflet display: that display can be modified by cells below the initial display. In that first cell, should you display the state of that ipyleaflet as it was at the end of the notebook (which is what it would look like if you just hit 'run all')? Or should you display the ipyleaflet as it appeared when it was first executed? For example, in one cell you might have a map zoomed into an area, but in some later cells you may add some highlights, move the zoom, etc. In that first cell, do you display the original map state, or the final map state? In interactive use, we display the current state (i.e., the ending map state). However, in a report you're reading top to bottom, that can be confusing to see the ipyleaflet in its final state.

I think this is a natural tension between reproducible workflows and fluid interactive use.

@MSeal
Member

MSeal commented Feb 10, 2019

In the papermill case under ideal circumstances I'd expect the widgets to render the same as a Restart and Run All from any other UI, since that client is simply running the cells in order. Tqdm and other display objects show the final state when run through nbconvert for rendering into other formats, respecting the display.

We provide the ability to log the outputs to disk, which can show progression of state if needed. I think it'd be odd for widgets to not follow the same behavior and if the user wanted incremental displays they should use separate objects in each cell they want isolated. With display_id this should be easy to control as the user desires.
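
To make the "final state" behavior concrete, here is a simplified sketch of how a headless client might fold display messages into saved outputs. It assumes `display_data` and `update_display_data` messages carry the id under `content.transient.display_id`; `apply_messages` and the sample messages are hypothetical and this is not papermill's actual implementation.

```python
def apply_messages(messages):
    """Build saved outputs from display messages, keeping only the latest data."""
    outputs = []
    by_display_id = {}  # display_id -> outputs that must stay in sync
    for msg in messages:
        msg_type = msg["header"]["msg_type"]
        content = msg["content"]
        display_id = content.get("transient", {}).get("display_id")
        if msg_type == "display_data":
            output = {"output_type": "display_data", "data": dict(content["data"])}
            outputs.append(output)
            if display_id is not None:
                by_display_id.setdefault(display_id, []).append(output)
        elif msg_type == "update_display_data":
            # Rewrite every earlier output that shares this display_id.
            for output in by_display_id.get(display_id, []):
                output["data"] = dict(content["data"])
    return outputs


messages = [
    {"header": {"msg_type": "display_data"},
     "content": {"data": {"text/plain": "0%"},
                 "transient": {"display_id": "progress"}}},
    {"header": {"msg_type": "display_data"},
     "content": {"data": {"text/plain": "unrelated"}, "transient": {}}},
    {"header": {"msg_type": "update_display_data"},
     "content": {"data": {"text/plain": "100%"},
                 "transient": {"display_id": "progress"}}},
]

outputs = apply_messages(messages)
```

An output without a `display_id` is left as it was first displayed, which is how separate objects per cell would keep their own snapshots.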

@MSeal
Member

MSeal commented Feb 10, 2019

I'm not sure what the next step is here.

Since the widget manager has so much freedom, it'd be hard to replicate any kind of manager for the Python clients, as the business logic is all encoded in the js package that the manager is loading. Maybe we should explore saving the widget state to outputs or metadata, so long as the kernel alone could produce such state?

@maartenbreddels

maartenbreddels commented Feb 10, 2019 via email

@jasongrout

jasongrout commented Feb 10, 2019

Thanks! jupyter/nbconvert#900 looks like what I was imagining, but even better, since it looks like you just serialize the python state, without even bothering having a js side. Brilliant!

jupyter/nbconvert#901 is also intriguing - thanks for exploring that as well!

@MSeal - I'd say the next step is reviewing jupyter/nbconvert#900 and seeing if something like that fits your needs for getting widget state into the document.

@MSeal
Member

MSeal commented Feb 10, 2019

Perfect. I'll dig into those PRs this upcoming week @maartenbreddels .

@jasongrout thanks for the detailed engagement here. I'll kick the thread back up after going through / merging those PRs.

@MSeal
Member

MSeal commented Apr 27, 2019

This should now be fixed with papermill 1.0! (I tested it with some examples from the ipyleaflet GitHub repository.) You do need to click "Trust Notebook" from "File" after it runs to enable javascript execution.

@MSeal MSeal closed this as completed Apr 27, 2019