This repository has been archived by the owner on Apr 29, 2021. It is now read-only.

Persist provenance graph outside notebook #3

Open
sverhoeven opened this issue Sep 19, 2018 · 2 comments

Comments

@sverhoeven
Contributor

Currently, the provenance graph is stored in the metadata field of the notebook file.

This is a problem because:

  • The notebook file should not carry its own history.
  • The provenance graph grows quickly, and storing it as JSON inside the notebook does not scale.

nbcomet (https://github.com/activityhistory/nbcomet) uses a SQLite file for each notebook in ~/.jupyter/nbcomet; we could do something similar.
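
To make the idea concrete, here is a minimal sketch of a per-notebook SQLite store. It is not nbcomet's actual schema or code: the table layout, the hash-based file naming, and the function names `open_store`, `add_node`, and `load_graph` are all assumptions made for illustration.

```python
import hashlib
import json
import sqlite3
from pathlib import Path


def open_store(notebook_path, base_dir):
    """Open (or create) the SQLite file that belongs to one notebook."""
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: SHA-1 of the absolute notebook path.
    absolute = str(Path(notebook_path).resolve())
    key = hashlib.sha1(absolute.encode("utf-8")).hexdigest()
    conn = sqlite3.connect(str(base / f"{key}.db"))
    # Hypothetical schema: one row per provenance node, payload kept as JSON text.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        " id TEXT PRIMARY KEY,"
        " parent TEXT,"
        " payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def add_node(conn, node_id, parent_id, payload):
    """Insert a single node; the notebook file itself is never touched."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO nodes (id, parent, payload) VALUES (?, ?, ?)",
            (node_id, parent_id, json.dumps(payload)),
        )


def load_graph(conn):
    """Return all nodes in insertion order."""
    rows = conn.execute("SELECT id, parent, payload FROM nodes ORDER BY rowid")
    return [{"id": i, "parent": p, "payload": json.loads(s)} for i, p, s in rows]
```

With this layout each new provenance node is a single row insert, so the cost of saving no longer grows with the size of the whole graph, and the .ipynb file stays free of history.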

@thinkh
Member

thinkh commented Oct 2, 2018

We could try turtleDB as a client-side database that synchronizes with the server. However, its backend counterpart, tortoiseDB (a separate package), is based on MongoDB and npm, so we would probably have to write a server-side adapter for SQLite storage.

@thinkh
Member

thinkh commented Oct 2, 2018

Other ideas:

  • Research how Jupyter checkpoints work and when a checkpoint is created.
  • Alternative: create a subdirectory .ipynb_provenance and store the provenance data as JSON files.
  • Evaluate whether a local git repository for the provenance data could be an option.
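
The .ipynb_provenance alternative could look roughly like the sketch below. Only the directory name comes from the list above. The one-file-per-notebook layout, the JSON Lines format (one node per line, so that saving a node is an append rather than a rewrite), and the helper names are assumptions.

```python
import json
from pathlib import Path

PROVENANCE_DIR = ".ipynb_provenance"  # directory name from the suggestion above


def provenance_file(notebook_path):
    """Return the provenance file for a notebook, creating the subdirectory."""
    nb = Path(notebook_path)
    folder = nb.parent / PROVENANCE_DIR
    folder.mkdir(exist_ok=True)
    # Hypothetical naming: <notebook name>.jsonl next to the notebook.
    return folder / (nb.stem + ".jsonl")


def append_node(notebook_path, node):
    """Append one provenance node as a single JSON line."""
    with provenance_file(notebook_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(node) + "\n")


def read_nodes(notebook_path):
    """Load all stored nodes in the order they were written."""
    path = provenance_file(notebook_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Plain files like these would also be easy to track in a local git repository, should that option turn out to be viable.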
