JupyLabel Outline

Research

The corresponding paper can be found under: https://arxiv.org/abs/2403.07562

Running the JupyLabel CLI using Conda

Set up the environment and install the project with the following commands:

conda create -n cli python=3.10
conda activate cli
pip install .
pip install -r requirements.txt

Create the inputs, outputs and backups folders in "./src/pipeline_analyzer/jn_analyzer/resources/" by running the following command from that directory:

mkdir inputs outputs backups

Commands

Place all .ipynb files into the inputs folder. The default path is "./src/pipeline_analyzer/jn_analyzer/resources/inputs/".
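Copying notebooks into the inputs folder can also be scripted. The following is a minimal sketch, not part of the JupyLabel CLI: the helper name `stage_notebooks` and the source folder passed to it are hypothetical, and only the default inputs path is taken from this README.

```python
import shutil
from pathlib import Path

# Default inputs folder named in this README.
DEFAULT_INPUTS = Path("./src/pipeline_analyzer/jn_analyzer/resources/inputs/")


def stage_notebooks(source_dir, inputs_dir=DEFAULT_INPUTS):
    """Copy every .ipynb file directly inside source_dir into inputs_dir.

    Hypothetical helper for illustration; returns the sorted names of the
    copied files.
    """
    source_dir = Path(source_dir)
    inputs_dir = Path(inputs_dir)
    # Create the inputs folder (and any missing parents) if it does not exist.
    inputs_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for notebook in sorted(source_dir.glob("*.ipynb")):
        shutil.copy2(notebook, inputs_dir / notebook.name)
        copied.append(notebook.name)
    return copied
```

Calling `stage_notebooks("my_notebooks")` from the repository root would copy the notebooks from a (hypothetical) `my_notebooks` folder into the default inputs path. Files with other extensions are left untouched.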

Labeled notebooks, plus intermediate files when running in DEBUG_MODE, can be found in "./src/pipeline_analyzer/jn_analyzer/resources/outputs/" and "./src/pipeline_analyzer/jn_analyzer/resources/backups/".

Run analyze label-notebooks to label all notebooks in the inputs folder. The command accepts the following options:

--path # input folder path (default: "./src/pipeline_analyzer/jn_analyzer/resources/inputs/")
--debug # debug mode (default: False)
--headers # inserting headers instead of tags (default: True)
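To check what a labeling run produced, the headers in a labeled notebook can be listed with the standard library alone. This sketch assumes that, with --headers enabled, the labels end up as Markdown header lines (starting with `#`) in markdown cells. That is an assumption about the output format, not something this README states, and the helper name `markdown_headers` is hypothetical.

```python
import json
from pathlib import Path


def markdown_headers(notebook_path):
    """Return the header lines found in the markdown cells of a notebook.

    Assumption: inserted labels appear as Markdown headers ("# ...").
    """
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    headers = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # code and raw cells are ignored
        source = cell.get("source", [])
        # The notebook format stores source as a string or a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        for line in text.splitlines():
            if line.lstrip().startswith("#"):
                headers.append(line.strip())
    return headers
```

The function returns the header lines in notebook order, so printing the result for a file in the outputs folder gives a quick outline of the labeled notebook.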

Run analyze eval to evaluate performance metrics.

Run analyze bench {dataset} to benchmark the models, where {dataset} is either jupylab or headergen. With jupylab the benchmark runs on 1000 Jupyter Notebooks; with headergen it runs on the 15 notebooks provided by headergen. This command accepts the following options:

--debug # debug mode (default: False)
--headers # inserting headers instead of tags (default: True)

Run analyze new to prepare the data from scratch for training the models. This is especially useful if you want to change the pre-processing of JupyLab and investigate how it influences model performance. This command provides the following option:

--all # also train and evaluate the models on the newly pre-processed data; the models are saved in the resources folder under new_trained_models (default: no)

When creating new models from scratch, simply delete/overwrite the models that currently exist in the /resources/models folder.
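Before deleting or overwriting the existing models, it may be worth keeping a copy. The following is a minimal sketch under stated assumptions: the helper `backup_models` is hypothetical and not part of the CLI, and both folder paths are supplied by the caller because this README does not spell out the full path of the models folder.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_models(models_dir, backup_root):
    """Copy models_dir into a timestamped folder below backup_root.

    Hypothetical helper for illustration; returns the path of the copy.
    """
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"models-{stamp}"
    # copytree creates target (and missing parents) and copies recursively.
    shutil.copytree(Path(models_dir), target)
    return target
```

The timestamp in the folder name keeps successive backups apart, so earlier copies are not overwritten when the models are retrained more than once.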

Docker Installation

docker build -t jupylab .
docker run -it -v ${PWD}/jupylab:/jupylab jupylab bash
cd src/pipeline_analyzer/jn_analyzer/resources
mkdir inputs outputs backups
