Exposure Analysis: Exporting analysis process to a Jupyter notebook has several challenges on execution. #394

jprussoibanez · 2023-03-20T15:03:54Z

Context:

I am exporting the exposure analysis from this ShowWhy project (based on a DoWhy use case) to a Jupyter notebook in order to automate the analysis process.

Problem:

The exported notebook does not run out of the box; it needs extra configuration after export. These are the main challenges:

- Imports usage: It is hard to run the notebook's imports directly against the codebase without a standalone package. It is not clear how best to integrate the exported notebook (and its imports) into the ShowWhy ecosystem.
- Data integration: The notebook seems to use a Redis database with a predefined configuration that does not work out of the box. It is not clear how best to integrate the project data into the exported notebook itself.

Here is a more detailed explanation of each challenge:

Imports:

General imports: All imports in the notebook are rooted at the exposure folder. However, when the notebook is run directly within the codebase, some internal ShowWhy packages use backend.exposure imports. To make it work within the codebase, the notebook itself therefore needs to use backend.exposure as well (or work around this through PYTHONPATH).
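A minimal sketch of the PYTHONPATH-style workaround, to be run in the first notebook cell. It assumes the exposure package lives at `<repo_root>/backend/exposure`, which is inferred from the import names and not verified; `add_import_roots` is a hypothetical helper, not part of ShowWhy.

```python
import sys
from pathlib import Path


def add_import_roots(repo_root):
    """Put both import roots on sys.path and return the entries that were added.

    Assumed layout (not verified): <repo_root>/backend/exposure
    - <repo_root> on sys.path makes `backend.exposure...` importable.
    - <repo_root>/backend on sys.path makes `exposure...` importable.
    """
    root = Path(repo_root).resolve()
    added = []
    for candidate in (root, root / "backend"):
        entry = str(candidate)
        if entry not in sys.path:
            # Insert at the front so these roots win over installed packages.
            sys.path.insert(0, entry)
            added.append(entry)
    return added
```

One caveat with this approach: a module imported once as `exposure.x` and once as `backend.exposure.x` is loaded as two separate module objects, so module-level state is not shared between them. Using a single import root consistently avoids that.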

Deprecated imports: Some imports seem to come from an older package folder structure.

exposure.io.storage no longer exists on the current main branch. Its new home seems to be worker_commons.io.storage. BTW: this should be an easy fix in the code that generates the output notebook.
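Until the generator is fixed, an exported notebook could tolerate both locations with a small fallback import. This is only a sketch: `import_first_available` is a hypothetical helper, and the two module paths are the ones named above.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that can be imported."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "None of the candidate modules could be imported: " + "; ".join(errors)
    )


# Intended use inside the codebase (new location first, old one as fallback):
# storage = import_first_available(
#     ["worker_commons.io.storage", "exposure.io.storage"]
# )
```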

Data integration:

The estimator uses the Redis LocalStorageClient to get_tasks by retrieving a local_notebook workspace and an output.csv file. Neither exists in the Redis database.
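For comparison, a standalone notebook could read its input from a plain local folder with no Redis involved. The sketch below uses only the standard library; `load_output_rows` and the folder layout are hypothetical, and this does not reproduce what LocalStorageClient actually does.

```python
import csv
from pathlib import Path


def load_output_rows(workspace_dir, filename="output.csv"):
    """Read all rows of a CSV file in a local folder as a list of dicts."""
    path = Path(workspace_dir) / filename
    if not path.is_file():
        # Fail early with a clear message instead of a storage-layer error.
        raise FileNotFoundError(f"Expected data file not found: {path}")
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))
```

Failing early on a missing file would also make the current problem easier to diagnose, since the notebook would report exactly which path it expected.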

Summary and open questions:

The ShowWhy export feature seems like a really nice way to first explore and prototype a causal analysis through the user interface, and then automate the final analysis process through a Jupyter notebook or script.

Is there a plan to make the exported Jupyter notebook runnable standalone (without depending on the Redis database, the Docker containers, and the codebase itself) to make exporting easier?
Here are some additional ideas on this:
- ShowWhy library: A ShowWhy library package that can be used directly might avoid the hassle of imports and data dependencies on the Docker containers and codebase.
- Leverage DoWhy: ShowWhy wraps some features of DoWhy, a more mature standalone library. Another option might be for the exported notebook to use DoWhy features directly, for better analysis automation.

Version information

The DoWhy version is the latest on the main branch.
I am running the Docker containers locally on a Mac M1, so I needed to adjust some of the library dependencies to make it work within the TensorFlow Docker restrictions.

jprussoibanez changed the title ~~Exposure Analysis: Exporting analysis process to a Jupyter notebook has several challenges to execute it.~~ Exposure Analysis: Exporting analysis process to a Jupyter notebook has several challenges on execution. Mar 20, 2023
