Data Analysis Crow: A Jupyter Notebook Agent

Data Analysis Crow is an AI agent framework designed to perform complex scientific data analysis tasks by iteratively working through Jupyter notebooks. This agent takes in datasets and prompts, then systematically explores, analyzes, and interprets the data to provide comprehensive answers and insights.

The agent was used to produce the trajectories for the BixBench benchmark.

Key Features

Accepts datasets and natural language prompts
Iteratively builds Jupyter notebooks to answer research questions
Works with Python, R, and Bash code execution
Specializes in bioinformatics analysis but adaptable to various domains
Comes with a Docker image including most common bioinformatics packages

Installation

# Clone the repository
git clone https://github.com/Future-House/data-analysis-crow.git
cd data-analysis-crow

# Install dependencies
pip install -e .

# OPTIONAL: pull the Docker image with bioinformatics packages
docker pull futurehouse/bixbench:aviary-notebook-env

Prerequisites

API Keys

The agent supports any LLM that litellm supports. Create a .env file containing the API keys for the providers you plan to use. For example:

OPENAI_API_KEY="your-openai-api-key"
ANTHROPIC_API_KEY="your-anthropic-api-key"
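
Before launching a run, it can help to confirm that the keys you expect are actually present. The snippet below is a minimal, standard-library-only sketch of such a check. The helper names parse_env and missing_keys are hypothetical and not part of this repository, and the parser only handles simple KEY=VALUE lines, so it is not a replacement for whatever loader the project itself uses.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines (hypothetical helper, not the project's loader)."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not a KEY=VALUE pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate spaces around "=" and optional surrounding quotes.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required key names that are absent or empty."""
    return [name for name in required if not env.get(name)]
```

For example, `missing_keys(parse_env(open(".env").read()), ["OPENAI_API_KEY"])` returns an empty list when the OpenAI key is set.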

Using the Agent

The agent works by taking a dataset and a prompt, then iteratively building a Jupyter notebook to answer the question. Visit the tutorial for a simple step-by-step guide on how to use the agent.
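
To make the idea of iteratively building a notebook concrete, here is a toy sketch. It is not the agent's implementation: the list proposed_cells is a hypothetical stand-in for the code an LLM would propose one step at a time, and cells are run with Python's built-in exec rather than a Jupyter kernel. It only illustrates the loop of proposing a cell, executing it, and recording the output in the nbformat 4 JSON layout.

```python
import contextlib
import io
import json


def new_notebook() -> dict:
    """Return an empty notebook in the nbformat 4 JSON layout."""
    return {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def run_and_append(nb: dict, source: str, namespace: dict) -> str:
    """Execute `source`, then record it and its stdout as a new code cell."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)  # shared namespace keeps state across cells
    text = buffer.getvalue()
    outputs = [{"output_type": "stream", "name": "stdout", "text": text}] if text else []
    nb["cells"].append({
        "cell_type": "code",
        "execution_count": len(nb["cells"]) + 1,
        "metadata": {},
        "outputs": outputs,
        "source": source,
    })
    return text


# Hypothetical stand-in for cells an LLM would propose one at a time.
proposed_cells = [
    "values = [2, 4, 6]",
    "mean = sum(values) / len(values)\nprint(mean)",
]

notebook = new_notebook()
namespace: dict = {}
# Each observation is what a real agent would read before proposing the next cell.
observations = [run_and_append(notebook, src, namespace) for src in proposed_cells]
notebook_json = json.dumps(notebook, indent=2)
```

In the real agent the next cell depends on the previous observations. Here the cells are fixed in advance to keep the example self-contained.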

Advanced Usage

For advanced evaluations, you can configure server.yaml and runner.yaml in the src/scripts/bixbench_evaluation directory and then run the evaluation script:

bash src/scripts/bixbench_evaluation/run.sh

This will:

Load the specified dataset
Process the prompt to understand the research question
Generate a Jupyter notebook with progressive analysis steps
Provide a final answer based on the analysis

Results are saved in the output directory specified in your configuration file.
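
If you drive the evaluation from Python, a small preflight check can catch a missing configuration file before the run starts. The sketch below assumes the layout described above (server.yaml, runner.yaml, and run.sh in src/scripts/bixbench_evaluation). The function names are hypothetical and not part of the repository.

```python
from pathlib import Path

# Paths taken from the instructions above, relative to the repository root.
EVAL_DIR = Path("src/scripts/bixbench_evaluation")
REQUIRED_FILES = ("server.yaml", "runner.yaml", "run.sh")


def missing_eval_files(repo_root: str = ".") -> list[str]:
    """Return the required evaluation files that are not present."""
    base = Path(repo_root) / EVAL_DIR
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


def evaluation_command() -> list[str]:
    """Build the argument list for the documented run command."""
    return ["bash", (EVAL_DIR / "run.sh").as_posix()]
```

When `missing_eval_files()` returns an empty list, the command can be launched from the repository root with `subprocess.run(evaluation_command(), check=True)`.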

Before running, update the dataset and environment configuration to match your setup. For an example, see dataset.py, which includes the capsule dataset configuration used for the BixBench benchmark.

We also recommend visiting the BixBench repository where we share a full evaluation harness for the agent.

Hosted Agent

Coming soon!

BixBench Benchmark

Data Analysis Crow was used to produce the trajectories for the BixBench benchmark, which evaluates AI agents on real-world bioinformatics tasks.

BixBench tests AI agents' ability to:

Explore biological datasets
Perform long, multi-step computational analyses
Interpret nuanced results in the context of a research question

You can find the BixBench dataset on Hugging Face, the paper here, and the blog post here.

Running BixBench Evaluations

To run BixBench evaluations with this agent, see the BixBench repository for details.
