Skip to content

mdda/LLM-abstraction-for-ARC

Repository files navigation

LLM-abstraction-for-ARC

This repo is a copy of the relevant code for the dataset paper "Capturing Sparks of Abstraction for the ARC Challenge".

  • Quick-start : Have a look at the Jupyter Notebooks (pre-rendered) in ./notebooks

Assets made available (Apache 2.0 license)

  • The arc_mdda module is ARC-related code in modularised form, mostly built by first testing within notebooks
  • The Gemini-Flash-002 generated 'Sparks of Abstraction' dataset is downloadable via this link
  • The instructions below include the incorporation of the related repo arc-dsl-llm

Get external libraries/data

mkdir -p ./External

pushd External
git clone [email protected]:fchollet/ARC-AGI.git
# Data is in ./External/ARC-AGI/data/{training,evaluation} (400 each)
popd

pushd External
git clone [email protected]:mdda/arc-dsl-llm.git
# Repo is in ./External/arc-dsl-llm/*
#   NB: it includes a sneaky internal link to arc_dsl to 'modularise' it
popd

Using the Gemini-LLM

The Gemini-Flash-002 model is used via arc_mdda/models/gemini.py, and will use (by default) the VertexAI credentials you provide in ./key-vertexai-iam.json

export GOOGLE_APPLICATION_CREDENTIALS="./key-vertexai-iam.json"

The code also allows for usage of the $FREE Gemini API (for which you'll need to add a free=True flag to the get_model() calls).

Library installation

uv pip install jupytext requests frozenlist                 # Basics
uv pip install vertexai google-generativeai omegaconf       # LLM access
uv pip install llvmlite umap-learn hdbscan                  # visualisation
uv pip install matplotlib pandas datashader bokeh holoviews # visualisation

Examining / Running the notebooks

  • NB: To just have a look at the notebook outputs, see : ./notebooks/*.ipynb (as expected)

jupytext has been used within JupyterLab for the notebooks : This means that the actual saved-to-github code is the the .py files in the main directory, which should be run in JupyterLab (say) using the jupytext plugin, and choosing Open as Notebook on the .py file.

The local notebook contents is stored to cache-notebooks, and not checked into the repo. i.e. the following was done:

jupytext --set-formats cache-notebooks//ipynb,py XYZ.py

Citing this work

If you find this helpful in your research, please consider citing:

@misc{andrews2024capturingsparksabstractionarc,
      title={Capturing Sparks of Abstraction for the ARC Challenge}, 
      author={Martin Andrews},
      year={2024},
      eprint={2411.11206},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.11206}, 
}

Acknowledgements

Support for this research was provided by the Google AI/ML Developer Programs team, including access to the Gemini models and GPUs on Google Cloud Platform.

About

Code for the dataset paper "Capturing Sparks of Abstraction for the ARC Challenge"

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published