Awesome TabArena Use Cases 🌻

Here we share examples for various use cases of TabArena, a benchmarking framework for tabular data.

📊 Benchmarking Predictive Machine Learning Models

You can use TabArena for various benchmarking tasks, such as benchmarking TabArena's models on new data (including private offline data) or benchmarking your models on TabArena's data.

  • Folder: benchmarking/
  • Use Cases:
    • run_quickstart_tabarena.py - Reproduce runs of LightGBM and RealMLP on three datasets from TabArena.
    • custom_tabarena_model/ - Implement your own model for TabArena and benchmark it on TabArena-Lite.
    • run_get_tabarena_datasets_from_openml.py - Get the data used by TabArena from OpenML, without the TabArena framework.
    • run_quickstart_tabarena_custom_datasets.py - Benchmark models with TabArena on your custom (private) datasets.
    • run_quickstart_tabarena_one_datasets.py - Benchmark and evaluate models with TabArena on one dataset.
    • run_beta_tabpfn_end_to_end.py - Load and compare new results artifacts using TabArena.

🚀 Using SOTA Tabular Models Benchmarked by TabArena

All models in TabArena are open-source and can be used directly on your own data. They are implemented in production-ready code and can be easily integrated into your ML pipelines.

  • Folder: running_tabarena_models/
  • Use Cases:
    • run_default_model.py - Minimal example for running a default model (with cross-validation bagging).
    • run_tuned_ensemble_model.py - Minimal example for running a tuned (+ ensembled) model.
    • run_tabarena_realmlp.py - Simple example for running TabArena's version of RealMLP.
    • run_autogluon_on_openml_task.py - Minimal example for running AutoGluon on any OpenML task.
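
For readers unfamiliar with the term, the cross-validation bagging mentioned for run_default_model.py can be sketched in a few lines of plain Python. This is a conceptual illustration only, not TabArena's implementation: `MeanRegressor`, `fit_bagged`, and `predict_bagged` are hypothetical names, and the toy model simply predicts the training mean. One model is fit per fold on the remaining folds, and test-time predictions are averaged across all fold models.

```python
from statistics import mean


def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous validation folds."""
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


class MeanRegressor:
    """Hypothetical toy model: always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_bagged(X, y, n_folds=8, model_factory=MeanRegressor):
    """Fit one model per fold; also return out-of-fold predictions."""
    models = []
    oof = [None] * len(X)
    for val_idx in k_fold_indices(len(X), n_folds):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = model_factory().fit([X[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        # Each sample is predicted only by the model that never saw it.
        val_preds = model.predict([X[i] for i in val_idx])
        for i, pred in zip(val_idx, val_preds):
            oof[i] = pred
        models.append(model)
    return models, oof


def predict_bagged(models, X):
    """Average the predictions of all fold models."""
    per_model = [m.predict(X) for m in models]
    return [mean(column) for column in zip(*per_model)]


# Made-up toy data, for illustration only.
X = [[float(i)] for i in range(8)]
y = [float(i) for i in range(8)]
models, oof = fit_bagged(X, y, n_folds=4)
test_preds = predict_bagged(models, [[100.0]])
```

The out-of-fold predictions cover every training sample exactly once, which is what makes them usable for validation or for building ensembles on top of the bagged models.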

🗃️ Analysing Metadata and Meta-Learning

The metadata generated by prior TabArena experiments can be used for more than just comparing new models. You can inspect the rich metadata we collected for each dataset and use it for insightful studies or meta-learning.

  • Folder: meta/
  • Use Cases:
    • inspect_processed_data.py - Inspect the processed data from prior benchmarks.
    • inspect_raw_data.py - Inspect the raw data from prior benchmarks.
    • run_download_raw.py - Download all the raw data from prior benchmarks.

📈 Generating Plots and Leaderboards

To inspect the results of TabArena benchmarks, you can use various plots and leaderboards. We share code to generate these visualizations from the raw benchmark results.

  • Folder: plots/
  • Use Cases:
    • run_generate_main_leaderboard.py - Generate the main leaderboard from TabArena results.
    • run_generate_paper_figures.py - Generate the figures used in the TabArena NeurIPS 2025 paper.
    • run_plot_pareto_over_tuning_time.py - Generate the plots showing trade-offs of predictive performance and efficiency.
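
As a rough illustration of the performance/efficiency trade-off that such plots show, the sketch below computes a Pareto front over (time, error) pairs, where lower is better on both axes. It is a minimal stand-alone example, not the logic of run_plot_pareto_over_tuning_time.py; the function name `pareto_front` and all numbers are made up for illustration and are not TabArena results.

```python
def pareto_front(points):
    """Return the non-dominated points, sorted by time.

    Each point is (name, time, error); lower is better for both values.
    A point is dominated if another point is at least as good on both
    axes and strictly better on at least one.
    """
    front = []
    for name, time, error in points:
        dominated = any(
            (t2 <= time and e2 <= error) and (t2 < time or e2 < error)
            for _, t2, e2 in points
        )
        if not dominated:
            front.append((name, time, error))
    return sorted(front, key=lambda p: p[1])


# Hypothetical methods with made-up tuning times (hours) and errors.
candidates = [
    ("method_a", 1.0, 0.30),
    ("method_b", 2.0, 0.25),
    ("method_c", 3.0, 0.27),   # slower and worse than method_b
    ("method_d", 10.0, 0.20),
]
front = pareto_front(candidates)
front_names = [name for name, _, _ in front]
```

Here `method_c` drops out because `method_b` is both faster and more accurate; the remaining methods each represent a different trade-off between time and error.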

🔁 Reproducibility

To locally reproduce individual configurations and compare with the TabArena results of those configurations, refer to benchmarking/run_quickstart_tabarena.py.

To locally reproduce all tables and figures in the paper using the raw results data, run plots/run_generate_paper_figures.py.

To locally generate the latest results leaderboard, run plots/run_generate_main_leaderboard.py.