eci-io/climategpt-evaluation

ClimateGPT Evaluation

Prompt templates for the climate-specific tasks are in the tasks directory.

To evaluate an LLM on climate-specific tasks, follow the steps below:

  1. Clone this repo
  2. Clone and install EleutherAI/lm-evaluation-harness (the big-refactor branch)
  3. Integrate the templates from this repo with lm-evaluation-harness by either:
    a. placing the tasks from the tasks directory under lm_eval/tasks/ (in lm-evaluation-harness) and running the evaluation with the following command:
    lm_eval \
    	--model hf \
    	--model_args pretrained=tiiuae/falcon-7b \
    	--tasks claim_binary \
    	--output_path /results/falcon-7b.jsonl \
    	--show_config --log_samples \
    	--num_fewshot 5
    

    OR
    b. passing the path to the tasks directly as a command-line argument using --include_path, as in the following example command:

    lm_eval \
    	--model hf \
    	--model_args pretrained=tiiuae/falcon-7b \
    	--tasks claim_binary \
    	--output_path /results/falcon-7b.jsonl \
    	--show_config --log_samples \
    	--num_fewshot 5 --include_path <path/to/this/repo/tasks/exeter>
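
The copy step in option (a) can be sketched as a small shell helper. This is a minimal sketch, assuming both repositories are already cloned locally; `install_tasks` is a hypothetical helper name and is not part of this repo or of lm-evaluation-harness.

```shell
#!/usr/bin/env bash
# Sketch of option (a): copy the task templates into a local
# lm-evaluation-harness checkout. Paths are supplied by the caller.
set -euo pipefail

# install_tasks is a hypothetical helper, not provided by either repository.
install_tasks() {
    local src="$1"   # tasks directory of this repo
    local dest="$2"  # lm_eval/tasks directory of the harness checkout

    if [ ! -d "$src" ]; then
        echo "tasks directory not found: $src" >&2
        return 1
    fi

    mkdir -p "$dest"
    # Copy the contents of src (including subdirectories) into dest.
    cp -R "$src"/. "$dest"/
}
```

Assuming the two checkouts sit side by side, a call might look like `install_tasks climategpt-evaluation/tasks lm-evaluation-harness/lm_eval/tasks`. Option (b) avoids this copy entirely, which keeps the harness checkout unmodified.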

Additional info

HuggingFace Link to Climate Evaluation Datasets | Paper Link

Citation Information

@misc{thulke2024climategpt,
      title={ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change}, 
      author={David Thulke and Yingbo Gao and Petrus Pelser and Rein Brune and Rricha Jalota and Floris Fok and Michael Ramos and Ian van Wyk and Abdallah Nasir and Hayden Goldstein and Taylor Tragemann and Katie Nguyen and Ariana Fowler and Andrew Stanco and Jon Gabriel and Jordan Taylor and Dean Moro and Evgenii Tsymbalov and Juliette de Waal and Evgeny Matusov and Mudar Yaghi and Mohammad Shihadah and Hermann Ney and Christian Dugast and Jonathan Dotan and Daniel Erasmus},
      year={2024},
      eprint={2401.09646},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
