Pull requests: EleutherAI/lm-evaluation-harness

Pull requests list

make utility function to handle until
#2518 opened Nov 26, 2024 by baberabb
Filters bugfix
#2517 opened Nov 26, 2024 by baberabb
max_length not used
#2515 opened Nov 25, 2024 by lintangsutawika
AraDICE task config file
#2507 opened Nov 19, 2024 by firojalam
fixed mmlu generative response extraction
#2503 opened Nov 18, 2024 by RawthiL
Added regex filter for bbh fewshot
#2502 opened Nov 18, 2024 by RawthiL
Add GigaChat API
#2495 opened Nov 15, 2024 by seldereyy Draft
Yaml crowspairs tasks
#2488 opened Nov 14, 2024 by NAM00
Biology ds
#2486 opened Nov 13, 2024 by deema-A
MILU dataset from AI4Bharat for Indic LLM eval
#2482 opened Nov 12, 2024 by abhinand5
Update citation
#2474 opened Nov 8, 2024 by Sypherd
Use global filter alias
#2473 opened Nov 8, 2024 by Sypherd
allow fewshots for multimodal tasks
#2450 opened Nov 1, 2024 by artemorloff
Add Aggregation for Kobest Benchmark
#2446 opened Oct 31, 2024 by tryumanshow
fix tmlu tmlu_taiwan_specific_tasks tag
#2420 opened Oct 22, 2024 by nike00811
Add YandexGPT API
#2419 opened Oct 21, 2024 by almasgarriev
Fix Type Hints for vLLM CausalLM model
#2408 opened Oct 18, 2024 by qthequartermasterman
Update citation links to Zenodo and DOI to 0.4.5
#2391 opened Oct 9, 2024 by LSinev
add Russian mmlu
#2378 opened Oct 3, 2024 by tatiana-iazykova
Add the BlueBench benchmark
#2369 opened Oct 1, 2024 by shachardon
MMLU Pro Plus
#2366 opened Sep 30, 2024 by asgsaeid