Releases: allenai/natural-instructions
v2.8
v2.7
What's Changed
- fix typos by @manandey in #770
- Addressing data leakage by @yeganehkordi in #774
- Fix a small typo in the tasks README by @aviaefrat in #775
New Contributors
- @aviaefrat made their first contribution in #775
Full Changelog: v2.6...v2.7
v2.6
What's Changed
- Update tasks 200 - 600 based on human feedback by @Palipoor in #761
- Address feedbacks for tasks 600 - 800 by @Palipoor in #762
- Addressing crowdworker feedback by @yeganehkordi in #760
- Update README.md by @swarooprm in #763
- Add standard evaluation setup by @yizhongw in #764
- Fix typo by @manandey in #765
- Fix arg name at rouge by @lkm2835 in #766
- Fix typos by @manandey in #767
- Updates for setting up the leaderboard by @yizhongw in #768
- Add description of how to produce reference file by @yizhongw in #769
Full Changelog: v2.5...v2.6
v2.5
What's Changed
- Update task json files with new sources, urls and categories. by @yizhongw in #742
- Add web nlg, personachat, quartz and wiqa by @Palipoor in #740
- Add a few lines to print the average number of instances by @Palipoor in #743
- Add missing tasks from prompt source by @yeganehkordi in #744
- Add missing tasks from the prompt source by @yeganehkordi in #746
- Fix a typo in the sources by @yeganehkordi in #749
- Change order in the task 1296 by @yeganehkordi in #747
- Drop extra words from the definitions by @yeganehkordi in #750
- Drop extra words from definitions by @yeganehkordi in #751
- Update task1509_evalution_antonyms by @yeganehkordi in #753
- Update task 575 and task 1599 by @yeganehkordi in #756
- Create a single file for crowd annotation using the tasks in test categories. by @yizhongw in #758
- Fix some lines that had extra columns in README by @Palipoor in #754
Full Changelog: v2.4...v2.5
v2.4
What's Changed
- Crowdworker evaluation of the tasks [Work in Progress] by @danyaljj in #276
- Revising categories: Tasks1-500 by @yeganehkordi in #695
- Remove a misleading phrase from instructions in tasks 702, 711 and 712 by @Palipoor in #697
- Add examples to task 697 and modify some instances to fix #661 by @Palipoor in #698
- Drop 1562 and 1563 because of vagueness reported in #685 by @Palipoor in #699
- Fix the imbalance issue in instances by @Palipoor in #702
- Task050 label imbalance bug fix by @swarooprm in #704
- Rename the duplicate task 119 by @Palipoor in #708
- Add English instances to task 265 by @Palipoor in #707
- Revising remaining tasks by @yeganehkordi in #705
- Update test_all by @yeganehkordi in #710
- Remove duplicate and incorrect domains for alt tasks and drop two tasks by @Palipoor in #709
- remove URLs from the tasks by @yeganehkordi in #711
- Adding reasoning field - tasks1-50 by @yeganehkordi in #712
- Adding reasoning field- tasks51-200 by @yeganehkordi in #713
- Adding reasoning-tasks201-400 by @yeganehkordi in #714
- Adding reasoning- tasks401-700 by @yeganehkordi in #715
- Adding reasoning- tasks701-1014 by @yeganehkordi in #716
- Adding reasoning- tasks1015-1316 by @yeganehkordi in #717
- Adding reasoning- tasks1317-end by @yeganehkordi in #718
- Update test and hierarchy by @yeganehkordi in #720
- Fix unicode characters by @Palipoor in #723
- Drop qualitative reasoning by @yeganehkordi in #724
- Tasks1385-1390: ANLI, CB, HellaSwag, WSC by @yeganehkordi in #719
- Script for updating categories, domains, and reasoning by @yeganehkordi in #722
- Change definitions into lists of strings by @Palipoor in #725
- Move Mathematics and Ethical Judgement to reasoning by @yeganehkordi in #727
- Drop factual reasoning by @yeganehkordi in #728
- Fix map files by @yeganehkordi in #729
- Winogrande tasks by @yeganehkordi in #730
- Drop social commonsense reasoning and science branches by @yeganehkordi in #731
- Merge tasks 288 and 1578 by @Palipoor in #732
- Merge all kpa tasks by @Palipoor in #733
- Update source and domain of aquamuse tasks by @yeganehkordi in #735
- Add script to convert "prompt-source" tasks by @yeganehkordi in #734
- Update convert_prompt.py by @yeganehkordi in #736
- Add update_categories script by @yeganehkordi in #741
- Create prompt_task_map.json by @yeganehkordi in #737
Full Changelog: v2.3...v2.4
v2.3
More improvements: fixed bugs and improved the task category assignments.
What's Changed
- Fix for Issue #652. by @RushangKaria in #680
- Fix #488. Marked all pronouns in all instances. by @pulkitverma25 in #676
- Matres tasks enhancement by @yeganehkordi in #684
- Update task1489_sarcasmdetection_tweet_classification.json by @danyaljj in #688
- Tasks 1400-1425: Assigning categories, domains and addressing feedback by @yeganehkordi in #651
- Tasks 1100-1200: Assigning categories and domains by @yeganehkordi in #683
- Tasks 1550-1600: Assigning categories and domains by @yeganehkordi in #687
- Tasks 1600-1726: Assigning categories and domains by @yeganehkordi in #690
- Final version of task-hierarchy by @yeganehkordi in #691
- Drop links from TriviaQA by @yeganehkordi in #689
- Task 901-1000 Updated Files by @aarunku5 in #693
Full Changelog: 2.2...v2.3
v2.2
Fixing bugs and improving the task categorization.
What's Changed
- Fix the missing perspectives in task 738 by @Palipoor in #646
- Fixed human eval issues in tasks 1700+ and some grammar issues by @pulkitverma25 in #644
- Fixing issue #592 - Updating task 564 by @kurbster in #636
- Address feedback for tasks 1300 1400 by @Palipoor in #638
- Addressed Feedback for tasks 738-800 by @pulkitverma25 in #647
- Address feedbacks for tasks 1600 - 1657 by @Palipoor in #641
- Tasks 551-575: Assigning categories, domains, and addressing feedbacks by @yeganehkordi in #594
- Fixes Issue #577. by @RushangKaria in #649
- improve HateEval tasks by @XudongOliverShen in #650
- update tasks 1601 and 1602 to address issue #639 by @Palipoor in #653
- Update examples to Croatian / fix characters tasks 1626 - 1629 by @Palipoor in #643
- Tasks 1283-1300: Assigning categories, domains and addressing feedback by @yeganehkordi in #625
- Update Categories and Domains for Task161-180 by @ghlai9665 in #593
- update categories and domains for task181-199 by @ghlai9665 in #606
- Tasks 1300-1325: Assigning categories and domains by @yeganehkordi in #659
- Tasks 1425-1450: Assigning categories, domains and addressing feedback by @yeganehkordi in #656
- fixed #524 by @pulkitverma25 in #655
- MMMLU task improvements by @Sujan242 in #654
- improve Jigsaw tasks by @XudongOliverShen in #648
- Tasks 1325-1350: Assigning categories and domains by @yeganehkordi in #660
- Tasks 1200-1225: Assigning categories, domains and addressing feedbacks by @yeganehkordi in #601
- Added negative examples for MMMLU tasks. Fixes #569. by @pulkitverma25 in #663
- Removed incomplete instances from task667. Fixes #664. by @pulkitverma25 in #665
- Assigning Categories to Tasks 601 - 700 by @aarunku5 in #657
- categories & domains for task300-321 by @ghlai9665 in #666
- Tasks 1350-1400: Assigning categories and domains by @yeganehkordi in #667
- Tasks1000-1100: Assigning categories and domains by @yeganehkordi in #670
- fixed explanation of task456 (fixes #581) by @pulkitverma25 in #669
- Task322-399 by @ghlai9665 in #668
- Updating Categories from Tasks 700-800 by @aarunku5 in #671
- Updating categories Tasks 801-900 by @aarunku5 in #672
- Tasks 1450-1500: Assigning categories, domains and addressing feedback by @yeganehkordi in #658
- Task 1361: fix by @amirrezamirzaei in #674
- Fix semeval by @amirrezamirzaei in #675
- adding input language in definition assertion by @amirrezamirzaei in #678
- Task463-500, 1500-1550 by @ghlai9665 in #679
- Fix the imbalanced labels in task 27 by @Palipoor in #681
- Fix imbalanced labels in tasks 200 and 202 by @Palipoor in #682
Full Changelog: v2.1...2.2
V2.1
This release contains:
- A collection of newly added tasks
- A new hierarchy of task types and their domains
- Lots of improvements and fixes to the existing tasks.
We will continue to publish more releases as we improve the data and add more experiments.
If you're wondering whether you can still contribute to the repo, the answer is a solid YES!
What's Changed
- Further update task domains & categories; Add Utility Scripts by @ghlai9665 in #436
- Tasks 739-742: lhoestq dataset question and answer generation by @hanut1909 in #317
- Task 829-833: giga_fren, poem_sentiment and poleval2019_mt by @abhinawale12 in #325
- Tasks 877-881: kde4, schema_guided_dstc8 Datasets by @kashyap467 in #334
- Task 934 Turk Simplification by @cosmicishan in #444
- Tasks 1499-1504 by @swarooprm in #461
- Update zest tasks by @yeganehkordi in #455
- Address human feedback by @Palipoor in #447
- Tasks600-606 by @Nikhitha0911 in #285
- corrections in the evalution task file by @atharva-naik in #460
- Tasks 1443 - 1446: Synthetic data by @kurbster in #449
- Tasks 1394-1397 by @swarooprm in #416
- Tasks 1361-1364 by @swarooprm in #410
- Tasks 1437-1442 : DoQA tasks by @kuntalkumarpal in #448
- Tasks 1356-1360 by @swarooprm in #409
- Tasks906-909 DialogRE Answer Generation by @matthew-huff in #341
- Making the error message informative by @swarooprm in #466
- Tasks 1365-1369 by @swarooprm in #411
- Second PR to Address human feedback 34, 35, 44, 45, 48, 49, 52-58, 167, 201-205 by @Palipoor in #464
- Task 1498 by @sharma121amit in #456
- Task 955 Rewrite simple English wikipedia sentences in more sophisticated English by @ghlai9665 in #354
- Task 1505: root09 lexical semantic relation classification task added by @atharva-naik in #470
- Tasks910-916: Bianet and Imppres by @mirror3 in #342
- Tasks 766-769 craigslist_bargains, qed by @ritvik7 in #321
- Tasks 917-921: CoQA Dataset, code_x_glue Dataset by @ShashankDavalgi in #343
- Task 1308-1313 Amazon review by @cosmicishan in #395
- Address human feedback- tasks 301 - 332 by @Palipoor in #469
- Update categories and domains for tasks 029 - 058 by @ghlai9665 in #459
- Address human feedback Tasks 333 - 352 by @Palipoor in #471
- Tasks 1350-1355 by @swarooprm in #408
- A script to fix unicode characters in files by @Palipoor in #473
- Tasks 1506 and 1507 (get minimal text span of dob of celebrity given bio) by @atharva-naik in #472
- Tasks 1334-1339 by @swarooprm in #405
- Tasks 1370-1377 by @swarooprm in #412
- Tasks 1378-1384 by @swarooprm in #413
- Addressing Human Feedback (Crowdworker Evaluation of Tasks) by @kuntalkumarpal in #476
- Tasks 1340-1343 by @swarooprm in #406
- Tasks 1344-1349 by @swarooprm in #407
- Task 902-905: Deceptive Opinion Spam dataset, hate_speech_offensive by @meghpatel in #340
- Task 820-826: proto_qa & peixian/rtGender by @Mda233 in #323
- Tasks 970-1086: sherliic, prachathai67k and pib by @CodeHime in #360
- Tasks 960-967 by @ssingulu in #357
- Tasks 595-599: MOCHA and CUAD Datasets by @akhilkumargudipoodi in #284
- Correcting definition for Task 078 by @kurbster in #485
- Address human feedback on tasks 374 - 392 by @Palipoor in #479
- Address human feedback, tasks 397 - 400 by @Palipoor in #480
- Task 619-624 Ohsumed and Onestop_english datasets by @papanisaicharan in #300
- Task 636-639: multi_voz_v22 dataset by @ayushkalani in #304
- Task 645 - 648: Wiki auto and winograd dataset by @khushal1996 in #306
- Tasks 676-684: OPP-115, HopeEDI, Ollie Datasets by @sskp-kaushik in #315
- Task 757-760: msr_sqa dataset by @sarnshreya in #319
- Tasks 838-842: cdt & para_pdt Datasets by @akhileshamara in #327
- Tasks 1218-1282: TED translation task for an additional 65 language directions by @davidstap in #478
- Task 1196-1217: ATOMIC by @yeganehkordi in #390
- Tasks 857-861 by @ashutosh1608 in #331
- Tasks 872-876: opus_xhosanavy, emotion by @aklagoo in #335
- Tasks 886-889: Quail, Rotten Tomatoes and Go Emotions Dataset by @saianirud in #337
- task 868-871 SQL and MSMARCO by @ujjwalaananth in #359
- Update Hierarchy for Task 059-081 by @ghlai9665 in #474
- Task 894- 901: miam, freebase_qa by @karthikmuru in #339
- Tasks 968-969 : Xcopa by @amirrezamirzaei in #358
- Tasks 922-926:Event2Mind and Coached_Conv_Pref dataset by @yashbhokare in #344
- Tasks 929-932 by @karannaik3797 in #347
- Task 956: leetcode 420 by @DZuoShi in #355
- Task 1161-1164 Coda19 by @cosmicishan in #376
- Task 1542 by @pulkitverma25 in #500
- Task 1166-1167 PennTreeBank and Brown Corpus Parts of Speech Tags by @mishra-sid in #380
- Tasks 1168-1185 : Xcopa by @amirrezamirzaei in #386
- Tasks 1283-1284: Evaluating quality and Informativeness of System Generated Reference by @Sujan242 in #393
- Tasks 1323-1330 by @abhilashreddyy in #402
- Tasks 1407-1417: DART, ajgt_twitter_ar, and youtube_caption_corrections. by @RushangKaria in #429
- Revert "Tasks 1407-1417: DART, ajgt_twitter_ar, and youtube_caption_corrections." by @aarunku5 in #512
- Task1447-1453 and Tasks 1479-1497 Drug_Extraction by @Ishani-Mondal in #450
- 576 HW (1-5) by @aarunku5 in #513
- Task 1551 + Missing Task Numbers from Issue #482 by @pulkitverma25 in #502
- task_201_resolve by @aarunku5 in #521
- Address h...
V2.0
We wrapped up our first public expansion effort on October 15, 2021. During this time, the community contributed over 1500 tasks!! 🎉
We will continue to publish more releases as we improve the data and add more experiments.
If you're wondering whether you can still contribute to the repo, the answer is a solid YES!
Note: v1 data is accessible here: https://instructions.apps.allenai.org/
What's Changed
- natural instructions-v1 by @danyaljj in #1
- Updating Task Category in Readme by @swarooprm in #2
- Repeat-copy task by @danyaljj in #4
- Changed output format in examples to non-list, faq by @swarooprm in #9
- Update README.md by @swarooprm in #12
- task 73 by @amirrezamirzaei in #11
- Shailaja tasks by @shailaja183 in #14
- Added 4 tasks using SPLASH dataset and 1 using the CoNaLa dataset. by @kurbster in #15
- Update README.md: clarifying "meaningful contribution" by @danyaljj in #23
- babi dataset task 1- 3 NI tasks (question generation, answer generation, identify supporting fact for QA) by @shailaja183 in #22
- Update README.md: commits not connected to Github profile by @danyaljj in #30
- Adding tasks using examples from CoNaLa dataset. by @kurbster in #29
- Removing repetitive examples. by @kurbster in #40
- Repeat instance fix by @shailaja183 in #42
- squad2.0 by @shailaja183 in #32
- Fixing several outstanding issues by @danyaljj in #38
- Task103 facts to story by @Mihir3009 in #34
- Updated files for task 102 and 103 by @Mihir3009 in #49
- Update README.md by @swarooprm in #50
- fixing issue #13, #43, #31, #24 by @swarooprm in #53
- removing randomization (issue #55) by @swarooprm in #56
- Synthetic data by @nrjvarshney in #27
- task108 abuse detection by @Maitreyapatel in #41
- ASSET two tasks 111 and 112 by @Palipoor in #51
- Synthetic frequency tasks by @nrjvarshney in #57
- Task 105 sentence generation by @Mirzyaaliii in #62
- Add task117_spl_translation_en_de task by @Mehrad0711 in #66
- task 104 by @amirrezamirzaei in #35
- Update crowdsourcing.md by @danyaljj in #69
- Update task018_mctaco_temporal_reasoning_presence.json by @danyaljj in #67
- Update task019_mctaco_temporal_reasoning_category.json by @danyaljj in #68
- Task115 help classification by @Maitreyapatel in #59
- PIQA Dataset - Two subtasks by @kuntalkumarpal in #17
- Task 67-72: AbductiveNLI by @aarunku5 in #10
- Task 119, 120 and 121: ZEST dataset by @yeganehkordi in #79
- Task 65 -66: Time travel by @nrjvarshney in #8
- Task 126-131: SCAN Dataset by @eshaanpathak in #81
- Task 116 Com2sense task - Commonsense reasoning by @Palipoor in #60
- Task 137-140: Detoxifying LMs Prompt Completion Dataset by @eshaanpathak in #84
- Update README.md by @swarooprm in #86
- Task 141-143: Odd Man Out Dataset by @eshaanpathak in #88
- Task 151-154: Theory of Mind QA Dataset by @eshaanpathak in #91
- Task 155: Count number of nouns/verbs in the given sentence by @nrjvarshney in #92
- Add task 144 from SubjQA by @Palipoor in #89
- Task110_logic2text by @Mihir3009 in #100
- Task 156: CODAH Dataset by @eshaanpathak in #96
- Task 122-125: New Conala tasks by @kurbster in #80
- Task 133-136: WinoWhy (ACL 2020 paper) by @colinzhaoust in #83
- Task 164-165: MCScript by @Mirzyaaliii in #105
- Task166: ClariQ by @Mirzyaaliii in #106
- Task 177: Para-nmt by @Mirzyaaliii in #110
- Task 132: DAIS Dataset by @Mirzyaaliii in #82
- Task 182: DuoRC dataset by @yeganehkordi in #115
- Task 157: Count vowels consonants by @nrjvarshney in #97
- Task 158 - 159: Frequency of words by @nrjvarshney in #98
- Task 160 : Replace letter in sentence by @nrjvarshney in #99
- Task 161-163: Count words that contain/start_with/end_with the given letter by @nrjvarshney in #101
- Task 109 - Spam SMS Detection by @SavanDoshi in #117
- task 170, 191-192: HotpotQA by @Mihir3009 in #107
- Tasks 179 - 181 from the paper "Effective Crowd-Annotation of Participants, Interventions, and Outcomes..." by @Palipoor in #114
- Task 176 and Task 184 from dataset BREAK by @Palipoor in #109
- Task 167-169: StrategyQA dataset by @yeganehkordi in #104
- fix: typo spelling grammar by @slowy07 in #125
- Update README.md by @danyaljj in #126
- Fixed unit test by @kurbster in #124
- Task 118-119: semeval math challenges by @amirrezamirzaei in #76
- Task - 166 ClariQ Correction by @Mirzyaaliii in #134
- Update README.md by @danyaljj in #139
- Task 195-196: Sentiment140 by @Mihir3009 in #122
- Update README.md by @danyaljj in #146
- Task - 178: QuaRTz by @Mirzyaaliii in #111
- Tasks 205-208: New Synthetic tasks by @kurbster in #123
- Add tasks 171-175 by @Mehrad0711 in #108
- adding tasks 145-150 by @liusiyi641 in #90
- Task 193-194: DuoRC by @yeganehkordi in #121
- task 224 by @amirrezamirzaei in #137
- Changing the Expected Schema by @swarooprm in #147
- Task209 stance detection by @Maitreyapatel in #128
- Task 210-212: logic2text by @Mihir3009 in #129
- Task-223: QuaRTz by @Mirzyaaliii in #136
- Task 227: ClariQ by @Mirzyaaliii in ht...