Call for Participation:
TextGraphs-15: Third Shared Task on Explanation Regeneration
https://competitions.codalab.org/competitions/29228
We invite participation in the 3rd Shared Task on Explanation Regeneration
associated with the 15th Workshop on Graph-Based Natural Language Processing
(TextGraphs-15).
All systems participating in the shared task will be invited to submit system description papers.
Each system description paper will be peer-reviewed by two other participating teams and will be
presented as a poster at the main workshop: https://sites.google.com/view/textgraphs2021.
Overview
========
Multi-hop inference is the task of combining more than one piece of information
to solve an inference task, such as question answering. This can take many
forms, from combining free-text sentences read from books or the web, to
combining linked facts from a structured knowledge base. The Shared Task
on Explanation Regeneration asks participants to develop methods that
reconstruct large explanations for science questions, using a corpus of
gold explanations that provides supervision and instrumentation for this
multi-hop inference task. Each explanation is represented as an
"explanation graph", a set of atomic facts (between 1 and 16 per explanation,
drawn from a knowledge base of 9,000 facts) that, together, form a detailed
explanation of the reasoning required to answer and explain a question.
Explanation Regeneration is a stepping-stone towards general multi-hop inference
over language. In this shared task we frame explanation regeneration as a
ranking task, where the inputs to a given system consist of questions and their
correct answers. Participating systems must then rank atomic facts from a
provided semi-structured knowledge base such that the combination of top-ranked
facts provides a detailed explanation for the answer. This requires combining
scientific and common-sense/world knowledge with compositional inference.
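As a concrete illustration of this ranking setup, the sketch below scores every
fact by TF-IDF cosine similarity to the concatenated question and answer text.
It is only a minimal lexical baseline sketch, assuming the facts are available
as plain-text sentences; the use of scikit-learn and the toy data are
illustrative and not part of the official task setup.

    # Minimal lexical ranking sketch (illustrative only, not the official baseline).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def rank_facts(question_and_answer, facts):
        # Score each knowledge-base fact by TF-IDF cosine similarity to the
        # question text concatenated with its correct answer.
        vectorizer = TfidfVectorizer()
        fact_vectors = vectorizer.fit_transform(facts)
        query_vector = vectorizer.transform([question_and_answer])
        scores = cosine_similarity(query_vector, fact_vectors)[0]
        # Return fact indices, best-scoring first.
        return sorted(range(len(facts)), key=lambda i: -scores[i])

    facts = [
        "a girl means a human girl",
        "humans are living organisms",
        "eating is when an organism takes in nutrients in the form of food",
        "photosynthesis is when plants convert sunlight into food",
    ]
    query = ("Which of the following is an example of an organism taking in "
             "nutrients? a girl eating an apple")
    print([facts[i] for i in rank_facts(query, facts)])

In previous years, the strongest submissions replaced this kind of lexical
scorer with trained language-model rerankers.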
While large language models (BERT, ERNIE) achieved the highest performance in
the 2019 and 2020 shared tasks, substantially advancing the state-of-the-art
over previous methods, absolute performance remains modest, highlighting the
difficulty of generating detailed explanations through multi-hop reasoning.
*NEW FOR 2021:*
Many-fact multi-hop inference is challenging because there are often multiple
ways of assembling a good explanation for a given question. This 2021
instantiation of the shared task focuses on the theme of determining relevance
versus completeness in large multi-hop explanations. To this end, this year
we include a very large dataset of approximately 250,000 expert-annotated
relevancy ratings for facts ranked highly by baseline language models from
previous years (e.g. BERT, RoBERTa).
Submissions using a variety of methods (graph-based or otherwise) are
encouraged. Submissions that evaluate how well existing models designed for
2-hop multi-hop question answering datasets (e.g. HotpotQA, QASC) perform
at many-fact multi-hop explanation regeneration are welcome.
Example
=======
For example, for the question: "Which of the following is an example of an
organism taking in nutrients?" with the correct answer: "a girl eating an
apple", an ideal system would rank the following explanatory statements
at the top of its ranked list of facts:
1. A girl means a human girl.
2. Humans are living organisms.
3. Eating is when an organism takes in nutrients in the form of food.
4. Fruits are kinds of foods.
5. An apple is a kind of fruit.
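One way to picture the resulting "explanation graph" is to treat each fact as a
node and connect two facts whenever they share a content word (e.g. "organism",
"food"). The sketch below does this with networkx; the lexical-overlap linking
rule, the stopword list, and the trailing-"s" trimming are simplifying
assumptions for illustration, not the corpus's actual annotation scheme.

    # Toy explanation-graph construction; the linking rule is an assumption.
    import itertools
    import networkx as nx

    facts = [
        "a girl means a human girl",
        "humans are living organisms",
        "eating is when an organism takes in nutrients in the form of food",
        "fruits are kinds of foods",
        "an apple is a kind of fruit",
    ]
    stopwords = {"a", "an", "are", "in", "is", "kind", "kinds", "means", "of",
                 "the", "when", "takes"}

    def content_words(sentence):
        # Crude normalisation: lowercase, strip punctuation, drop stopwords,
        # and trim a trailing "s" as a naive stand-in for lemmatisation.
        words = {word.strip(".,").lower() for word in sentence.split()}
        return {word.rstrip("s") for word in words if word not in stopwords}

    graph = nx.Graph()
    graph.add_nodes_from(facts)
    for fact_a, fact_b in itertools.combinations(facts, 2):
        shared = content_words(fact_a) & content_words(fact_b)
        if shared:
            graph.add_edge(fact_a, fact_b, shared=sorted(shared))

    # Each edge records the words that tie two facts together; here the five
    # facts form a chain through "human", "organism", "food", and "fruit".
    for fact_a, fact_b, data in graph.edges(data=True):
        print(data["shared"], ":", fact_a, "<->", fact_b)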
Important Dates
===============
* 2021-02-15: Training data release
* 2021-03-10: Test data release; Evaluation start
* 2021-03-24: Evaluation end
* 2021-04-01: System description paper deadline
* 2021-04-13: Deadline for reviews of system description papers
* 2021-04-15: Author notifications
* 2021-04-26: Camera-ready description paper deadline
* 2021-06-11: TextGraphs-15 workshop
Data
====
The data used in this shared task contains approximately 5,100 questions drawn
from the AI2 Reasoning Challenge (ARC) dataset (Clark et al., 2018), together
with explanation sentences for their correct answers drawn from the
WorldTree V2 corpus (Xie et al., 2020, Jansen et al., 2018). For 2021, this
has been augmented with a new set of approximately 250,000 expert-annotated
relevancy ratings.
The knowledge base supporting these questions contains approximately 9,000 facts.
To encourage a variety of solving methods, the knowledge base is available both as
plain-text sentences (unstructured) and as semi-structured tables. Facts are a
combination of scientific knowledge as well as common-sense/world knowledge.
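For systems that prefer to work from the semi-structured tables, the sketch
below shows one way they could be flattened back into plain-text facts. It
assumes the tables are distributed as tab-separated files with one fact per
row; the directory layout, file extension, and column handling here are
assumptions, so please consult the task repository for the actual format.

    # Hypothetical loader: flatten TSV tables into plain-text facts.
    import csv
    from pathlib import Path

    def load_table_facts(table_dir):
        facts = []
        for path in sorted(Path(table_dir).glob("*.tsv")):
            with open(path, newline="", encoding="utf-8") as handle:
                reader = csv.reader(handle, delimiter="\t")
                next(reader, None)  # skip the header row (an assumed layout)
                for row in reader:
                    cells = [cell.strip() for cell in row if cell.strip()]
                    if cells:
                        facts.append(" ".join(cells))
        return facts

    # Example usage with a hypothetical directory name:
    # facts = load_table_facts("tables/")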
Please see the shared task website for more details:
https://competitions.codalab.org/competitions/29228
https://github.com/cognitiveailab/tg2021task