TY - GEN
T1 - Multi-hop Inference for Sentence-level TextGraphs
T2 - 12th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - in conjunction with the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018
AU - Jansen, Peter A.
N1 - Publisher Copyright:
© 2018 Association for Computational Linguistics.
PY - 2018
Y1 - 2018
N2 - Question Answering for complex questions is often modelled as a graph construction or traversal task, where a solver must build or traverse a graph of facts that answer and explain a given question. This “multi-hop” inference has been shown to be extremely challenging, with few models able to aggregate more than two facts before being overwhelmed by “semantic drift”, or the tendency for long chains of facts to quickly drift off topic. This is a major barrier to current inference models, as even elementary science questions require an average of 4 to 6 facts to answer and explain. In this work we empirically characterize the difficulty of building or traversing a graph of sentences connected by lexical overlap, by evaluating chance sentence aggregation quality through 9,784 manually-annotated judgements across knowledge graphs built from three free-text corpora (including study guides and Simple Wikipedia). We demonstrate that semantic drift tends to be high and aggregation quality low, at between 0.04% and 3%, and highlight scenarios that maximize the likelihood of meaningfully combining information.
UR - http://www.scopus.com/inward/record.url?scp=85180130011&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85180130011&partnerID=8YFLogxK
DO - 10.18653/v1/w18-1703
M3 - Conference contribution
AN - SCOPUS:85180130011
T3 - NAACL HLT 2018 - Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - Proceedings of the 12th Workshop
SP - 12
EP - 17
BT - NAACL HLT 2018 - Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - Proceedings of the 12th Workshop
A2 - Glavas, Goran
A2 - Somasundaran, Swapna
A2 - Riedl, Martin
A2 - Hovy, Eduard
PB - Association for Computational Linguistics
Y2 - 6 June 2018
ER -