TY - GEN
T1 - How May I Help You? Using Neural Text Simplification to Improve Downstream NLP Tasks
AU - Van, Hoang
AU - Tang, Zheng
AU - Surdeanu, Mihai
N1 - Funding Information:
This work was supported by the Defense Advanced Research Projects Agency (DARPA) under the World Modelers and HABITUS programs. Mihai Surdeanu declares a financial interest in lum.ai. This interest has been properly disclosed to the University of Arizona Institutional Review Committee, and is managed in accordance with its conflict of interest policies.
Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
N2 - The general goal of text simplification (TS) is to reduce text complexity for human consumption. In this paper, we investigate another potential use of neural TS: assisting machines performing natural language processing (NLP) tasks. We evaluate the use of neural TS in two ways: simplifying input texts at prediction time and augmenting data to provide machines with additional information during training. We demonstrate that the latter scenario provides positive effects on machine performance on two separate datasets. In particular, the latter use of TS significantly improves the performances of LSTM (1.82-1.98%) and SpanBERT (0.7-1.3%) extractors on TACRED, a complex, large-scale, real-world relation extraction task. Further, the same setting yields significant improvements of up to 0.65% matched and 0.62% mismatched accuracies for a BERT text classifier on MNLI, a practical natural language inference dataset.
AB - The general goal of text simplification (TS) is to reduce text complexity for human consumption. In this paper, we investigate another potential use of neural TS: assisting machines performing natural language processing (NLP) tasks. We evaluate the use of neural TS in two ways: simplifying input texts at prediction time and augmenting data to provide machines with additional information during training. We demonstrate that the latter scenario provides positive effects on machine performance on two separate datasets. In particular, the latter use of TS significantly improves the performances of LSTM (1.82-1.98%) and SpanBERT (0.7-1.3%) extractors on TACRED, a complex, large-scale, real-world relation extraction task. Further, the same setting yields significant improvements of up to 0.65% matched and 0.62% mismatched accuracies for a BERT text classifier on MNLI, a practical natural language inference dataset.
UR - http://www.scopus.com/inward/record.url?scp=85129220971&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129220971&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85129220971
T3 - Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021
SP - 4074
EP - 4080
BT - Findings of the Association for Computational Linguistics: EMNLP 2021
A2 - Moens, Marie-Francine
A2 - Huang, Xuanjing
A2 - Specia, Lucia
A2 - Yih, Scott Wen-Tau
PB - Association for Computational Linguistics (ACL)
T2 - 2021 Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021
Y2 - 7 November 2021 through 11 November 2021
ER -