TY - GEN
T1 - Students Who Study Together Learn Better
T2 - 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
AU - Mithun, Mitch Paul
AU - Suntwal, Sandeep
AU - Surdeanu, Mihai
N1 - Funding Information:
This work was supported by the Defense Advanced Research Projects Agency (DARPA) under the World Modelers and HABITUS programs. Mihai Surdeanu declares a financial interest in lum.ai. This interest has been properly disclosed to the University of Arizona Institutional Review Committee, and is managed in accordance with its conflict of interest policies. The authors would also like to thank Steve Bethard, Becky Sharp, and Marco Valenzuela-Escárcega for all their valuable comments and reviews.
Publisher Copyright:
© 2021 Association for Computational Linguistics
PY - 2021
Y1 - 2021
N2 - While neural networks produce state-of-the-art performance in several NLP tasks, they depend heavily on lexicalized information, which transfers poorly between domains. Previous work (Suntwal et al., 2019) proposed delexicalization as a form of knowledge distillation to reduce dependency on such lexical artifacts. However, a critical unsolved issue remains: how much delexicalization should be applied? A little helps reduce over-fitting, but too much discards useful information. We propose Group Learning (GL), a knowledge and model distillation approach for fact verification. In our method, while multiple student models have access to different delexicalized data views, they are encouraged to independently learn from each other through pair-wise consistency losses. In several cross-domain experiments between the FEVER and FNC fact verification datasets, we show that our approach learns the best delexicalization strategy for the given training dataset and outperforms state-of-the-art classifiers that rely on the original data.
AB - While neural networks produce state-of-the-art performance in several NLP tasks, they depend heavily on lexicalized information, which transfers poorly between domains. Previous work (Suntwal et al., 2019) proposed delexicalization as a form of knowledge distillation to reduce dependency on such lexical artifacts. However, a critical unsolved issue remains: how much delexicalization should be applied? A little helps reduce over-fitting, but too much discards useful information. We propose Group Learning (GL), a knowledge and model distillation approach for fact verification. In our method, while multiple student models have access to different delexicalized data views, they are encouraged to independently learn from each other through pair-wise consistency losses. In several cross-domain experiments between the FEVER and FNC fact verification datasets, we show that our approach learns the best delexicalization strategy for the given training dataset and outperforms state-of-the-art classifiers that rely on the original data.
UR - http://www.scopus.com/inward/record.url?scp=85127393430&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127393430&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85127393430
T3 - EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
SP - 6968
EP - 6973
BT - EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
PB - Association for Computational Linguistics (ACL)
Y2 - 7 November 2021 through 11 November 2021
ER -
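
Note (editor): the abstract above describes the Group Learning setup only at a high level: several student models see different delexicalized views of the same data and are tied together by pair-wise consistency losses. The following is a minimal, hypothetical PyTorch sketch of that idea, not the paper's implementation; the loss form (cross-entropy on gold labels plus a symmetric KL divergence between every pair of student output distributions) and the name group_learning_loss are assumptions made here for illustration.

import torch
import torch.nn.functional as F


def group_learning_loss(student_logits, labels, consistency_weight=1.0):
    # student_logits: list of [batch, num_classes] tensors, one per student,
    # each computed from that student's own delexicalized view of the batch.
    # Supervised term: every student is trained on the gold labels.
    supervised = sum(F.cross_entropy(logits, labels) for logits in student_logits)

    # Pair-wise consistency term: each pair of students is pushed toward
    # similar class distributions via a symmetric KL divergence.
    consistency = torch.zeros(())
    for i in range(len(student_logits)):
        for j in range(i + 1, len(student_logits)):
            p = F.log_softmax(student_logits[i], dim=-1)
            q = F.log_softmax(student_logits[j], dim=-1)
            consistency = consistency + 0.5 * (
                F.kl_div(p, q, log_target=True, reduction="batchmean")
                + F.kl_div(q, p, log_target=True, reduction="batchmean")
            )

    return supervised + consistency_weight * consistency


# Toy usage: three students, three-way fact-verification labels (e.g. FEVER-style).
torch.manual_seed(0)
logits = [torch.randn(4, 3, requires_grad=True) for _ in range(3)]
labels = torch.tensor([0, 1, 2, 1])
loss = group_learning_loss(logits, labels)
loss.backward()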