TY - GEN
T1 - Data and Model Distillation as a Solution for Domain-transferable Fact Verification
AU - Mithun, Mitch Paul
AU - Suntwal, Sandeep
AU - Surdeanu, Mihai
N1 - Funding Information:
This work was supported by the Defense Advanced Research Projects Agency (DARPA) under the World Modelers program, grant number W911NF1810014, and by the Bill and Melinda Gates Foundation HBGDki Initiative. Mihai Surdeanu declares a financial interest in lum.ai. This interest has been properly disclosed to the University of Arizona Institutional Review Committee and is managed in accordance with its conflict of interest policies. The authors would also like to thank Becky Sharp and Marco Valenzuela-Escárcega for all their valuable comments and reviews.
Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
N2 - While neural networks produce state-of-the-art performance in several NLP tasks, they generally depend heavily on lexicalized information, which transfers poorly between domains. We present a combination of two strategies to mitigate this dependence on lexicalized information in fact verification tasks. In particular, we propose a data distillation technique for delexicalization, which we then combine with a model distillation method to prevent aggressive data distillation. We show that with our solution, the performance of an existing state-of-the-art model not only remains on par with that of the same model trained on fully lexicalized data, but also exceeds it when tested out of domain. We show that the technique we present encourages models to extract transferable facts from a given fact verification dataset.
AB - While neural networks produce state-of-the-art performance in several NLP tasks, they generally depend heavily on lexicalized information, which transfers poorly between domains. We present a combination of two strategies to mitigate this dependence on lexicalized information in fact verification tasks. In particular, we propose a data distillation technique for delexicalization, which we then combine with a model distillation method to prevent aggressive data distillation. We show that with our solution, the performance of an existing state-of-the-art model not only remains on par with that of the same model trained on fully lexicalized data, but also exceeds it when tested out of domain. We show that the technique we present encourages models to extract transferable facts from a given fact verification dataset.
UR - http://www.scopus.com/inward/record.url?scp=85127454868&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127454868&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85127454868
T3 - NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
SP - 4546
EP - 4552
BT - NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021
Y2 - 6 June 2021 through 11 June 2021
ER -