TY - CONF
T1 - Do pretrained transformers infer telicity like humans?
AU - Zhao, Yiyun
AU - Ngui, Jian Gang
AU - Hartley, Lucy Hall
AU - Bethard, Steven
N1 - Funding Information:
Research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under Award Number R01LM010090. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Publisher Copyright:
© 2021 Association for Computational Linguistics.
PY - 2021
Y1 - 2021
N2 - Pretrained transformer-based language models achieve state-of-the-art performance in many NLP tasks, but it is an open question whether the knowledge acquired by the models during pretraining resembles the linguistic knowledge of humans. We present both humans and pretrained transformers with descriptions of events, and measure their preference for telic interpretations (the event has a natural endpoint) or atelic interpretations (the event does not have a natural endpoint). To measure these preferences and determine what factors influence them, we design an English test and a novel-word test that include a variety of linguistic cues (noun phrase quantity, resultative structure, contextual information, temporal units) that bias toward certain interpretations. We find that humans’ choice of telicity interpretation is reliably influenced by theoretically-motivated cues, transformer models (BERT and RoBERTa) are influenced by some (though not all) of the cues, and transformer models often rely more heavily on temporal units than humans do.
UR - http://www.scopus.com/inward/record.url?scp=85128881849&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85128881849&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85128881849
T3 - CoNLL 2021 - 25th Conference on Computational Natural Language Learning, Proceedings
SP - 72
EP - 81
BT - CoNLL 2021 - 25th Conference on Computational Natural Language Learning, Proceedings
A2 - Bisazza, Arianna
A2 - Abend, Omri
PB - Association for Computational Linguistics (ACL)
T2 - 25th Conference on Computational Natural Language Learning, CoNLL 2021
Y2 - 10 November 2021 through 11 November 2021
ER -