TY - GEN
T1 - hinoki at SemEval-2024 Task 7
T2 - 18th International Workshop on Semantic Evaluation, SemEval 2024, co-located with the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2024
AU - Crum, Hinoki
AU - Bethard, Steven
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
AB - Numerical reasoning is challenging even for large pre-trained language models. We show that while T5 models are capable of generating relevant headlines with proper numerical values, they can also make mistakes in reading comprehension and miscalculate numerical values. To overcome these issues, we propose a two-step training process: first train models to read text and generate formal representations of calculations, then train models to read calculations and generate numerical values. On the SemEval 2024 Task 7 headline fill-in-the-blank task, our two-stage Flan-T5-based approach achieved 88% accuracy. On the headline generation task, our T5-based approach achieved RougeL of 0.390, BERT F1 Score of 0.453, and MoverScore of 0.587.
UR - http://www.scopus.com/inward/record.url?scp=85215506011&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85215506011&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85215506011
T3 - SemEval 2024 - 18th International Workshop on Semantic Evaluation, Proceedings of the Workshop
SP - 34
EP - 39
BT - SemEval 2024 - 18th International Workshop on Semantic Evaluation, Proceedings of the Workshop
A2 - Ojha, Atul Kr.
A2 - Doğruöz, A. Seza
A2 - Madabushi, Harish Tayyar
A2 - Da San Martino, Giovanni
A2 - Rosenthal, Sara
A2 - Rosá, Aiala
PB - Association for Computational Linguistics (ACL)
Y2 - 20 June 2024 through 21 June 2024
ER -