TY - GEN
T1 - CLULab-UofA at SemEval-2024 Task 8
T2 - 18th International Workshop on Semantic Evaluation, SemEval 2024, co-located with the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2024
AU - Rezaei, Mohammad Hossein
AU - Kwon, Yeaeun
AU - Sanayei, Reza
AU - Singh, Abhyuday
AU - Bethard, Steven
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - Detecting machine-generated text is a critical task in the era of large language models. In this paper, we present our systems for SemEval-2024 Task 8, which focuses on multi-class classification to discern between human-written and machine-generated texts by five state-of-the-art large language models. We propose three different systems: unsupervised text similarity, triplet-loss-trained text similarity, and text classification. We show that the triplet-loss-trained text similarity system outperforms the other systems, achieving 80% accuracy on the test set and surpassing the baseline model for this subtask. Additionally, our text classification system, which takes into account sentence paraphrases generated by the candidate models, also outperforms the unsupervised text similarity system, achieving 74% accuracy.
AB - Detecting machine-generated text is a critical task in the era of large language models. In this paper, we present our systems for SemEval-2024 Task 8, which focuses on multi-class classification to discern between human-written and machine-generated texts by five state-of-the-art large language models. We propose three different systems: unsupervised text similarity, triplet-loss-trained text similarity, and text classification. We show that the triplet-loss-trained text similarity system outperforms the other systems, achieving 80% accuracy on the test set and surpassing the baseline model for this subtask. Additionally, our text classification system, which takes into account sentence paraphrases generated by the candidate models, also outperforms the unsupervised text similarity system, achieving 74% accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85215534620&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85215534620&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85215534620
T3 - SemEval 2024 - 18th International Workshop on Semantic Evaluation, Proceedings of the Workshop
SP - 1498
EP - 1504
BT - SemEval 2024 - 18th International Workshop on Semantic Evaluation, Proceedings of the Workshop
A2 - Ojha, Atul Kr.
A2 - Doğruöz, A. Seza
A2 - Madabushi, Harish Tayyar
A2 - Da San Martino, Giovanni
A2 - Rosenthal, Sara
A2 - Rosá, Aiala
PB - Association for Computational Linguistics (ACL)
Y2 - 20 June 2024 through 21 June 2024
ER -