DLS@CU-CORE: A Simple Machine Learning Model of Semantic Textual Similarity

Md Arafat Sultan, Steven Bethard, Tamara Sumner

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

We present a system submitted to the Semantic Textual Similarity (STS) task at the Second Joint Conference on Lexical and Computational Semantics (∗SEM 2013). Given two short text fragments, the goal of the system is to determine their semantic similarity. Our system uses three different measures of text similarity: word n-gram overlap, character n-gram overlap, and semantic overlap. Using these measures as features, it trains a support vector regression model on SemEval STS 2012 data. This model is then applied to the STS 2013 data to compute textual similarities. Two different selections of training data result in very different performance levels: one selection yielded a correlation of 0.4135 with gold standards in the official evaluation (ranked 63rd among all systems), while the other yielded a correlation of 0.5352 (which would have ranked 21st).
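The two surface-level features named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of Jaccard overlap, the n-gram sizes, and the whitespace tokenization are all assumptions, and the semantic-overlap feature is omitted because it depends on lexical resources the abstract does not specify. The `pearson` helper assumes the reported correlation is Pearson's r.

```python
from math import sqrt


def ngrams(items, n):
    """Return the set of contiguous n-grams (as tuples) in a sequence."""
    return {tuple(items[i:i + n]) for i in range(len(items) - n + 1)}


def overlap(a, b):
    """Jaccard overlap of two sets (assumed measure); 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def features(s1, s2, word_n=1, char_n=3):
    """Word and character n-gram overlap for one sentence pair.

    word_n and char_n are hypothetical defaults, not values from the paper.
    """
    s1, s2 = s1.lower(), s2.lower()
    word_sim = overlap(ngrams(s1.split(), word_n), ngrams(s2.split(), word_n))
    char_sim = overlap(ngrams(s1, char_n), ngrams(s2, char_n))
    return [word_sim, char_sim]


def pearson(xs, ys):
    """Pearson correlation between predicted and gold similarity scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # Undefined when either series is constant; return 0.0 by convention here.
    return cov / (sx * sy) if sx and sy else 0.0
```

In a full pipeline, the feature vectors for each training pair would be passed with their gold scores to a support vector regressor (for example scikit-learn's `SVR`), and `pearson` would compare the model's predictions on held-out pairs against the gold scores.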

Original language: English (US)
Title of host publication: SEM 2013 - 2nd Joint Conference on Lexical and Computational Semantics, Proceedings of the Main Conference and the Shared Task
Subtitle of host publication: Semantic Textual Similarity
Editors: Mona Diab, Tim Baldwin, Marco Baroni
Publisher: Association for Computational Linguistics (ACL)
Pages: 176-180
Number of pages: 5
ISBN (Electronic): 9781937284480
State: Published - 2013
Externally published: Yes
Event: 2nd Joint Conference on Lexical and Computational Semantics, SEM 2013 - Atlanta, United States
Duration: Jun 13 2013 to Jun 14 2013

Publication series

Name: SEM 2013 - 2nd Joint Conference on Lexical and Computational Semantics, Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity

Conference

Conference: 2nd Joint Conference on Lexical and Computational Semantics, SEM 2013
Country/Territory: United States
City: Atlanta
Period: 6/13/13 to 6/14/13

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
