An Analysis of Bootstrapping for the Recognition of Temporal Expressions

Jordi Poveda, Mihai Surdeanu, Jordi Turmo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

We present a semi-supervised (bootstrapping) approach to the extraction of time expression mentions in large unlabelled corpora. Because the only supervision is in the form of seed examples, it becomes necessary to resort to heuristics to rank and filter out spurious patterns and candidate time expressions. The application of bootstrapping to time expression recognition is, to the best of our knowledge, novel. In this paper, we describe one such architecture for bootstrapping Information Extraction (IE) patterns (suited to the extraction of entities, as opposed to events or relations) and summarize our experimental findings. These indicate that a pattern set with a good increase in recall with respect to the seeds is achievable within our framework while, on the other hand, the decrease in precision in successive iterations is successfully controlled through the use of ranking and selection heuristics. Experiments are still underway to achieve the best use of these heuristics and other parameters of the bootstrapping algorithm.
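
The general idea of seed-driven pattern bootstrapping with a ranking and selection step can be illustrated with a minimal sketch. This is not the architecture described in the paper: the function name `bootstrap`, the one-token (left, right) context patterns, the restriction to single-token expressions, the precision-like score, and the `top_k` and `min_score` parameters are all assumptions chosen for illustration.

```python
from collections import defaultdict


def bootstrap(sentences, seeds, iterations=2, top_k=2, min_score=0.5):
    """Toy pattern bootstrapping over tokenized sentences (illustrative only).

    A pattern is a hypothetical (left token, right token) context, and a
    candidate expression is the single token found between the two.
    """
    # Map each context to the set of tokens it surrounds in the corpus.
    matches = defaultdict(set)
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        for i in range(1, len(padded) - 1):
            matches[(padded[i - 1], padded[i + 1])].add(padded[i])

    known = set(seeds)
    patterns = set()
    for _ in range(iterations):
        # Rank contexts by the fraction of their fillers that are already
        # known expressions (an assumed precision-like heuristic).
        scored = []
        for ctx, fillers in matches.items():
            hits = len(fillers & known)
            if hits:
                scored.append((hits / len(fillers), hits, ctx))
        scored.sort(key=lambda s: (-s[0], -s[1], s[2]))

        # Keep only the best-ranked contexts that pass the score threshold.
        selected = [ctx for score, _, ctx in scored[:top_k] if score >= min_score]
        patterns.update(selected)

        # Accept every filler of a selected context as a new expression.
        for ctx in selected:
            known |= matches[ctx]
    return known, patterns


if __name__ == "__main__":
    corpus = [
        ["meeting", "on", "Monday", "at", "noon"],
        ["meeting", "on", "Tuesday", "at", "noon"],
        ["lunch", "on", "Friday", "at", "home"],
        ["sat", "on", "chairs", "in", "rooms"],
    ]
    found, learned = bootstrap(corpus, {"Monday", "Tuesday"})
    print(sorted(found), sorted(learned))
```

In this toy corpus the two seeds are enough to select the context `("on", "at")`, which then picks up "Friday" while leaving "chairs" (a different context) out. Lowering `min_score` or raising `top_k` would trade precision for recall, which is the kind of tension the ranking and selection heuristics in the abstract are meant to control.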

Original language: English (US)
Title of host publication: NAACL HLT 2009 - Semi-Supervised Learning for Natural Language Processing, Proceedings of the Workshop
Editors: Qin Iris Wang, Kevin Duh, Dekang Lin
Publisher: Association for Computational Linguistics (ACL)
Pages: 49-57
Number of pages: 9
ISBN (Electronic): 9781932432381
State: Published - 2009
Externally published: Yes
Event: 2009 Semi-Supervised Learning for Natural Language Processing, SSL-NLP2009 - Boulder, United States
Duration: Jun 4 2009 → …

Publication series

Name: NAACL HLT 2009 - Semi-Supervised Learning for Natural Language Processing, Proceedings of the Workshop

Conference

Conference: 2009 Semi-Supervised Learning for Natural Language Processing, SSL-NLP2009
Country/Territory: United States
City: Boulder
Period: 6/4/09 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Language and Linguistics
  • Linguistics and Language
