SnapToGrid: From statistical to interpretable models for biomedical information extraction

Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Dane Bell, Mihai Surdeanu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

We propose an approach for biomedical information extraction that marries the advantages of machine learning models, e.g., learning directly from data, with the benefits of rule-based approaches, e.g., interpretability. Our approach starts by training a feature-based statistical model, then converts this model to a rule-based variant by translating its features into rules and "snapping to grid" the feature weights to discrete votes. In doing so, our proposal takes advantage of the large body of work in machine learning, but it produces an interpretable model, which can be directly edited by experts. We evaluate our approach on the BioNLP 2009 event extraction task. Our results show that there is a small performance penalty when converting the statistical model to rules, but the gain in interpretability compensates for that: with minimal effort, human experts improve this model to have similar performance to the statistical model that served as the starting point.
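
The "snapping to grid" step described in the abstract can be illustrated with a small sketch. The abstract does not give the actual discretization scheme, so everything below is an assumption made for illustration: the function names `snap_to_grid` and `score`, the `step` and `max_votes` parameters, the feature names, and the weights are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable


def snap_to_grid(weights: Dict[str, float],
                 step: float = 0.5,
                 max_votes: int = 3) -> Dict[str, int]:
    """Round real-valued feature weights to integer votes.

    Hypothetical scheme: divide each weight by `step`, round to the
    nearest integer, and clip to [-max_votes, max_votes]. Features
    whose vote is 0 are dropped, so they produce no rule.
    """
    votes: Dict[str, int] = {}
    for feature, weight in weights.items():
        vote = round(weight / step)
        vote = max(-max_votes, min(max_votes, vote))
        if vote != 0:
            votes[feature] = vote
    return votes


def score(votes: Dict[str, int], fired: Iterable[str]) -> int:
    """Sum the votes of the rules that fired on a candidate."""
    return sum(votes.get(feature, 0) for feature in fired)


# Made-up weights standing in for a trained linear model.
weights = {
    "trigger=phosphorylation": 1.37,
    "path=nsubj": 0.62,
    "token_distance>10": -0.81,
    "has_comma": 0.04,
}

votes = snap_to_grid(weights)
total = score(votes, ["trigger=phosphorylation", "token_distance>10"])
```

In this sketch each surviving feature reads as a rule with a small integer vote, which an expert could adjust or delete by hand; the candidate is accepted when the summed votes cross a threshold chosen by the user.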

Original language: English (US)
Title of host publication: BioNLP 2016 - Proceedings of the 15th Workshop on Biomedical Natural Language Processing
Editors: Kevin Bretonnel Cohen, Dina Demner-Fushman, Sophia Ananiadou, Jun-ichi Tsujii
Publisher: Association for Computational Linguistics (ACL)
Pages: 56-65
Number of pages: 10
ISBN (Electronic): 9781945626128
State: Published - 2016
Event: 15th Workshop on Biomedical Natural Language Processing, BioNLP 2016 - Berlin, Germany
Duration: Aug 12 2016 → …

Publication series

Name: BioNLP 2016 - Proceedings of the 15th Workshop on Biomedical Natural Language Processing

Conference

Conference: 15th Workshop on Biomedical Natural Language Processing, BioNLP 2016
Country/Territory: Germany
City: Berlin
Period: 8/12/16 → …

ASJC Scopus subject areas

  • Biomedical Engineering
  • Language and Linguistics
  • Information Systems
  • Software
  • Health Informatics
  • Computer Science Applications
