Predicting Transition Words Between Sentence for English and Spanish Medical Text

David Kauchak, Gondy Leroy, Menglu Pei, Sonia Colina

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Transition words add important information and help readers comprehend text. Our goal is to automatically detect transition words in the medical domain. We introduce a new dataset for identifying transition words, categorized into 16 different types, that occur in adjacent sentence pairs in medical texts from English and Spanish Wikipedia (70K and 27K examples, respectively). We provide classification results using a feedforward neural network with word embedding features. Overall, we detect the need for a transition word with 78% accuracy in English and 84% in Spanish. For individual transition word categories, performance varies widely and is not related to either the number of training examples or the number of transition words in the category. The best accuracy in English was for Exemplification words (82%) and in Spanish for Contrast words (96%).

Original language: English (US)
Pages (from-to): 523-531
Number of pages: 9
Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
Volume: 2019
State: Published - 2019

ASJC Scopus subject areas

  • General Medicine
