Transition words carry important information and improve readers' comprehension of text. Our goal is to automatically detect transition words in the medical domain. We introduce a new dataset for identifying transition words, categorized into 16 types, that occur in adjacent sentence pairs in medical texts from English and Spanish Wikipedia (70K and 27K examples, respectively). We provide classification results using a feedforward neural network with word embedding features. Overall, we detect the need for a transition word with 78% accuracy in English and 84% in Spanish. For individual transition word categories, performance varies widely and is related to neither the number of training examples nor the number of transition words in the category. The best accuracy in English was for Exemplification words (82%) and in Spanish for Contrast words (96%).
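The classification setup described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the network architecture, the embeddings, or how the features are built, so everything below is an assumption made for illustration: random vectors stand in for pretrained word embeddings, each sentence is represented by the average of its word vectors, the two sentence vectors are concatenated, and an untrained one-hidden-layer network outputs the probability that a transition word is needed. All function names and sizes are hypothetical.

```python
import math
import random

EMBED_DIM = 8    # hypothetical embedding size, not taken from the paper
HIDDEN_DIM = 4   # hypothetical hidden-layer size, not taken from the paper


def make_embeddings(vocab, dim=EMBED_DIM, seed=0):
    """Give each word a random vector (a stand-in for pretrained embeddings)."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in vocab}


def sentence_vector(sentence, embeddings, dim=EMBED_DIM):
    """Average the embeddings of known words; zero vector if none are known."""
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def pair_features(first, second, embeddings):
    """Concatenate the vectors of two adjacent sentences (assumed feature design)."""
    return sentence_vector(first, embeddings) + sentence_vector(second, embeddings)


def init_network(input_dim, hidden_dim=HIDDEN_DIM, seed=1):
    """Create small random weights for a one-hidden-layer feedforward network."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.gauss(0.0, 0.1) for _ in range(input_dim)] for _ in range(hidden_dim)],
        "b1": [0.0] * hidden_dim,
        "W2": [rng.gauss(0.0, 0.1) for _ in range(hidden_dim)],
        "b2": 0.0,
    }


def predict_proba(features, net):
    """Forward pass: ReLU hidden layer, then a sigmoid output unit."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(net["W1"], net["b1"])
    ]
    logit = sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"]
    return 1.0 / (1.0 + math.exp(-logit))


# Usage: score one adjacent sentence pair with the untrained network.
vocab = ["aspirin", "reduces", "fever", "it", "can", "irritate", "the", "stomach"]
emb = make_embeddings(vocab)
x = pair_features("Aspirin reduces fever", "It can irritate the stomach", emb)
net = init_network(len(x))
p = predict_proba(x, net)
print(f"P(transition word needed) = {p:.3f}")
```

Because the weights are random and untrained, the printed probability carries no meaning; the sketch only shows how a sentence pair could flow from word embeddings to a binary decision. Predicting one of the 16 categories would replace the sigmoid unit with a softmax over categories.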
|Original language||English (US)|
|Number of pages||9|
|Journal||AMIA ... Annual Symposium proceedings. AMIA Symposium|
|State||Published - 2019|