Abstract
Text continues to be an important medium for communicating health-related information. We have built a text simplification tool that gives concrete suggestions on how to simplify health and medical texts. An important component of the tool identifies difficult words and suggests simpler synonyms based on pre-existing resources (WordNet and UMLS). These candidate substitutions are not appropriate in every context. In this paper, we introduce a filtering algorithm that uses semantic similarity based on word embeddings to determine whether a candidate substitution is appropriate in the context of the text. We provide an analysis of our approach on a new dataset of 788 labeled substitution examples. The filtering algorithm is particularly helpful at removing obvious examples and can improve the precision by 3% at a recall level of 95%.
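The abstract does not specify how the similarity between a candidate and its context is computed, so the following is only a minimal sketch of one plausible form of embedding-based filtering. It assumes the context is represented by the average of its word vectors and that a candidate is kept when its cosine similarity to that average meets a threshold. The names `TOY_EMBEDDINGS` and `filter_candidates`, the vector values, and the threshold of 0.8 are all hypothetical and are not taken from the paper.

```python
import math

# Toy 3-dimensional vectors invented for illustration only; a real system
# would load pretrained word embeddings instead.
TOY_EMBEDDINGS = {
    "patient": [0.9, 0.1, 0.0],
    "doctor": [0.8, 0.2, 0.1],
    "cure": [0.8, 0.1, 0.1],
    "handle": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def context_vector(words, embeddings):
    """Average the vectors of the context words that have an embedding."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def filter_candidates(candidates, context_words, embeddings, threshold=0.8):
    """Keep candidates whose similarity to the context meets the threshold.

    Assumption: candidates without an embedding are kept, since there is
    no evidence on which to reject them.
    """
    ctx = context_vector(context_words, embeddings)
    if ctx is None:
        return list(candidates)
    kept = []
    for candidate in candidates:
        vec = embeddings.get(candidate)
        if vec is None or cosine(vec, ctx) >= threshold:
            kept.append(candidate)
    return kept


suggestions = filter_candidates(
    ["cure", "handle"], ["patient", "doctor"], TOY_EMBEDDINGS
)
print(suggestions)
```

In this toy setup the threshold plays the role of the precision/recall trade-off mentioned in the abstract: lowering it keeps more candidates (higher recall), while raising it removes more of them (potentially higher precision).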
Original language | English (US) |
---|---|
Pages (from-to) | 284-292 |
Number of pages | 9 |
Journal | AMIA ... Annual Symposium proceedings. AMIA Symposium |
Volume | 2022 |
State | Published - 2022 |
ASJC Scopus subject areas
- General Medicine