Effects on text simplification: Evaluation of splitting up noun phrases

Gondy Leroy, David Kauchak, Alan Hogue

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

To help increase health literacy, we are developing a text simplification tool that creates more accessible patient education materials. Tool development is guided by a data-driven feature analysis comparing simple and difficult text. In the present study, we focus on the common advice to split long noun phrases. Our previous corpus analysis showed that easier texts contained shorter noun phrases. Subsequently, we conducted a user study to measure the difficulty of sentences containing noun phrases of different lengths (2-gram, 3-gram, and 4-gram); noun phrases of different conditions (split or not); and, to simulate unknown terms, pseudowords (present or not). We gathered 35 evaluations for 30 sentences in each condition (3 × 2 × 2 conditions) on Amazon's Mechanical Turk (N = 12,600). We conducted a 3-way analysis of variance for perceived and actual difficulty. Splitting noun phrases had a positive effect on perceived difficulty but a negative effect on actual difficulty. The presence of pseudowords increased perceived and actual difficulty. Without pseudowords, longer noun phrases led to increased perceived and actual difficulty. A follow-up study using the phrases (N = 1,350) showed that measuring awkwardness may indicate when to split noun phrases. We conclude that splitting noun phrases benefits perceived difficulty but hurts actual difficulty when the phrasing becomes less natural.

Original language: English (US)
Pages (from-to): 18-26
Number of pages: 9
Journal: Journal of Health Communication
Volume: 21
State: Published - Mar 28 2016

ASJC Scopus subject areas

  • Health (social science)
  • Communication
  • Public Health, Environmental and Occupational Health
  • Library and Information Sciences
