Large-scale evaluation of text features affecting perceived and actual text difficulty

Project: Research project

Grant Details


DESCRIPTION (provided by applicant): With more medical tests and treatments available, more patients being diagnosed with chronic diseases that require lifelong management, and increasing pressure on clinicians to see more patients in a limited amount of time, it is essential that patients learn and understand how best to take care of their health. Unfortunately, an estimated 89 million people have insufficient health literacy to do just that, and the associated costs are estimated to run into the billions of dollars each year. Although there are many exciting opportunities to educate consumers, ranging from visualization to virtual environments, text is still the most efficient and cost-effective medium available to all groups in society. However, existing writing guidelines, which rely heavily on readability formulas, have not been shown to affect text difficulty or consumer understanding.

Objectives: The objectives of this project are to address and overcome existing barriers in current readability research. We address these barriers as follows: 1) we will work with representative consumers, not experts evaluating on behalf of consumers; 2) we will use a large sample with thousands of participants working in their own settings, not a laboratory study with few participants in an artificial environment; 3) we will evaluate both perceived and actual difficulty of text, two variables often confounded in studies; and 4) we will discover features that can be automatically detected in text, so that difficulty checkers can be developed that allow clinicians to efficiently and effectively optimize their text without having to study guidelines or linguistics.

Design: The study will be conducted using modern technology and resources. Over a period of two years, master's- and doctoral-level students will design and conduct the studies together with the principal investigator. The study brings together insights from computational linguistics and information science applied to medical informatics. Starting from a linguistic perspective, we will systematically list good candidate text features that may influence understanding. We will look at features of grammar, semantics, and composition of text. By using a modern marketplace, Amazon's Mechanical Turk, we can involve thousands of participants in the study. We will use pairwise comparisons of the features, e.g., sentences with high versus low topic density. Using a within-subjects design, we will measure the perceived and actual difficulty of each feature with multiple-choice question-answering tasks.

PUBLIC HEALTH RELEVANCE: Current health information is not attuned to the reading skills of consumers, a situation that contributes to low health literacy and to higher healthcare costs because of costly mistakes and unwise decisions. The proposed project will work with thousands of representative consumers to discover text features that influence perceived and actual difficulty of text. These features will lead to better writing guidelines and automated tools that help clinicians write health education materials that are easier to understand and that are based on the demonstrated impact of the features on understanding.
Effective start/end date: 3/15/11 to 3/14/14


  • National Institutes of Health: $31,312.00
  • National Institutes of Health: $33,515.00
  • National Institutes of Health: $66,150.00


  • Medicine(all)
  • Health Professions(all)

