The effect of word familiarity on actual and perceived text difficulty

Gondy Leroy, David Kauchak

Research output: Contribution to journal; Article; peer-review

38 Scopus citations


There is little evidence that readability formula outcomes relate to text understanding. One possible cause is their strong reliance on word and sentence length. We evaluated word familiarity rather than word length as a stand-in for word difficulty. Word familiarity represents how well known a word is and is estimated from word frequency in a large text corpus; in this work, the Google web corpus. We conducted a study with 239 people, who provided 50 evaluations for each of 275 words. Ours is the first study to focus on actual difficulty, measured with a multiple-choice task, in addition to perceived difficulty, measured with a Likert scale. Actual difficulty was correlated with word familiarity (r=0.219, p<0.001) but not with word length (r=-0.075, p=0.107). Perceived difficulty was correlated with both word familiarity (r=-0.397, p<0.001) and word length (r=0.254, p<0.001).
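The kind of comparison the abstract reports can be illustrated with a minimal sketch that correlates a frequency-based familiarity score and word length with a difficulty rating. Everything in it is an assumption made for illustration: the words, counts, and ratings are invented toy values (not data from the study), the log10 transform of the counts is an arbitrary choice, and the helper name `pearson_r` is hypothetical. It does not reproduce the authors' procedure or their results.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data, NOT from the study:
# (word, corpus frequency count, mean perceived difficulty on a 1-5 scale)
TOY_WORDS = [
    ("heart", 5_000_000, 1.2),
    ("fever", 1_200_000, 1.5),
    ("artery", 300_000, 2.4),
    ("edema", 40_000, 3.8),
    ("dyspnea", 8_000, 4.5),
]

# Familiarity proxy: log10 of the corpus count (an assumption of this sketch).
familiarity = [math.log10(count) for _, count, _ in TOY_WORDS]
length = [len(word) for word, _, _ in TOY_WORDS]
perceived = [rating for _, _, rating in TOY_WORDS]

r_familiarity = pearson_r(familiarity, perceived)
r_length = pearson_r(length, perceived)

print(f"familiarity vs perceived difficulty: r={r_familiarity:.3f}")
print(f"length vs perceived difficulty:      r={r_length:.3f}")
```

In the toy data the more frequent words were deliberately given lower difficulty ratings, so the familiarity correlation comes out negative by construction; that direction matches the sign reported above for perceived difficulty, but the magnitudes carry no meaning.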

Original language: English (US)
Pages (from-to): e169-e172
Journal: Journal of the American Medical Informatics Association
Issue number: E2
State: Published - 2014

ASJC Scopus subject areas

  • Health Informatics

