Term familiarity to indicate perceived and actual difficulty of text in medical digital libraries

Gondy Leroy, James E. Endicott

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

8 Scopus citations

Abstract

With increasing text digitization, digital libraries can personalize materials for individuals with different education levels and language skills. To this end, documents need meta-information describing their difficulty level. Previous attempts at such labeling used readability formulas but the formulas have not been validated with modern texts and their outcome is seldom associated with actual difficulty. We focus on medical texts and are developing new, evidence-based meta-tags that are associated with perceived and actual text difficulty. This work describes a first tag, 'term familiarity', which is based on term frequency in the Google corpus. We evaluated its feasibility to serve as a tag by looking at a document corpus (N=1,073) and found that terms in blogs or journal articles displayed unexpected but significantly different scores. Term familiarity was then applied to texts and results from a previous user study (N=86) and could better explain differences for perceived and actual difficulty.

Original language: English (US)
Title of host publication: Digital Libraries
Subtitle of host publication: For Cultural Heritage, Knowledge Dissemination, and Future Creation - 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011, Proceedings
Pages: 307-310
Number of pages: 4
State: Published - 2011
Externally published: Yes
Event: 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011 - Beijing, China
Duration: Oct 24 2011 to Oct 27 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7008 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011
Country/Territory: China
City: Beijing
Period: 10/24/11 to 10/27/11

Keywords

  • Actual Difficulty
  • Health Informatics
  • Lexical Tags
  • Meta Information
  • Natural Language Processing
  • Perceived Difficulty

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
