TY - GEN
T1 - A classifier to evaluate language specificity of medical documents
AU - Miller, Trudi
AU - Leroy, Gondy
AU - Chatterjee, Samir
AU - Jie, Fan
AU - Thoms, Brian
PY - 2007
Y1 - 2007
N2 - Consumer health information written by health care professionals is often inaccessible to the consumers it is written for. Traditional readability formulas examine syntactic features like sentence length and number of syllables, ignoring the target audience's grasp of the words themselves. The use of specialized vocabulary disrupts the understanding of patients with low reading skills, causing a decrease in comprehension. A naïve Bayes classifier for three levels of increasing medical terminology specificity (consumer/patient, novice health learner, medical professional) was created with a lexicon generated from a representative medical corpus. Ninety-six percent accuracy in classification was attained. The classifier was then applied to existing consumer health web pages. We found that only 4% of pages were classified at a layperson level, regardless of the Flesch reading ease scores, while the remaining pages were at the level of medical professionals. This indicates that consumer health web pages are not using appropriate language for their target audience.
UR - http://www.scopus.com/inward/record.url?scp=39749155416&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=39749155416&partnerID=8YFLogxK
U2 - 10.1109/HICSS.2007.6
DO - 10.1109/HICSS.2007.6
M3 - Conference contribution
AN - SCOPUS:39749155416
SN - 0769527558
SN - 9780769527550
T3 - Proceedings of the Annual Hawaii International Conference on System Sciences
BT - Proceedings of the 40th Annual Hawaii International Conference on System Sciences 2007, HICSS'07
T2 - 40th Annual Hawaii International Conference on System Sciences 2007, HICSS'07
Y2 - 3 January 2007 through 6 January 2007
ER -