Abstract
Comparative biological studies have led to remarkable biomedical discoveries. While genomic science and technologies are advancing rapidly, precisely specifying a phenotype and comparing it to related phenotypes of other organisms remains challenging. This study examined the systematic use of terminology and knowledge-based technologies to enable high-throughput comparative phenomics. More specifically, we measured the accuracy of a multi-strategy automated classification method to bridge the phenotype gap between a phenotypic terminology (MGD: Phenoslim) and a broad-coverage clinical terminology (SNOMED CT). Furthermore, we qualitatively evaluated the additional emerging properties of the combined terminological network for comparative biology and discovery science. According to the gold standard (n = 100), the accuracies (precision / recall) of the composite automated methods were 67% / 97% (mapping for identical concepts) and 85% / 98% (classification). Quantitatively, only 2% of the phenotypic concepts were missing from the clinical terminology; qualitatively, however, the gap was larger: problems of conceptual scope, granularity, and subtle yet significant homonymy were observed. These results suggest that, as observed in other domains, additional strategies are required for combining terminologies.
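The precision / recall figures above compare automated mappings against a gold standard. As a minimal sketch of how such figures are typically computed (the function name `precision_recall` and the concept pairs are hypothetical illustrations, not the study's code or data):

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for predicted pairs against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted mappings that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold-standard mappings that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (phenotype term, clinical concept) pairs; NOT data from the study.
gold = {("p1", "c1"), ("p2", "c2"), ("p3", "c3")}
predicted = {("p1", "c1"), ("p2", "c2"), ("p3", "c9"), ("p4", "c4")}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the four predicted pairs are correct and two of the three gold pairs are recovered, which shows how a method can have high recall but lower precision, as reported for the identical-concept mapping.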
Original language | English (US) |
---|---|
Pages (from-to) | 202-213 |
Number of pages | 12 |
Journal | Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing |
State | Published - 2004 |
ASJC Scopus subject areas
- Biomedical Engineering
- Computational Theory and Mathematics