Abstract
The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data, containing terms from multiple sub-ontologies and at different levels in the ontologies, is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques, such as association rule mining, that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL), that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures, Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence), customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods in two applications: (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships.
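To make the idea of multi-level, cross-ontology rules concrete, the following is a minimal sketch in Python. It is not the MOAL algorithm and does not implement MOSupport or MOConfidence, whose definitions are not given in this abstract. It assumes the classic support and confidence measures, a hypothetical toy hierarchy (`PARENTS`), and made-up gene annotations (`ANNOTATIONS`). Each gene's annotation set is expanded with all ancestor terms, so rules can be evaluated at any level of abstraction.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical toy hierarchy: child term -> parent terms (is_a edges).
# The "MF:" and "BP:" prefixes stand in for two GO sub-ontologies.
PARENTS: Dict[str, Set[str]] = {
    "MF:kinase": {"MF:catalytic"},
    "MF:catalytic": {"MF:root"},
    "BP:phosphorylation": {"BP:metabolic"},
    "BP:metabolic": {"BP:root"},
    "BP:transport": {"BP:root"},
}

# Made-up annotations: gene -> directly annotated terms.
ANNOTATIONS: Dict[str, Set[str]] = {
    "gene1": {"MF:kinase", "BP:phosphorylation"},
    "gene2": {"MF:kinase", "BP:phosphorylation"},
    "gene3": {"MF:catalytic", "BP:metabolic"},
    "gene4": {"MF:kinase", "BP:transport"},
}


def with_ancestors(terms: Iterable[str]) -> FrozenSet[str]:
    """Return the given terms plus every ancestor reachable via PARENTS."""
    seen: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS.get(term, ()))
    return frozenset(seen)


# One "transaction" per gene, generalized to all levels of the hierarchy.
TRANSACTIONS: List[FrozenSet[str]] = [
    with_ancestors(terms) for terms in ANNOTATIONS.values()
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Classic support: fraction of transactions containing every item."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(
    antecedent: Iterable[str],
    consequent: Iterable[str],
    transactions: List[FrozenSet[str]],
) -> float:
    """Classic confidence of the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


# A specific-level rule and its more general counterpart.
specific = confidence({"MF:kinase"}, {"BP:phosphorylation"}, TRANSACTIONS)
general = confidence({"MF:catalytic"}, {"BP:metabolic"}, TRANSACTIONS)
```

In this toy data, the general rule `MF:catalytic -> BP:metabolic` is supported by more genes than the specific rule `MF:kinase -> BP:phosphorylation`, because ancestor expansion lets genes annotated at different depths contribute to the same higher-level rule. This is the kind of effect that level-aware measures need to account for.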
Original language | English (US) |
---|---|
Pages (from-to) | 849-856 |
Number of pages | 8 |
Journal | Journal of Biomedical Informatics |
Volume | 46 |
Issue number | 5 |
State | Published - Oct 2013 |
Keywords
- Association rule mining
- Data mining
- Gene ontology
- Gene ontology relationships
- Interestingness measures
- Interpro relationships
ASJC Scopus subject areas
- Health Informatics
- Computer Science Applications