Abstract
Gene/protein interactions provide critical information for a thorough understanding of cellular processes. Recently, considerable interest and effort have been focused on the construction and analysis of genome-wide gene networks. The large body of biomedical literature is an important source of gene/protein interaction information. Recent advances in text mining tools have made it possible to automatically extract such documented interactions from free-text literature. In this paper, we propose a comprehensive framework for constructing and analyzing large-scale gene functional networks based on the gene/protein interactions extracted from biomedical literature repositories using text mining tools. Our proposed framework consists of analyses of the network topology, the relationship between network topology and gene function, and temporal network evolution to distill valuable information embedded in the gene functional interactions in the literature. We demonstrate the application of the proposed framework using a testbed of P53-related PubMed abstracts, which shows that the literature-based P53 networks exhibit small-world and scale-free properties. We also found that high-degree genes in the literature-based networks have a high probability of appearing in the manually curated database and that genes in the same pathway tend to form local clusters in our literature-based networks. Temporal analysis showed that genes interacting with many other genes tend to be involved in a large number of newly discovered interactions.
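
The topology analysis mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the interaction pairs are hypothetical toy data rather than results from the paper, and the function names (`build_graph`, `degrees`, `average_clustering`, `average_path_length`) are made up for illustration. The sketch computes the quantities usually inspected when checking for small-world behavior (high clustering, short average path length) and scale-free behavior (a heavy-tailed degree distribution).

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical toy interactions for illustration only; NOT data from the paper.
interactions = [
    ("TP53", "MDM2"), ("TP53", "CDKN1A"), ("TP53", "BAX"),
    ("TP53", "ATM"), ("MDM2", "CDKN1A"), ("ATM", "CHEK2"),
    ("CHEK2", "TP53"),
]


def build_graph(pairs):
    """Build an undirected adjacency map from (gene, gene) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-interactions
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degrees(graph):
    """Number of distinct interaction partners per gene."""
    return {node: len(nbrs) for node, nbrs in graph.items()}


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of this node's neighbors.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            cur = queue.popleft()
            for nxt in graph[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


graph = build_graph(interactions)
degree_by_gene = degrees(graph)
clustering = average_clustering(graph)
path_length = average_path_length(graph)
```

In practice, the clustering coefficient and path length would be compared against a random graph with the same number of nodes and edges, and the degree distribution would be examined on a log-log scale for power-law behavior; a toy network of this size is far too small to support either conclusion.
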
Original language | English (US) |
---|---|
Pages (from-to) | 453-464 |
Number of pages | 12 |
Journal | Journal of Biomedical Informatics |
Volume | 40 |
Issue number | 5 |
State | Published - Oct 2007 |
Keywords
- Gene functional network
- Network analysis
- Text mining
ASJC Scopus subject areas
- Computer Science Applications
- Health Informatics