TY - JOUR
T1 - Large-scale automated machine reading discovers new cancer-driving mechanisms
AU - Valenzuela-Escárcega, Marco A.
AU - Babur, Özgün
AU - Hahn-Powell, Gus
AU - Bell, Dane
AU - Hicks, Thomas
AU - Noriega-Atala, Enrique
AU - Wang, Xia
AU - Surdeanu, Mihai
AU - Demir, Emek
AU - Morrison, Clayton T.
N1 - Funding Information:
We thank MITRE for defining and implementing the evaluation described in Section 3.1. We are especially grateful to Tonia Korves and Lynette Hirschman for making these results available before their publication and for the many clarification discussions. We also thank the anonymous reviewers for their insightful comments. This work was funded by the Defense Advanced Research Projects Agency (DARPA) Big Mechanism program [ARO W911NF-14-1-0395].
Publisher Copyright:
© The Author(s) 2018. Published by Oxford University Press.
PY - 2018/1/1
Y1 - 2018/1/1
N2 - PubMed, a repository and search engine for biomedical literature, now indexes >1 million articles each year. This exceeds the processing capacity of human domain experts, limiting our ability to truly understand many diseases. We present Reach, a system for automated, large-scale machine reading of biomedical papers that can extract mechanistic descriptions of biological processes with relatively high precision at high throughput. We demonstrate that combining the extracted pathway fragments with existing biological data analysis algorithms that rely on curated models helps identify and explain a large number of previously unidentified mutually exclusive altered signaling pathways in seven different cancer types. This work shows that combining human-curated 'big mechanisms' with extracted 'big data' can lead to a causal, predictive understanding of cellular processes and unlock important downstream applications.
UR - http://www.scopus.com/inward/record.url?scp=85054367285&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054367285&partnerID=8YFLogxK
U2 - 10.1093/database/bay098
DO - 10.1093/database/bay098
M3 - Article
C2 - 30256986
AN - SCOPUS:85054367285
SN - 1758-0463
VL - 2018
JO - Database : the journal of biological databases and curation
JF - Database : the journal of biological databases and curation
IS - 2018
ER -