TY - GEN
T1 - Computing Shortest Hyperpaths for Pathway Inference in Cellular Reaction Networks
AU - Krieger, Spencer
AU - Kececioglu, John
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Signaling and metabolic pathways, which consist of a series of reactions producing target molecules from source compounds, are cornerstones of cellular biology. The cellular reaction networks containing such pathways can be precisely modeled by directed hypergraphs, where each reaction corresponds to a hyperedge, directed from its set of reactants to its set of products. Given such a network represented by a directed hypergraph, inferring the most likely set of reactions that produce a given target from a given set of sources corresponds to finding a shortest hyperpath, which is NP-complete. The best methods currently available for shortest hyperpaths either offer no guarantee of optimality, or exclude hyperpaths containing cycles even though cycles are abundant in real biological pathways. We derive a novel graph-theoretic characterization of hyperpaths, leveraged in a new formulation of the general shortest hyperpath problem as an integer linear program that for the first time handles hyperpaths containing cycles, and present a novel cutting-plane algorithm that can solve this integer program to optimality in practice. This represents a major advance over the best prior exact algorithm, which was limited to acyclic hyperpaths (and hence fails to find a solution for the many biological instances where all hyperpaths are in fact cyclic). In comprehensive experiments over thousands of instances from the standard NCI-PID and Reactome databases, we demonstrate that our cutting-plane algorithm quickly finds an optimal hyperpath, with a median running-time of under ten seconds and a maximum time of around thirty minutes, even on large instances with many thousands of reactions. Source code implementing our cutting-plane algorithm for shortest hyperpaths in a new tool called Mmunin is available free for research use at http://mmunin.cs.arizona.edu.
AB - Signaling and metabolic pathways, which consist of a series of reactions producing target molecules from source compounds, are cornerstones of cellular biology. The cellular reaction networks containing such pathways can be precisely modeled by directed hypergraphs, where each reaction corresponds to a hyperedge, directed from its set of reactants to its set of products. Given such a network represented by a directed hypergraph, inferring the most likely set of reactions that produce a given target from a given set of sources corresponds to finding a shortest hyperpath, which is NP-complete. The best methods currently available for shortest hyperpaths either offer no guarantee of optimality, or exclude hyperpaths containing cycles even though cycles are abundant in real biological pathways. We derive a novel graph-theoretic characterization of hyperpaths, leveraged in a new formulation of the general shortest hyperpath problem as an integer linear program that for the first time handles hyperpaths containing cycles, and present a novel cutting-plane algorithm that can solve this integer program to optimality in practice. This represents a major advance over the best prior exact algorithm, which was limited to acyclic hyperpaths (and hence fails to find a solution for the many biological instances where all hyperpaths are in fact cyclic). In comprehensive experiments over thousands of instances from the standard NCI-PID and Reactome databases, we demonstrate that our cutting-plane algorithm quickly finds an optimal hyperpath, with a median running-time of under ten seconds and a maximum time of around thirty minutes, even on large instances with many thousands of reactions. Source code implementing our cutting-plane algorithm for shortest hyperpaths in a new tool called Mmunin is available free for research use at http://mmunin.cs.arizona.edu.
UR - http://www.scopus.com/inward/record.url?scp=85152584696&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85152584696&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-29119-7_10
DO - 10.1007/978-3-031-29119-7_10
M3 - Conference contribution
AN - SCOPUS:85152584696
SN - 9783031291180
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 155
EP - 173
BT - Research in Computational Molecular Biology - 27th Annual International Conference, RECOMB 2023, Proceedings
A2 - Tang, Haixu
PB - Springer Science and Business Media Deutschland GmbH
T2 - 27th International Conference on Research in Computational Molecular Biology, RECOMB 2023
Y2 - 16 April 2023 through 19 April 2023
ER -