## Abstract

Signaling and metabolic pathways, which consist of chains of reactions that produce target molecules from source compounds, are cornerstones of cellular biology. Properly modeling the reaction networks that represent such pathways requires directed hypergraphs, where each molecule or compound maps to a vertex, and each reaction maps to a hyperedge directed from its set of input reactants to its set of output products. Inferring the most likely series of reactions that produces a given set of targets from a given set of sources, where for each reaction its reactants are produced by prior reactions in the series, corresponds to finding a shortest hyperpath in a directed hypergraph, which is NP-complete. We give the first exact algorithm for general shortest hyperpaths that can find provably optimal solutions for large, real-world, reaction networks. In particular, we derive a novel graph-theoretic characterization of hyperpaths, which we leverage in a new integer linear programming formulation of shortest hyperpaths that for the first time handles cycles, and develop a cutting-plane algorithm that can solve this integer linear program to optimality in practice. Through comprehensive experiments over all of the thousands of instances from the standard Reactome and NCI-PID reaction databases, we demonstrate that our cutting-plane algorithm quickly finds an optimal hyperpath—inferring the most likely pathway—with a median running time of under 10 seconds, and a maximum time of less than 30 minutes, even on instances with thousands of reactions. We also explore for the first time how well hyperpaths infer true pathways, and show that shortest hyperpaths accurately recover known pathways, typically with very high precision and recall. Source code implementing our cutting-plane algorithm for shortest hyperpaths is available free for research use in a new tool called Mmunin.

Original language | English (US) |
---|---|

Pages (from-to) | 1198-1225 |

Number of pages | 28 |

Journal | Journal of Computational Biology |

Volume | 30 |

Issue number | 11 |

DOIs | |

State | Published - Nov 1 2023 |

Externally published | Yes |

## Keywords

- cell signaling networks
- combinatorial optimization
- cutting plane algorithms
- directed hypergraphs
- inferring pathways
- integer linear programming
- metabolic networks
- shortest hyperpaths
- systems biology

## ASJC Scopus subject areas

- Modeling and Simulation
- Molecular Biology
- Genetics
- Computational Mathematics
- Computational Theory and Mathematics