Abstract
We develop a latent variable model and an efficient spectral algorithm motivated by the recent emergence of very large data sets of chromatin marks from multiple human cell types. A natural model for chromatin data in one cell type is a Hidden Markov Model (HMM); we model the relationship between multiple cell types by connecting their hidden states by a fixed tree of known structure. The main challenge with learning parameters of such models is that iterative methods such as EM are very slow, while naive spectral methods result in time and space complexity exponential in the number of cell types. We exploit properties of the tree structure of the hidden states to provide spectral algorithms that are more computationally efficient for current biological datasets. We provide sample complexity bounds for our algorithm and evaluate it experimentally on biological data from nine human cell types. Finally, we show that beyond our specific model, some of our algorithmic ideas can be applied to other graphical models.
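As a rough illustration of the model class described above, the following sketch samples from a small tree-structured HMM and computes the size of the joint hidden state space that a naive approach would have to work with. It is not the paper's spectral algorithm. The dependency structure (each cell type's hidden state depends on its own previous state and on its parent's current state), the randomly drawn parameters, and all names (`sample_tree_hmm`, `joint_state_count`, `parent`, `n_states`, `n_obs`) are assumptions made for illustration only.

```python
import numpy as np

def sample_tree_hmm(parent, n_states, n_obs, length, seed=0):
    """Sample hidden states z and observations x from a toy tree-structured HMM.

    parent[c] is the index of node c's parent, or -1 for the root.
    Assumes nodes are ordered so that parent[c] < c (root first).
    All parameters are drawn at random; they are hypothetical, not learned.
    """
    rng = np.random.default_rng(seed)
    n_nodes = len(parent)
    # Initial distribution per node: shape (n_nodes, n_states).
    init = rng.dirichlet(np.ones(n_states), size=n_nodes)
    # Root transition P(z_t | z_{t-1}): shape (n_states, n_states).
    trans_root = rng.dirichlet(np.ones(n_states), size=n_states)
    # Child transition P(z_t | own z_{t-1}, parent's z_t):
    # shape (n_nodes, n_states, n_states, n_states).
    trans_child = rng.dirichlet(np.ones(n_states),
                                size=(n_nodes, n_states, n_states))
    # Emission P(x_t | z_t): shape (n_nodes, n_states, n_obs).
    emit = rng.dirichlet(np.ones(n_obs), size=(n_nodes, n_states))

    z = np.zeros((n_nodes, length), dtype=int)
    x = np.zeros((n_nodes, length), dtype=int)
    # Virtual "previous" states before the first position.
    prev = np.array([rng.choice(n_states, p=init[c]) for c in range(n_nodes)])
    for t in range(length):
        for c in range(n_nodes):  # parents are visited before children
            if parent[c] < 0:
                p = trans_root[prev[c]]
            else:
                p = trans_child[c, prev[c], z[parent[c], t]]
            z[c, t] = rng.choice(n_states, p=p)
            x[c, t] = rng.choice(n_obs, p=emit[c, z[c, t]])
        prev = z[:, t].copy()
    return z, x

def joint_state_count(n_states, n_nodes):
    """Size of the joint hidden state space if all nodes are merged into one HMM."""
    return n_states ** n_nodes
```

For example, `sample_tree_hmm([-1, 0, 0, 1], n_states=3, n_obs=4, length=10)` returns two integer arrays of shape `(4, 10)`. With a hypothetical six hidden states per cell type, `joint_state_count(6, 9)` shows how quickly the merged state space grows for nine cell types, which is the exponential blow-up the abstract attributes to naive spectral methods.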
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 469-477 |
Number of pages | 9 |
Journal | Advances in Neural Information Processing Systems |
Volume | 2015-January |
State | Published - 2015 |
Externally published | Yes |
Event | 29th Annual Conference on Neural Information Processing Systems, NIPS 2015 - Montreal, Canada. Duration: Dec 7 2015 → Dec 12 2015 |
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing