Abstract
DNA copy number variation (CNV) has recently gained considerable interest as a source of genetic variation that likely influences phenotypic differences. Many statistical and computational methods have been proposed and applied to detect CNVs from data generated by genome analysis platforms. However, most algorithms are computationally intensive, with complexity of at least O(n²), where n is the number of probes in the experiment. Moreover, the theoretical properties of these existing methods are not well understood. A faster and better-characterized algorithm is desirable for ultra-high-throughput data. In this study, we propose the Screening and Ranking algorithm (SaRa), which detects CNVs quickly and accurately with complexity as low as O(n). In addition, we characterize the theoretical properties of our algorithm and present a numerical analysis.
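The abstract does not spell out the procedure, so the following is only a minimal sketch of how a screening-and-ranking style change-point detector can avoid quadratic cost. It is not the authors' implementation. It assumes a local mean-difference statistic computed from prefix sums; the function names `local_diagnostic` and `screen_and_rank`, the window half-width `h`, and the cutoff `threshold` are all hypothetical choices made for illustration.

```python
from itertools import accumulate

def local_diagnostic(y, h):
    """Assumed statistic: mean of the h points from i onward minus the mean
    of the h points before i. Prefix sums make the whole pass O(n)."""
    n = len(y)
    c = [0.0, *accumulate(float(v) for v in y)]  # c[k] = y[0] + ... + y[k-1]
    d = [0.0] * n                                # positions near the ends stay 0
    for i in range(h, n - h + 1):
        d[i] = ((c[i + h] - c[i]) - (c[i] - c[i - h])) / h
    return d

def screen_and_rank(y, h, threshold):
    """Screening: keep positions whose |statistic| is the maximum within
    +/- h positions and is at least the threshold.
    Ranking: order the kept positions by decreasing |statistic|.
    Note: this simple window maximum costs O(n*h); a sliding-window
    maximum would be needed to keep the screening step linear."""
    d = [abs(v) for v in local_diagnostic(y, h)]
    n = len(d)
    picks = [i for i in range(h, n - h + 1)
             if d[i] >= threshold and d[i] == max(d[max(0, i - h):i + h + 1])]
    return sorted(picks, key=lambda i: -d[i])

# Toy, noise-free signal with a single jump at index 50 (made-up data).
signal = [0.0] * 50 + [1.0] * 50
print(screen_and_rank(signal, h=10, threshold=0.5))  # → [50]
```

The point of the sketch is that each probe position is scored using only a fixed-size neighborhood, so the scoring cost grows linearly with the number of probes rather than with the number of probe pairs.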
Original language | English (US) |
---|---|
Pages (from-to) | 1306-1326 |
Number of pages | 21 |
Journal | Annals of Applied Statistics |
Volume | 6 |
Issue number | 3 |
State | Published - Sep 2012 |
Keywords
- Change-point detection
- Copy number variations
- High dimensional data
- Screening and ranking algorithm
ASJC Scopus subject areas
- Statistics and Probability
- Modeling and Simulation
- Statistics, Probability and Uncertainty