Abstract
In high dimensions, variable selection methods such as the lasso are often limited by excessive variability and rank deficiency of the sample covariance matrix. Covariance sparsity is a natural phenomenon in high-dimensional applications such as microarray analysis and image processing, in which a large number of predictors are independent or weakly correlated. In this paper, we propose the covariance-thresholded lasso, a new class of regression methods that can utilize covariance sparsity to improve variable selection. We establish theoretical results, under the random-design setting, that relate covariance sparsity to variable selection. Data and simulation studies indicate that our method can be useful for improving variable selection performance.
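To make the general idea concrete, here is a minimal sketch, not the authors' implementation, of how a covariance-thresholded lasso estimator might be organized: the sample covariance matrix is hard-thresholded to exploit covariance sparsity, and the lasso is then solved by coordinate descent on the thresholded Gram matrix. The function names, the threshold parameter `tau`, and the penalty parameter `lam` are illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator used in coordinate-descent lasso."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def covariance_thresholded_lasso(X, y, lam, tau, n_iter=200, tol=1e-8):
    """Hypothetical sketch: lasso coordinate descent run on a
    hard-thresholded sample covariance matrix.

    lam : lasso penalty level (assumed parameter)
    tau : covariance threshold; off-diagonal entries of the sample
          covariance with magnitude <= tau are set to zero (assumed parameter)
    """
    n, p = X.shape
    S = X.T @ X / n                      # sample covariance (X assumed centered)
    c = X.T @ y / n                      # sample covariance of predictors with response
    # Exploit covariance sparsity: zero out small off-diagonal entries
    S_t = np.where(np.abs(S) > tau, S, 0.0)
    np.fill_diagonal(S_t, np.diag(S))    # keep the variances on the diagonal

    beta = np.zeros(p)
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(p):
            # partial covariance of y with predictor j, excluding coordinate j
            r_j = c[j] - S_t[j] @ beta + S_t[j, j] * beta[j]
            beta[j] = soft_threshold(r_j, lam) / S_t[j, j]
        # Note: thresholding can make S_t indefinite, so convergence of this
        # naive iteration is not guaranteed; n_iter caps the work done.
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta

if __name__ == "__main__":
    # Toy "large p, small n" example with a few truly active predictors
    rng = np.random.default_rng(0)
    n, p = 50, 200
    X = rng.standard_normal((n, p))
    X -= X.mean(axis=0)
    beta_true = np.zeros(p)
    beta_true[:3] = [2.0, -1.5, 1.0]
    y = X @ beta_true + 0.5 * rng.standard_normal(n)
    beta_hat = covariance_thresholded_lasso(X, y, lam=0.1, tau=0.2)
    print("selected indices:", np.nonzero(beta_hat)[0])
```

The sketch only illustrates the interplay between covariance thresholding and lasso-type selection; the paper's actual estimator, tuning, and theoretical guarantees should be taken from the article itself.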
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 625-657 |
| Number of pages | 33 |
| Journal | Statistica Sinica |
| Volume | 21 |
| Issue number | 2 |
| DOIs | |
| State | Published - Apr 2011 |
| Externally published | Yes |
Keywords
- Consistency
- Covariance sparsity
- Large p small n
- Random design
- Regression
- Regularization
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty