Variable selection and model building via likelihood basis pursuit

Hao Helen Zhang, Grace Wahba, Yi Lin, Meta Voelker, Michael Ferris, Ronald Klein, Barbara Klein

Research output: Contribution to journalArticlepeer-review

44 Scopus citations


This article presents a nonparametric penalized likelihood approach for variable selection and model building, called likelihood basis pursuit (LBP). In the setting of a tensor product reproducing kernel Hilbert space, we decompose the log-likelihood into the sum of different functional components such as main effects and interactions, with each component represented by appropriate basis functions. Basis functions are chosen to be compatible with variable selection and model building in the context of a smoothing spline ANOVA model. Basis pursuit is applied to obtain the optimal decomposition in terms of having the smallest l 1 norm on the coefficients. We use the functional L 1 norm to measure the importance of each component and determine the "threshold" value by a sequential Monte Carlo bootstrap test algorithm. As a generalized LASSO-type method, LBP produces shrinkage estimates for the coefficients, which greatly facilitates the variable selection process and provides highly interpretable multivariate functional estimates at the same time. To choose the regularization parameters appearing in the LBP models, generalized approximate cross-validation (GACV) is derived as a tuning criterion. To make GACV widely applicable to large datasets, its randomized version is proposed as well. A technique "slice modeling" is used to solve the optimization problem and makes the computation more efficient. LBP has great potential for a wide range of research and application areas such as medical studies, and in this article we apply it to two large ongoing epidemiologic studies, the Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR) and the Beaver Dam Eye Study (BDES).

Original languageEnglish (US)
Pages (from-to)659-672
Number of pages14
JournalJournal of the American Statistical Association
Issue number467
StatePublished - Sep 2004


  • Generalized approximate cross-validation
  • Monte Carlo bootstrap test
  • Nonparametric variable selection
  • Slice modeling
  • Smoothing spline ANOVA

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty


Dive into the research topics of 'Variable selection and model building via likelihood basis pursuit'. Together they form a unique fingerprint.

Cite this