Automatic model selection for partially linear models

Xiao Ni, Hao Helen Zhang, Daowen Zhang

Research output: Contribution to journal › Article › peer-review

41 Scopus citations

Abstract

We propose and study a unified procedure for variable selection in partially linear models. A new type of double-penalized least squares is formulated, using a smoothing spline to estimate the nonparametric part and applying a shrinkage penalty on the parametric components to achieve model parsimony. Theoretically we show that, with proper choices of the smoothing and regularization parameters, the proposed procedure can be as efficient as the oracle estimator [J. Fan, R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association 96 (2001) 1348-1360]. We also study the asymptotic properties of the estimator when the number of parametric effects diverges with the sample size. Frequentist and Bayesian estimates of the covariance matrices and confidence intervals are derived for the estimators. One great advantage of this procedure is its linear mixed model (LMM) representation, which greatly facilitates its implementation with standard statistical software. Furthermore, the LMM framework enables one to treat the smoothing parameter as a variance component and hence conveniently estimate it together with the other regression coefficients. Extensive numerical studies are conducted to demonstrate the effectiveness of the proposed procedure.
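As a rough sketch of the criterion described above (the notation here is assumed for illustration, not quoted from the paper), the double-penalized least squares objective for a partially linear model y_i = x_i'beta + f(t_i) + eps_i combines a smoothing-spline roughness penalty on f with a shrinkage penalty on the parametric coefficients:

    \min_{\boldsymbol{\beta},\, f} \;
      \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - \mathbf{x}_i^{\top} \boldsymbol{\beta} - f(t_i) \bigr)^2
      + \lambda_1 \int \bigl( f''(t) \bigr)^2 \, dt
      + \sum_{j=1}^{d} p_{\lambda_2}\!\bigl( \lvert \beta_j \rvert \bigr)

Here p_{\lambda_2} would be the smoothly clipped absolute deviation (SCAD) penalty named in the keywords, and in the LMM representation of a smoothing spline the parameter \lambda_1 corresponds to a ratio of variance components, which is what allows it to be estimated by standard REML-type machinery alongside the regression coefficients.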

Original language: English (US)
Pages (from-to): 2100-2111
Number of pages: 12
Journal: Journal of Multivariate Analysis
Volume: 100
Issue number: 9
DOIs
State: Published - Oct 2009
Externally published: Yes

Keywords

  • Semiparametric regression
  • Smoothing splines
  • Smoothly clipped absolute deviation
  • Variable selection

ASJC Scopus subject areas

  • Statistics and Probability
  • Numerical Analysis
  • Statistics, Probability and Uncertainty
