Measurement error correction in the least absolute shrinkage and selection operator model when validation data are available

Monica M. Vasquez, Chengcheng Hu, Denise J. Roe, Marilyn Halonen, Stefano Guerra

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Measurement of serum biomarkers by multiplex assays may be more variable than measurement by single-biomarker assays. Measurement error in these data may bias parameter estimates in regression analysis, which could mask true associations of serum biomarkers with an outcome. The Least Absolute Shrinkage and Selection Operator (LASSO) can be used for variable selection in these high-dimensional data. Furthermore, when the distribution of the measurement error is assumed to be known or is estimated with replication data, a simple measurement error correction method can be applied to the LASSO. In practice, however, the distribution of the measurement error is unknown, and estimating it through replication is expensive both in monetary cost and in the additional sample volume required, which is often limited. We adapt an existing bias correction approach by estimating the measurement error using validation data in which a subset of serum biomarkers is re-measured on a random subset of the study sample. We evaluate this method using simulated data and data from the Tucson Epidemiological Study of Airway Obstructive Disease (TESAOD). We show that the bias in parameter estimation is reduced and variable selection is improved.
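The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it (`estimate_error_variance`, `corrected_lasso`, the simulated data) is hypothetical. It assumes additive measurement error that is independent across biomarkers, validation re-measurements that can be treated as error-free reference values, and more observations than predictors so that the corrected Gram matrix stays positive definite. High-dimensional settings need extra safeguards that this sketch omits.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def estimate_error_variance(w_val, x_val):
    """Per-biomarker error variance from validation pairs.

    Assumption: x_val holds reference re-measurements treated as error-free.
    """
    return np.var(w_val - x_val, axis=0, ddof=1)

def corrected_lasso(w, y, sigma_u2, lam, n_iter=200):
    """Coordinate descent on 0.5*b'Gb - rho'b + lam*||b||_1.

    G = W'W/n - diag(sigma_u2) subtracts the estimated error variance from
    the Gram matrix. Passing sigma_u2 = 0 gives the uncorrected (naive) lasso.
    No intercept is fitted, so the data are assumed to be centered.
    """
    n, p = w.shape
    gram = w.T @ w / n - np.diag(sigma_u2)
    rho = w.T @ y / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Correlation with the partial residual that excludes coordinate j
            z = rho[j] - gram[j] @ beta + gram[j, j] * beta[j]
            beta[j] = soft_threshold(z, lam) / gram[j, j]
    return beta

# Simulated example (all values are illustrative assumptions)
rng = np.random.default_rng(0)
n, p, n_val = 2000, 5, 300
beta_true = np.array([1.0, 0.0, 0.5, 0.0, 0.0])
x = rng.normal(size=(n, p))                          # true biomarker values
w = x + rng.normal(scale=np.sqrt(0.5), size=(n, p))  # error-prone measurements
y = x @ beta_true + rng.normal(scale=0.5, size=n)

# Random validation subset on which the reference values are available
val = rng.choice(n, size=n_val, replace=False)
sigma_u2_hat = estimate_error_variance(w[val], x[val])

naive = corrected_lasso(w, y, np.zeros(p), lam=0.1)
corrected = corrected_lasso(w, y, sigma_u2_hat, lam=0.1)
print("estimated error variances:", np.round(sigma_u2_hat, 2))
print("naive lasso:    ", np.round(naive, 2))
print("corrected lasso:", np.round(corrected, 2))
```

In this simulated setting the naive coefficients are attenuated toward zero, and subtracting the estimated error variance moves the nonzero coefficients closer to their true values. Both fits still carry the usual lasso shrinkage from the penalty.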

Original language: English (US)
Pages (from-to): 670-680
Number of pages: 11
Journal: Statistical Methods in Medical Research
Volume: 28
Issue number: 3
DOIs
State: Published - Mar 1 2019

Keywords

  • LASSO
  • bias correction
  • biomarkers
  • high-dimensional
  • measurement error

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability
  • Health Information Management
