TY - JOUR
T1 - Cox regression analysis with missing covariates via nonparametric multiple imputation
AU - Hsu, Chiu Hsieh
AU - Yu, Mandi
N1 - Funding Information:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Dr. Chiu-Hsieh Hsu’s research was partially supported by the National Cancer Institute grant P30 CA 023074.
Publisher Copyright:
© The Author(s) 2018.
PY - 2019/6/1
Y1 - 2019/6/1
AB - We consider estimating a Cox regression model in which some covariates are subject to missingness and additional information (including the observed event time, the censoring indicator and fully observed covariates) is available that may be predictive of the missing covariates. We propose to use two working regression models: one for predicting the missing covariates and the other for predicting the missingness probabilities. For each observation with a missing covariate, these two working models are used to define a nearest-neighbor imputing set. This set is then used to nonparametrically impute covariate values for that observation. Once imputation is complete, Cox regression is performed on the multiply imputed datasets to estimate the regression coefficients. In a simulation study, we compare the nonparametric multiple imputation approach with the augmented inverse probability weighted (AIPW) method, which directly incorporates the two working models into estimation of the Cox regression, and with the predictive mean matching (PMM) imputation method. We show that all approaches can reduce bias due to a non-ignorable missingness mechanism. The proposed nonparametric imputation method is robust to misspecification of either one of the two working models and to misspecification of their link functions. In contrast, the PMM method is sensitive to misspecification of the covariates included in imputation, and the AIPW method is sensitive to the selection probability. We apply the approaches to a breast cancer dataset from the Surveillance, Epidemiology and End Results (SEER) Program.
KW - Augmented inverse probability weighted method
KW - Cox regression
KW - missing covariates
KW - multiple imputation
KW - predictive mean matching
UR - http://www.scopus.com/inward/record.url?scp=85067348774&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85067348774&partnerID=8YFLogxK
U2 - 10.1177/0962280218772592
DO - 10.1177/0962280218772592
M3 - Article
C2 - 29717943
AN - SCOPUS:85067348774
SN - 0962-2802
VL - 28
SP - 1676
EP - 1688
JO - Statistical Methods in Medical Research
JF - Statistical Methods in Medical Research
IS - 6
ER -