TY - JOUR
T1 - Random survival forests using linked data to measure illness burden among individuals before or after a cancer diagnosis
T2 - Development and internal validation of the SEER-CAHPS illness burden index
AU - Lines, Lisa M.
AU - Cohen, Julia
AU - Kirschner, Justin
AU - Halpern, Michael T.
AU - Kent, Erin E.
AU - Mollica, Michelle A.
AU - Smith, Ashley Wilder
N1 - Publisher Copyright: © 2020 Elsevier B.V.
PY - 2021/1
Y1 - 2021/1
N2 - Purpose: To develop and internally validate an illness burden index among Medicare beneficiaries before or after a cancer diagnosis. Methods: Data source: SEER-CAHPS, linking Surveillance, Epidemiology, and End Results (SEER) cancer registry, Medicare enrollment and claims, and Medicare Consumer Assessment of Healthcare Providers and Systems (Medicare CAHPS) survey data providing self-reported sociodemographic, health, and functional status information. To generate a score for everyone in the dataset, we tabulated 4 groups within each annual subsample (2007–2013): 1) Medicare Advantage (MA) beneficiaries or 2) Medicare fee-for-service (FFS) beneficiaries, surveyed before cancer diagnosis; 3) MA beneficiaries or 4) Medicare FFS beneficiaries surveyed after diagnosis. Random survival forests (RSFs) predicted 12-month all-cause mortality and drew predictor variables (mean per subsample = 44) from 8 domains: sociodemographic, cancer-specific, health status, chronic conditions, healthcare utilization, activity limitations, proxy, and location-based factors. Roughly two-thirds of the sample was held out for algorithm training. Error rates based on the validation (“out-of-bag,” OOB) samples reflected the correctly classified percentage. Illness burden scores represented predicted cumulative mortality hazard. Results: The sample included 116,735 Medicare beneficiaries with cancer, of whom 73% were surveyed after their cancer diagnosis; overall mean mortality rate in the 12 months after survey response was 6%. SEER-CAHPS Illness Burden Index (SCIBI) scores were positively skewed (median range: 0.29 [MA, pre-diagnosis] to 2.85 [FFS, post-diagnosis]; mean range: 2.08 [MA, pre-diagnosis] to 4.88 [MA, post-diagnosis]). The highest decile of the distribution had a 51% mortality rate (range: 29–71%); the bottom decile had a 1% mortality rate (range: 0–2%). The error rate was 20% overall (range: 9% [among FFS enrollees surveyed after diagnosis] to 36% [MA enrollees surveyed before diagnosis]). Conclusions: This new morbidity measure for Medicare beneficiaries with cancer may be useful to future SEER-CAHPS users who wish to adjust for comorbidity.
KW - Cancer registry data
KW - Claims data
KW - Morbidity
KW - Mortality
KW - Random survival forests
KW - Survey data
UR - http://www.scopus.com/inward/record.url?scp=85096237424&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096237424&partnerID=8YFLogxK
U2 - 10.1016/j.ijmedinf.2020.104305
DO - 10.1016/j.ijmedinf.2020.104305
M3 - Article
C2 - 33188949
AN - SCOPUS:85096237424
SN - 1386-5056
VL - 145
JO - International Journal of Medical Informatics
JF - International Journal of Medical Informatics
M1 - 104305
ER -