TY - JOUR
T1 - Data-driven stochastic optimization approaches to determine decision thresholds for risk estimation models
AU - CARE Consortium Investigators
AU - Garcia, Gian Gabriel P.
AU - Lavieri, Mariel S.
AU - Jiang, Ruiwei
AU - McCrea, Michael A.
AU - McAllister, Thomas W.
AU - Broglio, Steven P.
N1 - Publisher Copyright: © 2020, “IISE”.
PY - 2020/10/2
Y1 - 2020/10/2
AB - The increasing availability of data has popularized risk estimation models in many industries, especially healthcare. However, properly utilizing these models for accurate diagnosis decisions remains challenging. Our research aims to determine when a risk estimation model provides sufficient evidence to make a positive or negative diagnosis, or if the model is inconclusive. We formulate the Two-Threshold Problem (TTP) as a stochastic program which maximizes sensitivity and specificity while constraining false-positive and false-negative rates. We characterize the optimal solutions to TTP as either two-threshold or one-threshold and show that its optimal solution can be derived from a related linear program (TTP*). We also derive utility-based and multi-class classification frameworks for which our analytical results apply. We solve TTP* using data-driven methods: quantile estimation (TTP*-Q) and distributionally robust optimization (TTP*-DR). Through simulation, we characterize the feasibility, optimality, and computational burden of TTP*-Q and TTP*-DR and compare TTP*-Q to an optimized single threshold. Finally, we apply TTP* to concussion assessment data and find that it achieves greater accuracy at lower misclassification rates compared with traditional approaches. This data-driven framework can provide valuable decision support to clinicians by identifying “easy” cases which can be diagnosed immediately and “hard” cases which may require further evaluation before diagnosing.
KW - Data-driven optimization
KW - diagnosis decisions
KW - distributionally robust optimization
KW - quantile estimation
KW - risk estimation models
KW - stochastic programming
UR - http://www.scopus.com/inward/record.url?scp=85081407156&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081407156&partnerID=8YFLogxK
U2 - 10.1080/24725854.2020.1725254
DO - 10.1080/24725854.2020.1725254
M3 - Article
AN - SCOPUS:85081407156
SN - 2472-5854
VL - 52
SP - 1098
EP - 1121
JO - IISE Transactions
JF - IISE Transactions
IS - 10
ER -