TY - GEN
T1 - SHAP-Prioritised Machine Learning for Diagnostic-Grade Prediction of Lung Function
AU - Pitts, Oliver
AU - Siddiqui, Salman
AU - Shah, Anand
AU - Bell, Alex
AU - Brightling, Christopher
AU - Singh, David
AU - Kocks, Janwillem
AU - Fabbri, Leonardo
AU - Papi, Alberto
AU - Rabe, Klaus
AU - Van Den Berge, Maarten
AU - Kraft, Monica
AU - Arcucci, Rossella
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Automated machine learning (ML) can streamline the characterisation and management of chronic airway conditions. With the advent of quantitative CT (qCT) imaging allowing precise extraction of structural features from scans, assessment of airway obstruction levels could be automated to complement traditional testing. This “feature known” approach has the added potential benefit of identifying key structure-function relationships through explainability measures. We therefore aimed to develop inverse models to estimate spirometry parameters from high-dimensional quantitative data using these structural metrics as constraints. With the ATLANTIS (NCT02123667) dataset, this paper experiments with a selection of ML methods, specifically k-nearest neighbours (kNN), random forest (RF) and support vector machine (SVM), to predict spirometry values (Forced Expiratory Volume in 1 second (FEV1), Forced Vital Capacity (FVC) and FEV1/FVC). The dynamic ratio FEV1/FVC was predicted better by all models than FEV1 or FVC. Results show effective counteraction to high-dimensionality through iterative feature refinement guided by SHapley Additive exPlanations (SHAP), and to limited training data through dynamic Gaussian noise (DGN). Diagnostic-grade prediction accuracy was achieved with DGN SHAP sequential feature selection (SFS)-kNN at 1.64% MRE with 37/76 features. A selection of typical variables, including expiratory tissue density and lung volume, vasculature and airway geometries, was seen to be important for prediction. This approach can therefore not only predict pulmonary function, but also extract useful structural information in a dynamic airway system through linking back to personalised abnormalities.
AB - Automated machine learning (ML) can streamline the characterisation and management of chronic airway conditions. With the advent of quantitative CT (qCT) imaging allowing precise extraction of structural features from scans, assessment of airway obstruction levels could be automated to complement traditional testing. This “feature known” approach has the added potential benefit of identifying key structure-function relationships through explainability measures. We therefore aimed to develop inverse models to estimate spirometry parameters from high-dimensional quantitative data using these structural metrics as constraints. With the ATLANTIS (NCT02123667) dataset, this paper experiments with a selection of ML methods, specifically k-nearest neighbours (kNN), random forest (RF) and support vector machine (SVM), to predict spirometry values (Forced Expiratory Volume in 1 second (FEV1), Forced Vital Capacity (FVC) and FEV1/FVC). The dynamic ratio FEV1/FVC was predicted better by all models than FEV1 or FVC. Results show effective counteraction to high-dimensionality through iterative feature refinement guided by SHapley Additive exPlanations (SHAP), and to limited training data through dynamic Gaussian noise (DGN). Diagnostic-grade prediction accuracy was achieved with DGN SHAP sequential feature selection (SFS)-kNN at 1.64% MRE with 37/76 features. A selection of typical variables, including expiratory tissue density and lung volume, vasculature and airway geometries, was seen to be important for prediction. This approach can therefore not only predict pulmonary function, but also extract useful structural information in a dynamic airway system through linking back to personalised abnormalities.
KW - Airway Diseases
KW - Machine Learning
KW - Quantitative CT
KW - SHAP
UR - https://www.scopus.com/pages/publications/105010828384
UR - https://www.scopus.com/pages/publications/105010828384#tab=citedBy
U2 - 10.1007/978-3-031-97567-7_7
DO - 10.1007/978-3-031-97567-7_7
M3 - Conference contribution
AN - SCOPUS:105010828384
SN - 9783031975660
T3 - Lecture Notes in Computer Science
SP - 73
EP - 85
BT - Computational Science – ICCS 2025 Workshops - 25th International Conference, 2025, Proceedings
A2 - Paszynski, Maciej
A2 - Barnard, Amanda S.
A2 - Zhang, Yongjie Jessica
PB - Springer Science and Business Media Deutschland GmbH
T2 - Workshops on Computational Science, which were co-organized with the 25th International Conference on Computational Science, ICCS 2025
Y2 - 7 July 2025 through 9 July 2025
ER -