Explainable Machine Learning Based on Ultrasound Radiomics and Clinical Features for Predicting Recurrence in Papillary Thyroid Carcinoma
Abstract:
Objective To develop a machine learning model for predicting recurrence risk in papillary thyroid carcinoma (PTC) based on preoperative ultrasound radiomics features combined with key clinicopathological parameters.
Methods A total of 152 patients with PTC who were diagnosed at Guangdong Provincial People's Hospital between 2017 and 2019, underwent curative surgery, and completed 5-year postoperative ultrasound follow-up were retrospectively included, comprising 34 patients with recurrence and 118 without recurrence. Preoperative clinicopathological data and ultrasound images were collected, and ultrasound radiomics features were extracted and fused with key clinicopathological variables. Using stratified random sampling, patients were divided into a training set (n=106) and a testing set (n=46) at a ratio of 7:3. Four models were trained: L2-regularized logistic regression, random forest (RF), support vector machine with a radial basis function kernel (SVM-RBF), and gradient boosting decision tree (GBDT). Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and other metrics, and SHapley Additive exPlanations (SHAP) was applied to interpret the best-performing model.
Results The L2-regularized logistic regression model achieved the best performance on the testing set, with an AUC of 0.89, a sensitivity of 0.80, and a specificity of 0.81. SHAP analysis identified wavelet-L_firstorder_Kurtosis, central lymph node metastasis ratio (CLNR), surgical procedure, BRAF V600E mutation status, and maximum tumor diameter as the key predictors of PTC recurrence.
Conclusion The L2-regularized logistic regression model demonstrated excellent predictive performance (testing-set AUC 0.89) and good clinical interpretability, suggesting strong potential for wider application in recurrence risk stratification of patients with PTC.
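The workflow summarized in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic placeholders with the same class counts (34 of 152 positive), the feature count and all hyperparameters are hypothetical, and scikit-learn is assumed as the toolkit. In place of the SHAP library, the sketch uses the closed form that SHAP values take for a linear model under a feature-independence assumption, coefficient times deviation from the training mean, expressed on the log-odds scale.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data (NOT the study data): 152 cases, 34 positive,
# 10 hypothetical features, the first three shifted in the positive class.
rng = np.random.default_rng(0)
y = np.array([1] * 34 + [0] * 118)
X = rng.normal(size=(152, 10))
X[y == 1, :3] += 1.0

# Stratified 7:3 split; with 152 cases this yields 106 training and 46 testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hyperparameters are illustrative defaults, not the values used in the study.
models = {
    # LogisticRegression applies an L2 penalty by default.
    "logreg_l2": make_pipeline(StandardScaler(), LogisticRegression(C=1.0, max_iter=1000)),
    "rf": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "gbdt": GradientBoostingClassifier(random_state=0),
}


def risk_scores(model, X):
    """Continuous scores for ROC analysis (margin if available, else probability)."""
    if hasattr(model, "decision_function"):
        return model.decision_function(X)
    return model.predict_proba(X)[:, 1]


results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    tn, fp, fn, tp = confusion_matrix(
        y_test, model.predict(X_test), labels=[0, 1]
    ).ravel()
    results[name] = {
        "auc": roc_auc_score(y_test, risk_scores(model, X_test)),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# SHAP values for a linear model, assuming independent features:
# phi_j(x) = coef_j * (x_j - mean_j), on the log-odds scale.
scaler, clf = models["logreg_l2"][0], models["logreg_l2"][-1]
Z_train, Z_test = scaler.transform(X_train), scaler.transform(X_test)
shap_values = clf.coef_[0] * (Z_test - Z_train.mean(axis=0))
base_value = clf.decision_function(Z_train).mean()

# Rank features by mean absolute SHAP value (global importance).
ranking = np.argsort(np.abs(shap_values).mean(axis=0))[::-1]

for name, metrics in results.items():
    print(name, {k: round(float(v), 2) for k, v in metrics.items()})
print("feature ranking (column indices):", ranking.tolist())
```

Because the per-feature contributions plus the base value sum exactly to the model's log-odds output for each patient, the explanation is additive by construction. The printed metrics describe the synthetic data only and say nothing about the values reported in the Results.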