Chinese General Practice ›› 2022, Vol. 25 ›› Issue (35): 4418-4424. DOI: 10.12114/j.issn.1007-9572.2022.0375

• Original Article · Methodological Research •

Development and Evaluation of Risk Assessment Models for Hearing Loss

LI Chao, YANG Yongzhong, WANG Hui, WANG Xuelin, MENG Rui, SI Zhikang, ZHENG Ziwei, CHEN Yuanyu, WU Jianhui*

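  1. School of Public Health, North China University of Science and Technology/Hebei Provincial Key Laboratory of Coal Mine Hygiene and Safety, Tangshan 063210, Hebei Province, China
  • Received: 2022-01-05 Revised: 2022-06-25 Published: 2022-12-15 Online: 2022-08-18
  • Corresponding author: WU Jianhui
  • LI C, YANG Y Z, WANG H, et al. Development and evaluation of risk assessment models for hearing loss[J]. Chinese General Practice, 2022, 25(35): 4418-4424, 4432. [www.chinagp.net]
    Author contributions: LI Chao was responsible for conducting the study, drafting the manuscript, collating and analyzing the data, and visualizing the model results; YANG Yongzhong, WANG Hui, WANG Xuelin, MENG Rui, SI Zhikang, ZHENG Ziwei, and CHEN Yuanyu participated in conducting the study, collating and analyzing the data, and revising the manuscript; WU Jianhui was responsible for revising the final version and takes responsibility for the paper.
  • Funding:
    Key Research and Development Project of the Ministry of Science and Technology of China (2016YFC0900605); Basic Scientific Research Funds Project for Higher Education Institutions of Hebei Province (JYG2019002)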

Development and Evaluation of Three Risk Assessment Models for Hearing Loss: a Comparative Study

LI Chao, YANG Yongzhong, WANG Hui, WANG Xuelin, MENG Rui, SI Zhikang, ZHENG Ziwei, CHEN Yuanyu, WU Jianhui*

  1. School of Public Health, North China University of Science and Technology/Hebei Provincial Key Laboratory of Coal Mine Hygiene and Safety, Tangshan 063210, China
  • Received: 2022-01-05 Revised: 2022-06-25 Published: 2022-12-15 Online: 2022-08-18
  • Contact: WU Jianhui
  • Cite this article:
    LI C, YANG Y Z, WANG H, et al. Development and evaluation of three risk assessment models for hearing loss: a comparative study[J]. Chinese General Practice, 2022, 25(35): 4418-4424, 4432.

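Abstract: Background Hearing loss has a high detection rate in occupational populations, and it can be effectively prevented through early monitoring. Research on risk assessment for this condition is still lacking. Objective To construct risk assessment models for hearing loss in oil workers and to evaluate their performance in order to identify the optimal model for this population. Methods In this cross-sectional study, 1,423 workers of an oil company who underwent occupational health examinations at the Jingxia Hospital of North China Petroleum Administration in 2018-2019 were enrolled. Their general data, audiological examination results, and laboratory test results were collected, and multivariable unconditional Logistic regression was used to explore the factors influencing hearing loss in oil workers. The input variables of the models were determined from a review of the relevant literature and expert opinion. Random forest, XGBoost, and BP neural network models were built in Python; the receiver operating characteristic (ROC) curve was used to evaluate the discriminative ability of the models, and the calibration curve was used to test their calibration. Results The detection rate of hearing loss differed significantly among oil workers by age, gender, monthly household income, history of diabetes, labor intensity, physical exercise, ototoxic chemical exposure, sleep disturbance, shift work, and high temperature exposure (P<0.05), and it increased with years of service and cumulative noise exposure (P<0.05). Age ≥50 years, diabetes, ototoxic chemical exposure, insomnia, shift work, ≥30 years of service, and cumulative noise exposure ≥90 dB(A)·year were risk factors for hearing loss in oil workers (P<0.05), whereas monthly household income ≥11,000 yuan and moderate labor intensity were protective factors (P<0.05). For the random forest, XGBoost, and BP neural network models, respectively, the accuracy in identifying hearing loss in oil workers was 95.99%, 95.22%, and 88.62%; the sensitivity was 91.43%, 89.09%, and 70.13%; the specificity was 97.69%, 97.50%, and 95.47%; the Youden index was 0.89, 0.87, and 0.66; the F1 score was 0.74, 0.73, and 0.73; and the area under the ROC curve (AUC) was 0.95, 0.93, and 0.83. The Brier scores were 0.04, 0.04, and 0.11, the observed-to-expected ratios were 1.02, 1.04, and 1.21, and the intercepts of the calibration curves were 0.029, 0.032, and 0.097, respectively. The random forest model was the best calibrated. Conclusion The random forest model outperformed the XGBoost and BP neural network models and can assess the risk of hearing loss in oil workers with reasonable accuracy.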

Keywords: Hearing loss, Occupational diseases, Random forest, XGBoost, BP neural network, Oil workers, Analysis of influencing factors

Abstract:

Background

Hearing loss is highly prevalent in occupational populations, but it can be effectively prevented through early monitoring. Studies on risk assessment for hearing loss are still lacking.

Objective

To construct three risk assessment models for hearing loss in oil workers, and evaluate their performance to obtain the optimal one.

Methods

A cross-sectional study was conducted. Participants were 1,423 workers of an oil company who underwent occupational health examinations at the Jingxia Hospital of North China Petroleum Administration from 2018 to 2019. Their general demographic data, audiometric test results, and laboratory test results were collected. Unconditional multivariable Logistic regression was used to explore the factors influencing hearing loss. The input variables of the models were determined from a literature review and expert opinion, and random forest, XGBoost, and BP neural network models were built in Python. The discriminative ability of the models was evaluated using the receiver operating characteristic (ROC) curve, and their calibration was tested using the calibration curve.
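As a rough illustration of the evaluation step described above, the following minimal sketch computes a discrimination measure (area under the ROC curve) and two calibration measures (Brier score and observed-to-expected ratio) from predicted probabilities. It is not the authors' code: the function names and the toy data are hypothetical, and the definitions used are the common textbook ones, which may differ in detail from those applied in the study.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance that a random case scores above a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def oe_ratio(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Observed number of events divided by the expected number."""
    return sum(y_true) / sum(y_prob)


# Hypothetical toy data (1 = hearing loss), not taken from the study.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

print(f"AUC   = {roc_auc(y_true, y_prob):.3f}")
print(f"Brier = {brier_score(y_true, y_prob):.3f}")
print(f"O/E   = {oe_ratio(y_true, y_prob):.3f}")
```

Under these definitions, an AUC closer to 1 indicates better discrimination, a Brier score closer to 0 indicates smaller prediction error, and an O/E ratio closer to 1 indicates that the predicted number of cases matches the observed number.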

Results

The prevalence of hearing loss differed significantly by age, gender, monthly household income, history of diabetes, labor intensity, physical exercise, ototoxic chemical exposure, sleep disturbance, shift work, and high temperature exposure (P<0.05), and it rose with increasing years of service and cumulative noise exposure (P<0.05). Unconditional multivariable Logistic regression showed that age ≥50 years, diabetes, ototoxic chemical exposure, insomnia, shift work, ≥30 years of service, and cumulative noise exposure ≥90 dB(A)·year were risk factors for hearing loss in oil workers (P<0.05), whereas monthly household income ≥11,000 yuan and moderate labor intensity were protective factors (P<0.05). In assessing hearing loss risk in oil workers, the random forest model had an area under the ROC curve (AUC) of 0.95, with 95.99% accuracy, 91.43% sensitivity, 97.69% specificity, a Youden index of 0.89, and an F1 score of 0.74. The XGBoost model had an AUC of 0.93, with 95.22% accuracy, 89.09% sensitivity, 97.50% specificity, a Youden index of 0.87, and an F1 score of 0.73. The BP neural network model had an AUC of 0.83, with 88.62% accuracy, 70.13% sensitivity, 95.47% specificity, a Youden index of 0.66, and an F1 score of 0.73. For calibration, the random forest model had a Brier score of 0.04, an observed-to-expected (O/E) ratio of 1.02, and a calibration-in-the-large of 0.029; the corresponding values were 0.04, 1.04, and 0.032 for the XGBoost model and 0.11, 1.21, and 0.097 for the BP neural network model. The random forest model was the best calibrated.
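The Youden index values reported above can be checked from the reported sensitivity and specificity, using the standard definition J = sensitivity + specificity - 1. The short sketch below does that arithmetic; the function and variable names are illustrative only and are not taken from the paper.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the Results, as proportions.
reported = {
    "random forest": (0.9143, 0.9769),
    "XGBoost": (0.8909, 0.9750),
    "BP neural network": (0.7013, 0.9547),
}

# Round to two decimals, the precision used in the abstract.
youden = {
    model: round(youden_index(se, sp), 2)
    for model, (se, sp) in reported.items()
}

for model, j in youden.items():
    print(f"{model}: J = {j:.2f}")
```

The rounded results (0.89, 0.87, and 0.66) agree with the Youden indices given in the abstract.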

Conclusion

The random forest model outperformed the XGBoost and BP neural network models and can assess the risk of hearing loss in oil workers with reasonable accuracy.

Key words: Hearing loss, Occupational diseases, Random forest, XGBoost, Back propagation neural network, Oil workers, Root cause analysis