

Machine learning and Cox proportional hazards regression model for warning of persistent infection with high-risk HPV type

DING Hongmei, ZHANG Mingya, XU Xiaoqin, ZHANG Hongxiu

Citation: DING Hongmei, ZHANG Mingya, XU Xiaoqin, ZHANG Hongxiu. Machine learning and Cox proportional hazards regression model for warning of persistent infection with high-risk HPV type[J]. CHINESE JOURNAL OF DISEASE CONTROL & PREVENTION, 2024, 28(9): 1083-1089. doi: 10.16462/j.cnki.zhjbkz.2024.09.014


doi: 10.16462/j.cnki.zhjbkz.2024.09.014
Funding: Scientific Research Project of Jiangsu Maternal and Child Health Care Association (FYX202345)

Corresponding author: ZHANG Hongxiu, E-mail: hongxiuz@njmu.edu.cn

Chinese Library Classification (CLC) number: R737.3

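Abstract:

Objective  To build machine learning-based prediction models for human papillomavirus (HPV) follow-up outcomes, identify factors associated with persistent high-risk human papillomavirus (HR-HPV) infection, and support early warning for women at risk of persistent HR-HPV infection.

Methods  Clinical data were collected from 4,407 participants who underwent HPV testing at four health institutions in Taizhou City between September 2017 and September 2019 and who completed the required HPV follow-up between September 2020 and September 2022. The demographic characteristics of the 4,407 cohort participants were used as model inputs, and the change between the two HPV test results was used as the output, to build machine learning prediction models, namely a random forest (RF) and a multilayer perceptron (MLP), that predict the HPV follow-up result. Univariate and multivariate Cox proportional hazards regression models were fitted to the 583 participants who were HR-HPV positive at initial screening, to describe the characteristics of persistent HR-HPV infection and the factors influencing its outcome.

Results  The RF model reached an accuracy of 84.3% and the MLP an accuracy of 80.5%. The five categories with the highest HR-HPV persistent positivity rates were HPV58, multiple infection, HPV31, HPV33 and HPV52. In the multivariate Cox proportional hazards regression model, the hazard of HR-HPV turning negative (clearance) among women with junior high school education or below was 1.72 times that among women with high school education or above (HR=1.72, 95% CI: 1.03-2.87, P=0.037), and the hazard of HR-HPV turning negative among premenopausal women was 2.11 times that among postmenopausal women (HR=2.11, 95% CI: 1.10-4.06, P=0.025).

Conclusions  Machine learning and Cox proportional hazards regression models can provide early warning of persistent HR-HPV infection and have clinical value for the follow-up management of HR-HPV-infected women and for cervical cancer prevention and control.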
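The modelling step described in the Methods can be sketched as follows. This is only an illustration on synthetic data, assuming scikit-learn is available: the feature names, the binary "persistent" label, the hyperparameters and the 80/20 split are all hypothetical and are not taken from the study, so the printed accuracies have no relation to the reported 84.3% and 80.5%.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 1000  # synthetic cohort size (assumption, not the study's 4,407)

# Hypothetical demographic inputs, generated at random for illustration.
age = rng.normal(50, 6, n)
menopause = (age + rng.normal(0, 4, n) > 50).astype(int)
low_education = rng.integers(0, 2, n)
multiple_infection = rng.integers(0, 2, n)
X = np.column_stack([age, menopause, low_education, multiple_infection])

# Hypothetical binary outcome: 1 = infection persists at follow-up.
logit = 0.08 * (age - 50) + 0.8 * multiple_infection - 0.5
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    # The MLP is sensitive to feature scale, so inputs are standardized first.
    "MLP": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name} held-out accuracy: {accuracy[name]:.3f}")
```

Accuracy on a held-out split is used here because it is the metric the abstract reports; with an imbalanced outcome, sensitivity and specificity would usually be inspected as well.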
Figure 1. Random forest and multilayer perceptron training curves

Figure 2. Distribution of type-specific HR-HPV persistent positivity rates by genotype

HR-HPV: high-risk human papillomavirus; HPV: human papillomavirus.

Table 1. Univariate Cox regression analysis of the type-specific persistent HR-HPV positive group and the HR-HPV turn-negative group

| Feature | Type-specific persistent HR-HPV positive group (n=202) | HR-HPV turn-negative group (n=381) | Wald value | P value |
|---|---|---|---|---|
| Age/years | 52(48, 56) | 49(44, 52) | 9.153 | 0.002 |
| Primary screening TCT | | | 0.726 | 0.394 |
|   Negative | 126(32.2) | 265(67.8) | | |
|   Positive | 31(38.3) | 50(61.7) | | |
|   Not tested | 45(40.5) | 66(59.5) | | |
| Primary screening pathology | | | 0.713 | 0.136 |
|   Low-grade intraepithelial neoplasia | 31(40.8) | 45(59.2) | | |
|   High-grade intraepithelial neoplasia | 10(27.8) | 26(72.2) | | |
|   Cervical carcinoma | 0(0) | 3(100.0) | | |
|   Negative | 19(46.3) | 22(53.7) | | |
|   Not tested | 142(33.3) | 285(66.7) | | |
| Occupation | | | 6.636 | 0.036 |
|   Primary industry | 84(38.5) | 134(61.5) | | |
|   Secondary industry | 6(37.5) | 10(62.5) | | |
|   Tertiary industry | 57(26.3) | 160(73.7) | | |
|   Unknown | 55(41.7) | 77(58.3) | | |
| Educational level | | | 0.154 | 0.013 |
|   Junior high school and below | 175(34.6) | 331(65.4) | | |
|   High school and above | 27(35.1) | 50(64.9) | | |
| BMI/(kg·m⁻²) | | | 0.003 | 0.958 |
|   Low body weight | 3(42.9) | 4(57.1) | | |
|   Normal weight | 43(30.7) | 97(69.3) | | |
|   Overweight and obesity | 48(38.4) | 77(61.6) | | |
|   Unknown | 108(34.7) | 203(65.3) | | |
| Age at menarche/years | 16(15, 17) | 15(14, 17) | 0.009 | 0.925 |
| Menopause | | | 10.462 | 0.001 |
|   Yes | 156(40.5) | 229(59.5) | | |
|   No | 46(23.2) | 152(76.8) | | |
| Number of pregnancies | | | 5.202 | 0.074 |
|   0~1 | 57(39.6) | 87(60.4) | | |
|   2 | 62(26.8) | 169(73.2) | | |
|   ≥3 | 71(41.8) | 99(58.2) | | |
|   Unknown | 12(31.6) | 26(68.4) | | |
| Number of deliveries | | | 0.260 | 0.610 |
|   0~1 | 159(33.9) | 310(66.1) | | |
|   2 | 27(38.6) | 43(61.4) | | |
|   ≥3 | 4(66.7) | 2(33.3) | | |
|   Unknown | 12(31.6) | 26(68.4) | | |
| Contraception | | | 0.483 | 0.028 |
|   Yes | 44(29.1) | 107(70.9) | | |
|   No | 158(36.6) | 274(63.4) | | |
| Age at first birth/years | 24(22, 25) | 24(22, 26) | 5.945 | 0.015 |
| Vaccination | | | 0.001 | 0.978 |
|   Yes | 3(75.0) | 1(25.0) | | |
|   No | 199(34.4) | 380(65.6) | | |
| Infection form | | | 5.801 | 0.016 |
|   Multiple infection | 67(48.6) | 71(51.4) | | |
|   Single infection | 135(30.3) | 310(69.7) | | |
| Treatment | | | 0.720 | 0.396 |
|   Yes | 13(31.0) | 29(69.0) | | |
|   No | 189(34.9) | 352(65.1) | | |
| Follow-up time/months | 34.5(33.6, 34.9) | 34.6(34.0, 36.6) | | |

Note: TCT, ThinPrep cytologic test; HR-HPV, high-risk human papillomavirus; HPV, human papillomavirus. Group values are given as number of participants (proportion/%) or M(P25, P75).
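Two conventions help when reading Table 1: the percentages are computed within each row (the two groups sum to 100%), and a Wald statistic maps to a P value through the chi-square distribution. The sketch below recomputes both for a few rows. It assumes one degree of freedom, which is only appropriate for continuous or two-level features, and the helper names are hypothetical.

```python
import math
from typing import Tuple


def row_percentages(persistent: int, turned_negative: int) -> Tuple[float, float]:
    """Share of each group within one table row, in percent (one decimal)."""
    total = persistent + turned_negative
    return (
        round(100 * persistent / total, 1),
        round(100 * turned_negative / total, 1),
    )


def wald_p_value(wald: float) -> float:
    """Two-sided P value of a Wald chi-square statistic, assuming 1 df."""
    # For 1 df, P(chi2 > w) = erfc(sqrt(w / 2)).
    return math.erfc(math.sqrt(wald / 2))


# Menopause rows of Table 1: counts in the persistent and turn-negative groups.
menopause_yes = row_percentages(156, 229)
menopause_no = row_percentages(46, 152)

# Wald statistics reported for menopause and age.
p_menopause = wald_p_value(10.462)
p_age = wald_p_value(9.153)

print(menopause_yes, menopause_no)
print(round(p_menopause, 3), round(p_age, 3))
```

Features with three or more levels would need the matching degrees of freedom, which the table does not state, so they are left out of this check.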

Table 2. Multivariate Cox regression analysis of the type-specific persistent HR-HPV positive group and the HR-HPV turn-negative group

| Variable | β value | SE | Wald value | P value | HR (95% CI) |
|---|---|---|---|---|---|
| Age/years | 0.018 | 0.027 | 0.472 | 0.492 | 1.02(0.97-1.07) |
| Occupation | | | | | |
|   Primary industry | | | | | 1.00 |
|   Secondary industry | -0.398 | 0.467 | 0.725 | 0.394 | 0.67(0.27-1.68) |
|   Tertiary industry | 0.404 | 0.242 | 2.770 | 0.096 | 1.50(0.93-2.41) |
| Educational level | | | | | |
|   High school and above | | | | | 1.00 |
|   Junior high school and below | 0.544 | 0.261 | 4.348 | 0.037 | 1.72(1.03-2.87) |
| Menopause | | | | | |
|   Yes | | | | | 1.00 |
|   No | 0.747 | 0.334 | 4.997 | 0.025 | 2.11(1.10-4.06) |
| Contraception | | | | | |
|   No | | | | | 1.00 |
|   Yes | -0.001 | 0.284 | 0 | 0.996 | 1.00(0.57-1.74) |
| Age at first birth/years | -0.072 | 0.048 | 2.261 | 0.133 | 0.93(0.85-1.02) |
| Infection form | | | | | |
|   Multiple infection | | | | | 1.00 |
|   Single infection | -0.003 | 0.283 | 0 | 0.991 | 1.00(0.57-1.74) |
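The columns of Table 2 are linked by standard Cox regression identities: HR = exp(β), the 95% CI is exp(β ± 1.96·SE), and the Wald statistic is (β/SE)². A minimal sketch that recomputes them for the two significant rows follows. The function name is hypothetical, and because β and SE are rounded to three decimals in the table, the recomputed Wald values differ slightly from the printed ones.

```python
import math


def hazard_ratio_summary(beta: float, se: float, z: float = 1.96) -> dict:
    """Derive HR, its 95% CI and the Wald statistic from a Cox coefficient."""
    return {
        "hr": math.exp(beta),
        "ci": (math.exp(beta - z * se), math.exp(beta + z * se)),
        "wald": (beta / se) ** 2,
    }


# Coefficients and standard errors as printed in Table 2.
education = hazard_ratio_summary(0.544, 0.261)  # junior high school and below
menopause = hazard_ratio_summary(0.747, 0.334)  # not menopausal

for label, row in (("education", education), ("menopause", menopause)):
    low, high = row["ci"]
    print(f"{label}: HR={row['hr']:.2f} ({low:.2f}-{high:.2f}), Wald={row['wald']:.3f}")
```

Both rows reproduce the hazard ratios and confidence limits shown in the table to two decimals.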
Publication history
  • Received:  2023-08-31
  • Revised:  2024-03-16
  • Published online:  2024-10-24
  • Issue date:  2024-09-10
