
CN 34-1304/R  ISSN 1674-3679

ZHANG Ruimin, WANG Keke, LI Jinbo, CHEN Zhuanzhuan, YANG Hailan, WU Weiwei, FENG Yongliang, WANG Suping, ZHANG Xinri. Risk prediction of small for gestational age birth based on machine learning algorithms[J]. CHINESE JOURNAL OF DISEASE CONTROL & PREVENTION, 2023, 27(8): 922-927. doi: 10.16462/j.cnki.zhjbkz.2023.08.009

Risk prediction of small for gestational age birth based on machine learning algorithms

doi: 10.16462/j.cnki.zhjbkz.2023.08.009
Funds:

Youth Scientific Research Project of Fundamental Research Program in Shanxi Province 20210302124581

More Information
  • Corresponding author: ZHANG Xinri, E-mail: ykdzxr61@163.com
  • Received Date: 2022-12-16
  • Rev Recd Date: 2023-01-26
  • Available Online: 2023-09-02
  • Publish Date: 2023-08-10
  •   Objective  To compare the risk-prediction performance of five machine learning models, including extreme gradient boosting (XGBoost), support vector machine (SVM), and Naive Bayes, with that of a traditional logistic regression model for small for gestational age (SGA) birth.
      Methods  A total of 9 972 women who gave birth at the First Hospital of Shanxi Medical University from March 2012 to September 2016 were enrolled. Data were collected from the hospital information system and through questionnaire surveys. Based on delivery outcome, participants were divided into an SGA group (n=1 124) and a non-SGA group (n=8 848), and the data were split into a training set and a test set at a ratio of 7.50∶2.50. A multivariate logistic regression model was used to screen the influencing factors. Predictive models were then built with the XGBoost, SVM, Naive Bayes, gradient boosting decision tree (GBDT), and k-nearest neighbor (KNN) algorithms, and their performance was assessed with metrics including the area under the curve (AUC), accuracy, and precision.
      Results  Logistic regression analysis identified seven variables associated with the occurrence of SGA, including gestational hypertension and eclampsia. When these variables were entered into the machine learning algorithms and the traditional logistic regression model, the SVM model performed best, with the highest AUC (0.72) and an accuracy of 71%. The logistic regression model performed worse, with an AUC of 0.71 and an accuracy of 66%.
      Conclusions  Machine learning models, especially SVM, can evaluate the risk of SGA in Shanxi Province more accurately than traditional logistic regression and can provide a reference for the primary prevention of SGA.
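
The split and the evaluation metrics named in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the stratified split, the fixed random seed, the 0.5 decision threshold, and the toy scores are assumptions made for illustration, and the function names are hypothetical. Only the group sizes (1 124 and 8 848) and the 7.50∶2.50 ratio come from the abstract.

```python
import random

def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices into train/test, keeping each class's share in both sets.
    Stratification and the seed are assumptions; the abstract gives only the ratio."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy_and_precision(y_true, scores, threshold=0.5):
    """Accuracy and precision after thresholding scores (threshold is an assumption)."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    correct = sum(1 for y, p in zip(y_true, pred) if y == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return correct / len(y_true), precision

# Group sizes from the abstract: 1 = SGA, 0 = non-SGA.
labels = [1] * 1124 + [0] * 8848
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 7479 2493

# Toy scores (invented for illustration, not study data).
toy_y = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2, 0.1]
print(auc(toy_y, toy_scores))
print(accuracy_and_precision(toy_y, toy_scores))
```

With so few SGA cases relative to non-SGA cases (about 11% of the sample), accuracy alone can look favourable for a model that rarely predicts SGA, which is one reason to read it alongside AUC and precision.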

    Figures(1)  / Tables(3)
