Trend analysis of COVID-19 incidence and death series based on Bayesian change point model
Abstract:
Objective To explore methods for evaluating trends in the incidence of, or deaths from, infectious diseases through a trend analysis of COVID-19 incidence and death series, and to provide possible analysis strategies for similar epidemiological data.
Methods Cumulative confirmed cases and cumulative deaths of COVID-19 from January 23, 2020 to March 18, 2020 were extracted for 31 provinces (autonomous regions and municipalities) of China. A Bayesian change point analysis model was used to identify the change points of each time series, and the interrupted time series (ITS) method was applied to build a segmented linear regression (SLR) model to evaluate the consistency of trend changes in the series with interventions and policies.
Results There were 3 change points in both cumulative confirmed cases and cumulative deaths in Wuhan, and 4 change points in each series for Hubei Province (excluding Wuhan) and for the 30 provinces (autonomous regions and municipalities) outside Hubei Province. In Wuhan, the changes in cumulative confirmed cases after the 3 change points were 1 493.885 (P < 0.001), 2 444.913 (P < 0.001) and -4 061.038 (P < 0.001), respectively; the changes in cumulative deaths after the second and third change points were -66.917 (P < 0.001) and -19.845 (P = 0.034), respectively. In Hubei Province (excluding Wuhan), the growth in cumulative confirmed cases slowed significantly at the third change point, with a change of -845.244 (P < 0.001); the growth in cumulative deaths slowed after the third and fourth change points, with slope changes of -10.062 (P < 0.001) and -12.245 (P < 0.001), respectively. In the 30 provinces (autonomous regions and municipalities) outside Hubei Province, the growth in cumulative confirmed cases began to slow from the second change point, with changes of -281.494 (P < 0.001), -295.080 (P < 0.001) and -145.054 (P < 0.001), respectively; statistically significant slowing in the growth of cumulative deaths appeared at the third and fourth change points, with slope changes of -3.199 (P < 0.001) and -1.706 (P < 0.001), respectively.
Conclusions Combining Bayesian change point analysis with ITS analysis takes full account of the uncertainty in time series trend changes and provides a basis for the analysis of infectious disease epidemics and the evaluation of prevention and control measures.
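The second step of the Methods, fitting a segmented linear regression once the change points are known, can be sketched as follows. This is only an illustration under stated assumptions: the helper name `its_design_matrix`, the two change points, the coefficients and the noise-free synthetic series are all hypothetical, and ordinary least squares via NumPy stands in for whatever software the authors actually used.

```python
import numpy as np

def its_design_matrix(n_days, change_points):
    """Build a segmented-regression design matrix (hypothetical helper).

    Columns: intercept, time, then for each change point a level-change
    dummy and a slope-change term (time elapsed since the change point).
    """
    t = np.arange(n_days, dtype=float)
    cols = [np.ones(n_days), t]
    for cp in change_points:
        after = (t >= cp).astype(float)   # 1 from the change point onward
        cols.append(after)                # level change
        cols.append(after * (t - cp))     # slope change
    return np.column_stack(cols)

# Synthetic series, NOT the study data.
n_days = 56                    # nodes 0..55 cover Jan 23 to Mar 18, 2020
change_points = [19, 25]       # illustrative nodes only
X = its_design_matrix(n_days, change_points)
true_beta = np.array([10.0, 5.0, 0.0, 20.0, 0.0, -22.0])
y = X @ true_beta              # noise-free, so OLS recovers true_beta

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 3))
```

In this layout the coefficients at positions 3 and 5 are the slope changes after the first and second change points; a negative value means growth slowed after that point.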
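Table 1. Posterior probability of Bayesian change point analysis

| Series | Wuhan: date (node) | Wuhan: P | Hubei Province (excluding Wuhan): date (node) | Hubei Province (excluding Wuhan): P | 30 provinces outside Hubei: date (node) | 30 provinces outside Hubei: P |
|---|---|---|---|---|---|---|
| Confirmed cases | Feb 11 (19)ᵃ | 1.000 | Feb 20 (28)ᵃ | 0.976 | Feb 6 (14)ᵃ | 0.974 |
| | Feb 1 (9)ᵃ | 0.687 | Feb 11 (19)ᵃ | 0.540 | Jan 27 (4)ᵃ | 0.876 |
| | Feb 12 (20)ᵃ | 0.651 | Feb 15 (23)ᵃ | 0.519 | Feb 12 (20)ᵃ | 0.726 |
| | Feb 17 (25)ᵃ | 0.649 | Feb 6 (14)ᵃ | 0.512 | Feb 19 (27)ᵃ | 0.587 |
| | Mar 1 (38) | 0.383 | Jan 27 (4) | 0.426 | Feb 13 (21) | 0.211 |
| Deaths | Feb 11 (19)ᵃ | 1.000 | Feb 16 (24)ᵃ | 0.727 | Feb 11 (19)ᵃ | 0.994 |
| | Feb 13 (21)ᵃ | 0.643 | Jan 28 (5)ᵃ | 0.704 | Feb 5 (13)ᵃ | 0.765 |
| | Feb 23 (31)ᵃ | 0.538 | Feb 23 (31)ᵃ | 0.560 | Feb 18 (26)ᵃ | 0.733 |
| | Mar 5 (42)ᵃ | 0.509 | Feb 8 (16)ᵃ | 0.512 | Feb 27 (35)ᵃ | 0.554 |
| | Feb 22 (30) | 0.464 | Mar 6 (43) | 0.485 | Feb 28 (36) | 0.271 |

Note: ᵃ change point. The node for January 23 is set to 0, January 24 to 1, and so on.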
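The node convention in the Table 1 note (January 23, 2020 is node 0, January 24 is node 1, and so on) can be checked with a small date conversion. The function names below are illustrative and not from the source.

```python
from datetime import date, timedelta

ORIGIN = date(2020, 1, 23)  # node 0, as stated in the Table 1 note

def date_to_node(d):
    """Number of days elapsed since January 23, 2020."""
    return (d - ORIGIN).days

def node_to_date(node):
    """Calendar date corresponding to a node index."""
    return ORIGIN + timedelta(days=node)

print(date_to_node(date(2020, 2, 11)))  # Feb 11, listed as node 19 in Table 1
print(node_to_date(38))                 # node 38, listed as Mar 1 in Table 1
```

Because 2020 is a leap year, February has 29 days, which is why March 1 falls on node 38 and not 37.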
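Table 2. Conditional test results of segmented data

| Group | Series | Seg. 1 Z | Seg. 1 χ² | Seg. 1 DW | Seg. 2 Z | Seg. 2 χ² | Seg. 2 DW | Seg. 3 Z | Seg. 3 χ² | Seg. 3 DW | Seg. 4 Z | Seg. 4 χ² | Seg. 4 DW | Seg. 5 Z | Seg. 5 χ² | Seg. 5 DW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wuhan | Confirmed cases | 0.403 | 0.150 | 0.984 | 0.621 | 1.110 | 1.959 | -0.443 | 1.640 | 1.458 | 1.140 | 0.480 | 0.060 | | | |
| Wuhan | Deaths | 1.614 | 0.053 | 0.120 | 1.912ᵃ | 5.420ᵃ | 1.648 | 0.721 | 0.150 | 1.744 | 0.474 | 1.420 | 0.591 | | | |
| Hubei Province (excluding Wuhan) | Confirmed cases | 1.004 | 0.340 | 0.208 | 0.434 | 0.460 | 1.559 | 1.404 | 0.020 | 2.008 | 1.233 | 0.240 | 2.760 | 4.530ᵃ | 26.970ᵃ | 1.961 |
| Hubei Province (excluding Wuhan) | Deaths | -1.043 | 1.230 | 2.598 | 0.998 | 0.710 | 1.646 | 1.069 | 1.060 | 2.285 | -0.161 | 0.070 | 1.701 | 0.965 | 0.010 | 0.124 |
| 30 provinces outside Hubei | Confirmed cases | 0.544 | 0.050 | 1.424 | 0.092 | 0.830 | 1.424 | 1.164 | 0.000 | 1.213 | 0.246 | 0.240 | 0.835 | -2.145 | 0.280 | 0.636 |
| 30 provinces outside Hubei | Deaths | 2.395ᵃ | 2.080 | 1.155 | -0.577 | 0.030 | 1.300 | -1.095 | 0.000 | 1.415 | -0.088 | 0.990 | 1.945 | -0.473 | 0.010 | 1.662 |

Note: ᵃ P < 0.05. DW, Durbin-Watson statistic.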
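The DW columns report the Durbin-Watson statistic for first-order autocorrelation in regression residuals. The sketch below uses the textbook definition; the function name and the toy residual vectors are illustrative, and the source does not say which software produced the tabulated values.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the residual sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Toy residuals, NOT taken from the study.
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative autocorrelation
constant = [1.0, 1.0, 1.0, 1.0]        # strong positive autocorrelation

print(durbin_watson(alternating))
print(durbin_watson(constant))
```

The statistic lies between 0 and 4. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.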
Table 3. Estimates of level change and slope change at each change point in the confirmed-case and death series in China

| Group | Series | β0 | β1 | β2 | β3 | β4 | β5 | β6 | β7 | β8 | β9 | DW | R² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wuhan | Confirmed cases | 263.683 | 312.275 | 252.238 | 1 493.885ᵃ | 1 676.628 | 2 444.913ᵃ | -3 001.610 | -4 061.038ᵃ | | | 1.4840ᵇ | 0.9849 |
| Wuhan | Deaths | -45.334 | 41.111ᵃ | 48.720 | 57.623ᵃ | 33.040 | -66.917ᵃ | -8.276 | -19.845ᵃ | | | 0.8839ᵇ | 0.9778 |
| Hubei Province (excluding Wuhan) | Confirmed cases | -321.995 | 735.230ᵃ | 243.904 | -11.136 | -164.362 | 193.091 | -602.376ᵃ | -845.244ᵃ | 255.734 | -52.994 | 0.5660ᵇ | 0.6949 |
| Hubei Province (excluding Wuhan) | Deaths | 3.135 | 3.585 | 2.508 | 9.838ᵃ | 4.720 | 14.050ᵃ | -3.243 | -10.062ᵃ | 0.850 | -12.245ᵃ | 0.3333ᵇ | 0.8525 |
| 30 provinces outside Hubei | Confirmed cases | 280.696ᵃ | 351.713ᵃ | 113.702 | 382.756ᵃ | -8.556 | -281.494ᵃ | -146.379 | -295.080ᵃ | -52.547 | -145.054ᵃ | 1.2365ᵇ | 0.9993 |
| 30 provinces outside Hubei | Deaths | 1.060 | 0.954ᵃ | -0.270 | 4.988ᵃ | 0.188 | -0.642 | 0.897 | -3.199 | 1.803 | -1.706ᵃ | 1.6918ᶜ | 0.9990 |

Note: ᵃ P < 0.05; ᵇ first-order autocorrelation present; ᶜ presence of autocorrelation undetermined.
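The page does not reproduce the regression equation behind Table 3. One reading consistent with the table title and with the slope changes quoted in the abstract is the usual multi-interruption ITS parameterization, written here as an assumption, not as the authors' stated model. The symbols $T_t$, $X_{k,t}$, $\tau_k$ and $K$ are introduced only for this sketch.

```latex
Y_t = \beta_0 + \beta_1 T_t
      + \sum_{k=1}^{K} \Bigl[ \beta_{2k}\, X_{k,t}
      + \beta_{2k+1}\, X_{k,t}\,(T_t - \tau_k) \Bigr]
      + \varepsilon_t ,
\qquad
X_{k,t} =
\begin{cases}
0, & T_t < \tau_k \\
1, & T_t \ge \tau_k
\end{cases}
```

Here $T_t$ is the node index, $\tau_k$ is the node of the $k$-th change point, $\beta_{2k}$ is the immediate level change and $\beta_{2k+1}$ is the slope change after that change point. With $K = 3$ this gives β0 to β7 (the Wuhan rows) and with $K = 4$ it gives β0 to β9 (the other rows). Under this reading the odd-indexed coefficients β3, β5, β7 and β9 are the slope changes, which matches the values quoted in the abstract, for example β3 = 1 493.885 for confirmed cases in Wuhan.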