Evaluation of the control effect for measured confounders between propensity score matching and disease risk score matching
Abstract:
Objective To compare 1:1 propensity score (PS) matching and disease risk score (DRS) matching in scenarios with good and poor between-group PS overlap, and to explore the optimal caliper width for DRS matching.
Methods Six scenarios were simulated by varying the proportion of the sample in the test group, the outcome event rate, and the degree of PS overlap. Covariate balance and bias in the treatment effect estimate were compared before and after PS matching and DRS matching, followed by analysis of a real-world example.
Results In the scenarios with good PS overlap, DRS overlap was also good, and PS matching outperformed DRS matching. The optimal caliper width for PS matching was 10%-20% of the PS standard deviation (SD), and the relatively optimal caliper width for DRS matching was 0.5% of the DRS SD. In the scenarios with poor PS overlap, DRS overlap was also poor, but DRS matching outperformed PS matching, with an optimal caliper width of 15%-20% of the DRS SD. In addition, the improvement in covariate balance achieved by PS and DRS matching was consistent with the bias in the treatment effect estimate.
Conclusions When PS overlap is good, PS matching is preferred; when PS overlap is poor, DRS matching can be used, with an optimal caliper width of 15%-20% of the DRS SD. In practice, how well matching controls measured confounders can be evaluated by the improvement in between-group covariate balance before and after matching.
Key words:
- Propensity score
- Disease risk score
- Matching
- Confounding bias
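The 1:1 caliper matching compared in the abstract can be illustrated with a minimal sketch. It assumes greedy nearest-neighbour matching without replacement, with the caliper expressed as a fraction of the SD of the score over all subjects; the function name `caliper_match` and these choices are illustrative assumptions, not the authors' implementation. The score is taken as already estimated, whether PS or DRS.

```python
import numpy as np


def caliper_match(score, treated, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbour matching without replacement (sketch).

    score      : 1-D array of PS or DRS values, assumed already estimated
    treated    : 1-D boolean array, True for the test group
    caliper_sd : caliper width as a fraction of the SD of `score`
                 (assumption: SD is taken over all subjects)
    Returns a list of (treated_index, control_index) pairs.
    """
    score = np.asarray(score, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    caliper = caliper_sd * score.std(ddof=1)

    treated_idx = np.flatnonzero(treated)
    control_idx = list(np.flatnonzero(~treated))
    pairs = []
    for t in treated_idx:
        if not control_idx:
            break  # no controls left to match
        dist = np.abs(score[control_idx] - score[t])
        j = int(np.argmin(dist))
        if dist[j] <= caliper:
            # Accept the nearest control and remove it from the pool.
            pairs.append((int(t), int(control_idx.pop(j))))
    return pairs


# Toy scores: three test-group subjects followed by three controls.
toy_score = [0.10, 0.50, 0.90, 0.12, 0.48, 0.30]
toy_group = [True, True, True, False, False, False]
toy_pairs = caliper_match(toy_score, toy_group, caliper_sd=0.2)
matching_ratio = len(toy_pairs) / sum(toy_group)
```

A narrower caliper discards more test-group subjects, which is the trade-off between matching ratio and bias that Table 3 reports. Greedy matching depends on processing order, so implementations often randomise it first.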
Figure 1. Distribution of PS and DRS between groups under different scenarios
PS: propensity score; DRS: disease risk score; A: PS distribution between groups in scenario A1; B: DRS distribution between groups in scenario A1; C: PS distribution between groups in scenario A2; D: DRS distribution between groups in scenario A2; E: PS distribution between groups in scenario A3; F: DRS distribution between groups in scenario A3; G: PS distribution between groups in scenario B1; H: DRS distribution between groups in scenario B1; I: PS distribution between groups in scenario B2; J: DRS distribution between groups in scenario B2; K: PS distribution between groups in scenario B3; L: DRS distribution between groups in scenario B3.
Figure 2. Absolute standardized differences of covariates between the two groups before and after matching under different scenarios
PS: propensity score; DRS: disease risk score; SD: standardized difference. In scenarios A1-A3, the PS caliper width was set to 20% of the standard deviation and the DRS caliper width to 0.05% of the standard deviation; in scenarios B1-B3, both the PS and DRS caliper widths were set to 20% of the standard deviation.
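The balance measure plotted in Figure 2 can be sketched as follows. This assumes the common definition of the absolute standardized difference, the difference in group means divided by the square root of the average of the two group variances. The source does not state which variant was used, and the function name `abs_std_diff` is hypothetical. For a 0/1 covariate, the usual p(1-p) variance form gives a slightly different value from the sample variance used here.

```python
import numpy as np


def abs_std_diff(x, treated):
    """Absolute standardized difference of one covariate between two groups.

    Assumed definition: |mean_1 - mean_0| / sqrt((var_1 + var_0) / 2),
    using sample variances (ddof=1).
    """
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    x1, x0 = x[treated], x[~treated]
    pooled_sd = np.sqrt((x1.var(ddof=1) + x0.var(ddof=1)) / 2.0)
    return abs(x1.mean() - x0.mean()) / pooled_sd


# Toy covariate: group means 2 and 3, both group variances equal to 1.
toy_x = [1, 2, 3, 2, 3, 4]
toy_group = [True, True, True, False, False, False]
toy_d = abs_std_diff(toy_x, toy_group)
```

Computing this for every covariate before matching and again on the matched pairs gives the kind of before/after comparison the figure summarises.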
Figure 3. Distribution and matching effect of PS and DRS
PS: propensity score; DRS: disease risk score; APACHE Ⅲ: Acute Physiology and Chronic Health Evaluation Ⅲ; A: PS distribution between groups before matching; B: DRS distribution between groups before matching; C: absolute standardized differences of covariates between the two groups before and after matching.
Table 1. Association of covariates with the grouping factor (smoking) and the outcome (a certain disease)

| Association with smoking | Associated with a certain disease | Not associated with a certain disease |
| --- | --- | --- |
| Associated | x1, x2, x3, b1, b2, b3, b4, b5 | x5, b6, b7 |
| Not associated | x4, b8, b9 | x6, b10 |

Table 2. Parameter settings of the simulation scenarios

| Scenario | PS overlap | Disease incidence/% | Grouping model coefficients c(α0, α1, α2, α3, α4, α5, α6, α7, α8, α9, α10, α11) | Outcome model coefficients c(β0, β1, β2, β3, β4, β5, β6, β7, β8, β9, β10, β11) |
| --- | --- | --- | --- | --- |
| A1 | Good | 30 | c(-1.5, -0.6, 0.5, 0.5, -0.3, 0.6, -0.2, 0.2, 0.3, 0.5, -0.5, -0.2) | c(-1.5, 0.5, -0.5, 0.5, -0.2, -0.5, 0.6, 0.5, 0.2, -0.5, 0.5, -0.2) |
| A2 | Good | 20 | c(-1.5, -0.6, 0.5, 0.5, -0.3, 0.6, -0.2, 0.2, 0.3, 0.5, -0.5, -0.2) | c(-2.1, 0.5, -0.5, 0.5, -0.2, -0.5, 0.6, 0.5, 0.2, -0.5, 0.5, -0.2) |
| A3 | Good | 10 | c(-1.5, -0.6, 0.5, 0.5, -0.3, 0.6, -0.2, 0.2, 0.3, 0.5, -0.5, -0.2) | c(-3, 0.5, -0.5, 0.5, -0.2, -0.5, 0.6, 0.5, 0.2, -0.5, 0.5, -0.2) |
| B1 | Poor | 30 | c(-1.5, 1.5, -1.2, 1.5, -1.1, -1.2, 0.5, -1, 1.5, -1.2, 1.6, -1.1) | c(-1.5, 0.5, -0.5, 0.5, -0.2, -0.5, 0.6, 0.5, 0.2, -0.5, 0.5, -0.2) |
| B2 | Poor | 20 | c(-1.5, 1.5, -1.2, 1.5, -1.1, -1.2, 0.5, -1, 1.5, -1.2, 1.6, -1.1) | c(-2.2, 0.5, -0.5, 0.5, -0.2, -0.5, 0.6, 0.5, 0.2, -0.5, 0.5, -0.2) |
| B3 | Poor | 10 | c(-1.5, 1.5, -1.2, 1.5, -1.1, -1.2, 0.5, -1, 1.5, -1.2, 1.6, -1.1) | c(-3, 0.5, -0.5, 0.5, -0.2, -0.5, 0.6, 0.5, 0.2, -0.5, 0.5, -0.2) |

Note: PS, propensity score.
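To see why the scenario B coefficients in Table 2 produce poorer PS overlap than the scenario A coefficients, the grouping model can be sketched as a logistic model. This is only an illustration under loud assumptions: the 11 covariates are drawn as independent standard normal variables (this page does not state their distributions), the coefficients are applied to them in the listed order, and `true_ps` and `share_moderate` are hypothetical helper names.

```python
import numpy as np

# Grouping model coefficients copied from Table 2 (intercept first).
ALPHA_A = np.array([-1.5, -0.6, 0.5, 0.5, -0.3, 0.6, -0.2, 0.2, 0.3, 0.5, -0.5, -0.2])
ALPHA_B = np.array([-1.5, 1.5, -1.2, 1.5, -1.1, -1.2, 0.5, -1, 1.5, -1.2, 1.6, -1.1])


def true_ps(alpha, covariates):
    """Probability of being in the test group under a logistic grouping model."""
    linear = alpha[0] + covariates @ alpha[1:]
    return 1.0 / (1.0 + np.exp(-linear))


def share_moderate(ps, low=0.1, high=0.9):
    """Share of subjects whose PS lies strictly between `low` and `high`."""
    return float(np.mean((ps > low) & (ps < high)))


rng = np.random.default_rng(2024)
# ASSUMPTION: independent standard normal covariates, for illustration only.
X = rng.standard_normal((5000, 11))

ps_a = true_ps(ALPHA_A, X)
ps_b = true_ps(ALPHA_B, X)
group_a = rng.random(5000) < ps_a  # simulated group membership, scenario A
group_b = rng.random(5000) < ps_b  # simulated group membership, scenario B

moderate_a = share_moderate(ps_a)
moderate_b = share_moderate(ps_b)
```

Under these assumptions the larger scenario B coefficients push many more propensity scores toward 0 or 1, so fewer subjects in one group have a comparable counterpart in the other. The outcome model works the same way with the β vector, where lowering β0 lowers the disease incidence.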
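Table 3. Simulation results under different scenarios

| Scenario | Matching method | Caliper width/% | Matching ratio/% | RB/% | MSE/% | 95% CI coverage rate/% | Power/% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| A1 | PS | 30 | 97.0 | 10.9 | 6.1 | 72.8 | 99.5 |
| A1 | PS | 25 | 95.2 | 10.2 | 5.8 | 76.2 | 99.8 |
| A1 | PS | 20 | 93.1 | 9.4 | 5.4 | 79.6 | 99.9 |
| A1 | PS | 15 | 90.8 | 8.7 | 5.1 | 82.9 | 99.9 |
| A1 | PS | 10 | 88.5 | 8.1 | 4.9 | 85.5 | 99.9 |
| A1 | PS | 5 | 86.5 | 7.6 | 4.8 | 87.0 | 99.9 |
| A1 | DRS | 30 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A1 | DRS | 25 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A1 | DRS | 20 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A1 | DRS | 15 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A1 | DRS | 10 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A1 | DRS | 5 | 99.9 | 13.6 | 7.1 | 58.7 | 99.7 |
| A1 | DRS | 0.5 | 99.8 | 13.6 | 7.1 | 59.1 | 99.6 |
| A1 | DRS | 0.05 | 93.8 | 13.1 | 6.9 | 66.6 | 99.4 |
| A1 | DRS | 0.01 | 61.7 | 12.0 | 8.1 | 81.0 | 92.7 |
| A2 | PS | 30 | 94.3 | 12.6 | 7.6 | 69.6 | 97.6 |
| A2 | PS | 25 | 91.2 | 11.3 | 7.1 | 76.3 | 98.2 |
| A2 | PS | 20 | 87.6 | 9.8 | 6.5 | 79.0 | 98.5 |
| A2 | PS | 15 | 84.0 | 8.5 | 6.1 | 84.1 | 99.1 |
| A2 | PS | 10 | 80.6 | 7.3 | 5.9 | 87.8 | 99.1 |
| A2 | PS | 5 | 78.1 | 6.4 | 5.8 | 90.1 | 99.6 |
| A2 | DRS | 30 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A2 | DRS | 25 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A2 | DRS | 20 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A2 | DRS | 15 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A2 | DRS | 10 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A2 | DRS | 5 | 99.9 | 14.0 | 7.9 | 61.3 | 98.4 |
| A2 | DRS | 0.5 | 99.7 | 13.6 | 6.3 | 62.1 | 98.4 |
| A2 | DRS | 0.05 | 94.4 | 13.3 | 7.8 | 68.0 | 97.8 |
| A2 | DRS | 0.01 | 68.2 | 12.3 | 9.3 | 81.2 | 88.5 |
| A3 | PS | 30 | 94.3 | 10.3 | 10.1 | 84.0 | 86.2 |
| A3 | PS | 25 | 91.2 | 9.0 | 9.8 | 86.9 | 87.8 |
| A3 | PS | 20 | 87.6 | 7.5 | 9.7 | 89.7 | 89.5 |
| A3 | PS | 15 | 84.0 | 6.0 | 9.8 | 92.1 | 89.9 |
| A3 | PS | 10 | 80.6 | 4.7 | 10.0 | 92.0 | 90.5 |
| A3 | PS | 5 | 78.1 | 3.9 | 10.2 | 93.2 | 91.0 |
| A3 | DRS | 30 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A3 | DRS | 25 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A3 | DRS | 20 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A3 | DRS | 15 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A3 | DRS | 10 | 100.0 | 13.6 | 7.1 | 58.7 | 99.7 |
| A3 | DRS | 5 | 99.9 | 12.3 | 9.7 | 82.4 | 85.3 |
| A3 | DRS | 0.5 | 99.8 | 12.1 | 9.7 | 82.7 | 85.8 |
| A3 | DRS | 0.05 | 97.0 | 11.4 | 10.0 | 85.1 | 83.4 |
| A3 | DRS | 0.01 | 81.8 | 10.4 | 12.5 | 87.7 | 72.1 |
| B1 | PS | 30 | 99.9 | 54.5 | 88.8 | 0.0 | 100.0 |
| B1 | PS | 25 | 90.1 | 40.9 | 52.7 | 0.9 | 100.0 |
| B1 | PS | 20 | 76.7 | 25.7 | 24.6 | 25.7 | 100.0 |
| B1 | PS | 15 | 65.1 | 13.5 | 11.3 | 75.3 | 100.0 |
| B1 | PS | 10 | 55.6 | 4.3 | 6.6 | 94.2 | 100.0 |
| B1 | PS | 5 | 48.7 | 2.5 | 6.1 | 94.5 | 99.1 |
| B1 | DRS | 30 | 92.3 | 9.4 | 6.0 | 81.7 | 100.0 |
| B1 | DRS | 25 | 88.7 | 3.5 | 3.6 | 95.0 | 100.0 |
| B1 | DRS | 20 | 84.8 | 2.4 | 3.0 | 96.6 | 100.0 |
| B1 | DRS | 15 | 80.7 | 7.7 | 4.2 | 84.9 | 100.0 |
| B1 | DRS | 10 | 76.8 | 12.2 | 6.5 | 66.5 | 99.9 |
| B1 | DRS | 5 | 73.8 | 15.2 | 8.6 | 52.7 | 99.0 |
| B1 | DRS | 0.5 | 71.9 | 16.4 | 9.7 | 45.3 | 97.5 |
| B1 | DRS | 0.05 | 63.5 | 16.3 | 9.9 | 51.9 | 95.3 |
| B1 | DRS | 0.01 | 40.0 | 16.2 | 11.4 | 70.9 | 76.0 |
| B2 | PS | 30 | 99.9 | 60.4 | 109.1 | 0.0 | 100.0 |
| B2 | PS | 25 | 90.1 | 45.7 | 66.2 | 1.5 | 100.0 |
| B2 | PS | 20 | 76.7 | 29.8 | 32.6 | 23.8 | 100.0 |
| B2 | PS | 15 | 65.1 | 17.2 | 16.0 | 68.9 | 100.0 |
| B2 | PS | 10 | 55.6 | 7.2 | 8.9 | 92.9 | 99.8 |
| B2 | PS | 5 | 48.7 | 0.4 | 7.7 | 95.9 | 98.2 |
| B2 | DRS | 30 | 92.6 | 13.5 | 9.4 | 71.1 | 100.0 |
| B2 | DRS | 25 | 89.4 | 7.3 | 5.4 | 92.4 | 100.0 |
| B2 | DRS | 20 | 85.7 | 0.8 | 3.5 | 97.6 | 100.0 |
| B2 | DRS | 15 | 81.6 | 5.4 | 3.8 | 94.4 | 100.0 |
| B2 | DRS | 10 | 77.5 | 10.7 | 5.9 | 80.6 | 99.2 |
| B2 | DRS | 5 | 74.1 | 14.3 | 8.3 | 65.7 | 97.1 |
| B2 | DRS | 0.5 | 72.0 | 15.7 | 9.4 | 59.9 | 94.7 |
| B2 | DRS | 0.05 | 64.3 | 15.6 | 10.0 | 65.8 | 88.2 |
| B2 | DRS | 0.01 | 44.1 | 15.2 | 12.5 | 75.1 | 68.0 |
| B3 | PS | 30 | 99.9 | 68.9 | 148.0 | 0.1 | 100.0 |
| B3 | PS | 25 | 90.1 | 52.7 | 92.7 | 4.0 | 100.0 |
| B3 | PS | 20 | 76.7 | 35.5 | 49.6 | 31.5 | 100.0 |
| B3 | PS | 15 | 65.1 | 22.0 | 27.6 | 70.8 | 99.9 |
| B3 | PS | 10 | 55.7 | 11.1 | 17.2 | 90.2 | 98.9 |
| B3 | PS | 5 | 48.7 | 2.7 | 14.2 | 95.4 | 91.3 |
| B3 | DRS | 30 | 95.7 | 25.1 | 26.3 | 45.4 | 100.0 |
| B3 | DRS | 25 | 93.3 | 18.7 | 17.4 | 68.8 | 100.0 |
| B3 | DRS | 20 | 90.1 | 11.1 | 10.1 | 87.6 | 100.0 |
| B3 | DRS | 15 | 86.0 | 2.7 | 6.1 | 96.5 | 100.0 |
| B3 | DRS | 10 | 81.0 | 5.3 | 5.8 | 95.0 | 98.5 |
| B3 | DRS | 5 | 75.9 | 11.9 | 8.3 | 84.6 | 89.5 |
| B3 | DRS | 0.5 | 72.6 | 14.5 | 10.2 | 76.0 | 80.9 |
| B3 | DRS | 0.05 | 66.8 | 14.0 | 10.7 | 80.6 | 76.1 |
| B3 | DRS | 0.01 | 51.0 | 13.7 | 14.6 | 85.8 | 55.5 |

Note: PS, propensity score; DRS, disease risk score; RB, relative bias; MSE, mean squared error.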
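The four performance columns of Table 3 can be computed from repeated simulation runs roughly as sketched below. The definitions are the commonly used ones and are assumptions here, since this page does not spell them out. Estimates are taken to be on a scale where the null value is 0 (for example a log odds ratio), intervals are Wald intervals, and the function name `summarize_runs` is hypothetical.

```python
import numpy as np


def summarize_runs(estimates, std_errors, true_effect, z=1.96):
    """Summarise simulation replicates into RB, MSE, coverage and power (all in %).

    Assumed definitions:
      RB       = (mean estimate - truth) / truth
      MSE      = mean squared deviation of the estimates from the truth
      coverage = share of Wald 95% intervals that contain the truth
      power    = share of Wald 95% intervals that exclude the null value 0
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    lower, upper = est - z * se, est + z * se
    return {
        "RB_pct": 100.0 * (est.mean() - true_effect) / true_effect,
        "MSE_pct": 100.0 * float(np.mean((est - true_effect) ** 2)),
        "coverage_pct": 100.0 * float(np.mean((lower <= true_effect) & (true_effect <= upper))),
        "power_pct": 100.0 * float(np.mean((lower > 0) | (upper < 0))),
    }


# Four made-up replicates with a true effect of 0.5 (illustration only).
toy_summary = summarize_runs(
    estimates=[0.2, 0.5, 0.6, 0.7],
    std_errors=[0.1, 0.1, 0.4, 0.05],
    true_effect=0.5,
)
```

This version keeps the sign of RB, whereas the table reports only non-negative values, so an absolute value may have been taken there.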
Table 4. Analysis results of different matching methods

| Method | Matching ratio/% | Deaths within 60 days, non-thrombocytopenic group, n (%) | Deaths within 60 days, thrombocytopenia group, n (%) | OR (95% CI) | P value |
| --- | --- | --- | --- | --- | --- |
| PS matching | 82.3 | 27 (30.3) | 42 (47.2) | 2.05 (1.12-3.82) | 0.022 |
| DRS matching | 92.0 | 39 (37.5) | 53 (50.9) | 1.73 (0.99-3.02) | 0.051 |

Note: PS, propensity score; DRS, disease risk score.
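As a rough cross-check of Table 4, the odds ratios can be recomputed from the reported death counts. The matched group sizes are not given on this page; the sketch assumes 89 per group for PS matching and 104 per group for DRS matching, inferred from the reported counts and percentages. It uses a simple unconditional Wald interval, which is an assumption and need not reproduce the reported confidence limits exactly if the authors used a method that accounts for the matched design. The function name `odds_ratio_wald` is hypothetical.

```python
import math


def odds_ratio_wald(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Crude odds ratio with a Wald 95% CI from a 2x2 table (sketch)."""
    a = events_exposed
    b = n_exposed - events_exposed
    c = events_unexposed
    d = n_unexposed - events_unexposed
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# ASSUMPTION: group sizes 89 and 104 are inferred, not reported on this page.
or_ps, lo_ps, hi_ps = odds_ratio_wald(42, 89, 27, 89)        # PS-matched sample
or_drs, lo_drs, hi_drs = odds_ratio_wald(53, 104, 39, 104)   # DRS-matched sample
```

Under these assumptions the point estimates round to 2.05 and 1.73, agreeing with Table 4. The Wald interval excludes 1 for PS matching and includes 1 for DRS matching, consistent with the reported P values of 0.022 and 0.051.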