Advance in distributed evidence-based causal inference methods

LI Hong-kai, XU Dong-hai, LIU Qing, JI Xiao-kang, XUE Fu-zhong

Citation: LI Hong-kai, XU Dong-hai, LIU Qing, JI Xiao-kang, XUE Fu-zhong. Advance in distributed evidence-based causal inference methods[J]. CHINESE JOURNAL OF DISEASE CONTROL & PREVENTION, 2022, 26(10): 1174-1179. doi: 10.16462/j.cnki.zhjbkz.2022.10.011

doi: 10.16462/j.cnki.zhjbkz.2022.10.011
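Funding:

National Natural Science Foundation of China 82003557

National Key Research and Development Program 2020YFC2003500

Corresponding authors:

    JI Xiao-kang, E-mail: jxk@sdu.edu.cn

    XUE Fu-zhong, E-mail: xuefzh@sdu.edu.cn

  • CLC number: R181.2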

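  • Abstract: Integrating databases from multiple heterogeneous sources has become increasingly common as a way to achieve large sample sizes and diverse study populations. This article reviews progress in causal inference methods for integrating databases that differ in design and in underlying population, in particular methods that combine randomized clinical trials with external information and methods that combine observational studies with historical controls. In addition, when a single sample lacks information on relevant confounders, two-sample Mendelian randomization can be applied to control for unmeasured confounding and thereby infer causal relationships. This distributed data design is valid and preserves the data security required in real-world data studies.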
  • Figure 1. The causal diagrams of two datasets

    X denotes the exposure, Y the outcome, and C the confounders; α1 and α2 denote the causal effects of the confounders on the exposure, and β1 and β2 denote the direct effects of the confounders on the outcome.

    Figure 2. The causal diagrams of K datasets

    Figure 3. The causal diagrams of three datasets

    Figure 4. Mendelian randomization with non-ordered and ordered mediators

Publication history
  • Received: 2022-05-20
  • Revised: 2022-08-22
  • Published: 2022-10-10
