Topic Group 7: Causal inference

Chairs:	Els Goetghebeur, Ingeborg Waernbaum
Members:	Jack Bowden, Andrea Callegaro, Eric Jay Daza, Bianca De Stavola, Vanessa Didelez, Saskia le Cessie, Erica Moodie, Nan van Geloven, Michael Wallace, Martin Wolkewitz

Homepage: Topic Group 7

The desire to draw causal inference from observed associations is age-old, and the ensuing quest has contributed greatly to scientific progress. While simple association models have gradually gained in sophistication and their potential is typically well understood by practicing statisticians, causal questions and answers need an extra dimension of abstraction, which calls for special care and caution. The move from association to causation is by no means trivial and requires assumptions not only about the observed data structure, but also beyond the sampled data. Notorious examples, such as the hormone replacement therapy (HRT) story, have taught us that lesson.

This topic group sets out to provide guidance on the sequence of steps involved in causal inference. This includes phrasing the causal question, designing a sampling frame and/or selecting the observational data, formulating assumptions to justify specific causal effect estimators, reporting results, and conducting sensitivity analyses for untestable assumptions (Daniel 2013 ¹). Several formalisms and schools of thought have been developed over the past decades; they have deepened our insight, expanded the available tool kit and made the questions we can hope to address more ambitious.

Data selection and the corresponding assumptions on the data structure will determine the specific causal parameter we set out to find. When the causal effect of a single treatment regime is envisaged, one may attempt to mimic a randomized trial either by controlling for all necessary confounders or by relying on an instrumental variable. In either case, an estimator of the intention-to-treat effect, the per-protocol effect or the as-treated effect may follow. When, more ambitiously, we aim to estimate the effect of a dynamic treatment regime, i.e. a sequence of treatment decision rules that respond to covariates evolving over time, longitudinal data are needed together with a more stringent set of assumptions, such as no unmeasured time-varying confounders or the equivalent of sequential randomization. Adjusting for time-varying confounders that are at the same time intermediate variables on the causal path from exposure to outcome is a special challenge and is best achieved through an inverse probability weighting technique.
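The point about time-varying confounders that are also intermediate variables can be illustrated with a small simulation. The following is only a minimal sketch under invented assumptions: the two-period data-generating process, all variable names (`u`, `a0`, `l1`, `a1`, `y`) and the helper functions are hypothetical and not taken from any of the cited papers. It compares an unadjusted regression, a regression that conditions on the intermediate confounder, and a marginal structural model fitted with stabilized inverse probability weights.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical two-period process (an assumption for illustration only).
u = rng.normal(size=n)                                   # unmeasured cause of L1 and Y
a0 = rng.binomial(1, 0.5, size=n)                        # randomized first treatment
l1 = rng.binomial(1, expit(-0.5 + 1.0 * a0 + 1.0 * u))   # affected by A0, drives A1
a1 = rng.binomial(1, expit(-1.0 + 2.0 * l1))             # second treatment depends on L1
y = 1.0 * a0 + 1.0 * a1 + 2.0 * u + rng.normal(size=n)   # simulated joint effects: 1 and 1

def cell_prob(target, *strata):
    """Empirical P(target = 1 | strata) for binary strata, returned per subject."""
    key = np.zeros(len(target), dtype=int)
    for s in strata:
        key = 2 * key + s
    sums = np.bincount(key, weights=target, minlength=2 ** len(strata))
    counts = np.bincount(key, minlength=2 ** len(strata))
    return (sums / counts)[key]

# Stabilized weights for A1: P(A1 | A0) / P(A1 | A0, L1).
# A0 is randomized in this simulation, so it needs no weight.
p_num = cell_prob(a1, a0)
p_den = cell_prob(a1, a0, l1)
sw = np.where(a1 == 1, p_num / p_den, (1 - p_num) / (1 - p_den))

def wls(y_vec, X, w):
    """Weighted least squares via row scaling by sqrt(w)."""
    root_w = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * root_w[:, None], y_vec * root_w, rcond=None)
    return beta

ones = np.ones(n)
X_msm = np.column_stack([ones, a0, a1])

beta_naive = wls(y, X_msm, ones)                               # ignores L1
beta_adj = wls(y, np.column_stack([ones, a0, a1, l1]), ones)   # conditions on L1
beta_ipw = wls(y, X_msm, sw)                                   # weighted MSM

print("naive    (A0, A1):", beta_naive[1:3])
print("adjusted (A0, A1):", beta_adj[1:3])
print("IPW MSM  (A0, A1):", beta_ipw[1:3])
```

In this simulated setting the unadjusted fit is confounded for the second treatment, conditioning on `l1` distorts the coefficient of the first treatment because `l1` is a common effect of `a0` and `u`, and the weighted fit should land close to the simulated joint effects. Real applications would need models for the weights, checks for extreme weights and robust standard errors, none of which are shown here.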

To understand the causal structure and the assumptions we are willing to impose on a data problem, causal diagrams (Pearl 1995 ²) and/or the formalism of potential outcomes can be very helpful. They can also point to estimators for the target parameter. Many different estimation techniques exist; besides causal graphs, the terminology includes augmented inverse probability weighting (with stabilized weights), doubly robust procedures, g-computation, marginal structural models, (robust) multiple imputation, propensity scores, principal stratification, and so forth. Some relevant references are: Daniel 2013 ¹, Pearl 1995 ², Fischer-Lapp 1999 ³, Gagne 2012 ⁴, Hernan 2000 ⁵, Hernan 2008 ⁶, Moodie 2007 ⁷, Rosenbaum 1983 ⁸, Sterne 2002 ⁹, Valeri 2013 ¹⁰, Vansteelandt 2011 ¹¹.
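To make a few of these terms concrete, here is a minimal sketch for a single binary treatment with one binary confounder. Everything in it is an assumption for illustration: the simulated data, the variable names (`conf`, `a`, `y`) and the helper functions are hypothetical. It contrasts g-computation (standardizing outcome-model predictions) with an augmented inverse probability weighted (AIPW) estimator, and shows the doubly robust property by deliberately misspecifying the outcome model while keeping the propensity score correct.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical point-treatment data (an assumption for illustration only).
conf = rng.binomial(1, 0.4, size=n)          # measured binary confounder
a = rng.binomial(1, 0.2 + 0.5 * conf)        # treatment is more likely when conf = 1
y = 2.0 * a + 3.0 * conf + rng.normal(size=n)  # simulated average treatment effect: 2

ones = np.ones(n)
zeros = np.zeros(n)

def outcome_predictions(use_confounder):
    """Fit a linear outcome model and predict Y for everyone under A=1 and A=0."""
    extra = [conf] if use_confounder else []
    X = np.column_stack([ones, a] + extra)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    mu1 = np.column_stack([ones, ones] + extra) @ beta
    mu0 = np.column_stack([ones, zeros] + extra) @ beta
    return mu1, mu0

# Propensity score P(A = 1 | conf), estimated by stratum frequencies.
e = np.where(conf == 1, a[conf == 1].mean(), a[conf == 0].mean())

def g_computation(mu1, mu0):
    return np.mean(mu1 - mu0)

def aipw(mu1, mu0):
    # Outcome-model contrast plus inverse-probability-weighted residual correction.
    return np.mean(mu1 - mu0 + a * (y - mu1) / e - (1 - a) * (y - mu0) / (1 - e))

mu1_ok, mu0_ok = outcome_predictions(use_confounder=True)
mu1_bad, mu0_bad = outcome_predictions(use_confounder=False)  # omits the confounder

gcomp_ok = g_computation(mu1_ok, mu0_ok)
gcomp_bad = g_computation(mu1_bad, mu0_bad)
aipw_bad_outcome = aipw(mu1_bad, mu0_bad)

print("g-computation, outcome model with confounder:", gcomp_ok)
print("g-computation, confounder omitted:           ", gcomp_bad)
print("AIPW, confounder omitted from outcome model: ", aipw_bad_outcome)
```

With the confounder omitted, g-computation reduces to a crude comparison of group means and is biased in this simulation, whereas the AIPW estimate should stay near the simulated effect because the propensity score is still correctly estimated. The reverse protection (correct outcome model, wrong propensity score) is not demonstrated here.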

Drawing this together, the TG aims to advise on:

Classes of causal questions to consider with options of data structures
Corresponding analysis techniques, their dependence on (untestable) assumptions and available software options. Design recommendations.
Tools that help visualize and interpret assumptions, a basis for discussions with clinicians and study design considerations
Pros and cons of specific estimation approaches in terms of the bias/variance trade-off, transparency and ease of implementation, robustness, and back-up interpretation when assumptions for causal inference fail.
Pointers to tutorials and worked-out case studies
Bridges between the jargon and terminology used by different schools of thought
Pointers to tools and tutorials for sensitivity analysis

Daniel RM, Cousens SN, De Stavola BL, Kenward MG, Sterne JAC. Methods for dealing with time-dependent confounding. Statistics in Medicine 2013; 32(9):1584–1618.
Pearl J. Causal diagrams for empirical research. (With discussion). Biometrika 1995; 82(4):669–710.
Fischer-Lapp K, Goetghebeur E. Practical properties of some structural mean analyses of the effect of compliance in randomized trials. Controlled Clinical Trials 1999; 20(6):531–546.
Gagne JJ, Polinski JM, Avorn J, Glynn RJ, Seeger JD. Standards for causal inference methods in analyses of data from observational and experimental studies in patient-centered outcomes research. 2012. For: Patient-Centered Outcomes Research Institute Methodology Committee.
Hernan MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology 2000; 11(5):561–570.
Hernan MA, Alonso A, Logan R, Grodstein F, Michels KB, Willett WC, Manson JE, Robins JM. Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease. Epidemiology 2008; 19(6):766–779.
Moodie EE, Richardson TS, Stephens DA. Demystifying optimal dynamic treatment regimes. Biometrics 2007; 63(2):447–455.
Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika 1983; 70(1):41–55.
Sterne JAC, Tilling K. G-estimation of causal effects, allowing for time-varying confounding. The Stata Journal 2002; 2(2):164–182.
Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods 2013; 18(2):137–150.
Vansteelandt S, Bowden J, Babanezhad M, Goetghebeur E. On instrumental variables estimation of causal odds ratios. Statistical Science 2011; 26(3).