Topic Group 8: Survival analysis

Chairs: Michal Abrahamowicz, Per Kragh Andersen, Terry Therneau
Members: Richard Cook, Pierre Joly, Torben Martinussen, Maja Pohar-Perme, Jeremy Taylor

In a large proportion of observational studies, including prospective or retrospective longitudinal cohort studies, the outcome of primary interest is the time to the occurrence of a specific event or endpoint, such as death or hospitalization. Because the events of interest are often observed for only some study participants, specialized analytical methods are required to deal with ‘censored observations’, i.e. those subjects who had no event by the end of their follow-up. The development of statistical methods able to handle such censored time-to-event outcomes is the main focus of survival analysis, which is increasingly applied in longitudinal studies across a broad spectrum of empirical sciences. Whereas some methods developed for other types of outcomes, such as continuous, normally distributed or binary variables, can be easily adapted to the analyses of censored data, several important conceptual and analytical challenges are specific to survival analysis. Accordingly, this is a very active but also a rather specialized area of statistical research. Indeed, the end-users (empirical researchers and data analysts) are often either unaware of new survival analytical methods, typically published in statistical journals, or unable to understand why, when and how these methods should be implemented. As a result, the vast majority of real-life applications of survival analysis use only a few, very popular statistical methods, such as Kaplan-Meier curves, the log-rank test, or the Cox proportional hazards (PH) model (Cox 1972 [1]), which is employed in >90% of the multivariable time-to-event analyses of clinical data (Altman et al 1995 [2]). Yet many end-users may not understand the important assumptions on which such conventional methods rely or recognize the impact of violations of these assumptions, and most do not know what alternative, more robust methods can be employed in such cases.
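To make concrete how censored observations enter one of the popular methods named above, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. The function name `kaplan_meier` and the toy data are illustrative assumptions, not taken from TG8 material; the sketch also assumes the usual convention that a subject censored at time t is still at risk for events occurring at t.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time of each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    n_events = Counter(t for t, e in zip(times, events) if e)
    n_exits = Counter(times)      # events and censorings both leave the risk set
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(n_exits):
        d = n_events.get(t, 0)
        if d:
            # Conditional probability of surviving past t, given at risk at t
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Censored subjects contribute to the risk set up to t, then drop out
        at_risk -= n_exits[t]
    return curve

# Hypothetical toy data: six subjects, two of them censored (event = 0)
times = [2, 3, 3, 5, 8, 9]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.4f}")
```

Censored subjects are neither discarded nor counted as events: they remain in the denominator until their censoring time, which is what distinguishes this estimator from a naive proportion of observed events.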

TG8 aims to improve the understanding of the analytical issues frequently encountered in real-life applications of survival analysis, and to provide practical guidance regarding the validated methods and the user-friendly software that can be used to address these issues. To this end, we will draw on both earlier published reviews of the main issues and methods of survival analysis (e.g., Andersen et al 2012 [3], Clark et al 2003 [4], Clayton 1988 [5]) and the expertise of the TG8 members.

We will first review the issues encountered in a classic survival analysis, which focuses on time to a single event. Many of these analytical challenges arise from the dynamic nature of the processes dealt with in survival analysis, where both (i) the ‘predictors’ (e.g., risk or prognostic factors, exposures or treatments) and (ii) the way they may affect, or be associated with, the risk of the event of interest may change during the follow-up of the cohort (Grambsch and Therneau 1994 [6]). These time-related changes require extending the conventional Cox PH model to allow modeling of, respectively, (i) time-varying covariates (Abrahamowicz et al 2012 [7]) and (ii) time-dependent effects (Therneau and Grambsch 2000 [8]). In some applications, different multivariable regression models, such as additive hazards models (Martinussen and Scheike 2006 [9]) or accelerated failure time models (popular in the causal inference literature), may also be considered as alternatives to the PH model. Further challenges, which require more specialized techniques, are related to some frequent limitations of the way the outcome (time-to-event) is measured, with unknown causes of death being dealt with through net or relative survival (Perme et al 2012 [10]), and the lack of precision regarding the timing of the event calling for methods adapted for interval-censored data (Sun 2006 [11], Joly et al 2012 [12]). Additional issues, and more complex methods, are of relevance if the study involves analyses of recurrent events (e.g., consecutive hospitalizations or infections of the same subject) (Cook and Lawless 2007 [13]) or joint modeling of time-to-event and longitudinal changes in a marker of disease progression or a surrogate outcome (Rizopoulos 2012 [14], Wang and Taylor 2001 [15]). Further guidance will be provided regarding when and how the aforementioned single-event methods can be refined and extended to a more complex setting where the same subject may experience different types of events.
Competing risks analyses offer a natural approach to deal with longitudinal studies with two or more mutually exclusive events, such as deaths due to different causes (Beyersmann et al 2012 [16]), but many real-life studies do not exploit their potential and, instead, rely on simplified analyses with censoring of competing events or use heterogeneous composite endpoints. Finally, multi-state models offer a framework for extending the time-to-event methods to more comprehensive analyses of longitudinal processes, e.g. disease progression, able to account for individual patients following different pathways between either mutually exclusive or consecutive health states (Andersen and Keiding 2002 [17]).
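The consequence of censoring competing events can be illustrated with a small numerical sketch. The code below, in plain Python, compares a nonparametric cumulative incidence estimate (of the Aalen-Johansen type) for one cause with the naive "1 minus Kaplan-Meier" estimate obtained when events from other causes are treated as censored. The function name `cumulative_incidence`, the cause coding (0 = censored, 1 and 2 = event types) and the toy data are assumptions made for illustration only.

```python
def cumulative_incidence(times, causes, cause=1):
    """Return [(t, cif, naive)] at each time with an event of the given cause.

    cif   : cumulative incidence accounting for competing events
    naive : 1 - Kaplan-Meier with competing events treated as censored
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    overall_surv = 1.0   # probability of being free of ANY event just before t
    naive_surv = 1.0     # Kaplan-Meier that censors the competing events
    cif = 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        while j < len(data) and data[j][0] == t:
            j += 1                       # group subjects tied at time t
        tied = [c for _, c in data[i:j]]
        d_cause = sum(1 for c in tied if c == cause)
        d_any = sum(1 for c in tied if c != 0)
        # Event-free up to t, then fail from the cause of interest at t
        cif += overall_surv * d_cause / at_risk
        naive_surv *= 1.0 - d_cause / at_risk
        overall_surv *= 1.0 - d_any / at_risk
        if d_cause:
            out.append((t, cif, 1.0 - naive_surv))
        at_risk -= j - i
        i = j
    return out

# Hypothetical toy data: 0 = censored, 1 = cause of interest, 2 = competing cause
times = [1, 2, 3, 4, 5, 6]
causes = [2, 1, 2, 1, 0, 1]
result = cumulative_incidence(times, causes, cause=1)
for t, cif, naive in result:
    print(f"t={t}: cumulative incidence={cif:.3f}, naive 1-KM={naive:.3f}")
```

In this toy example the naive estimate reaches 1 at the last event time even though two of the six subjects experienced the competing event, whereas the cumulative incidence estimate stays at 2/3. The naive estimate is never smaller than the cumulative incidence estimate, which is one reason why censoring competing events can mislead.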

Consistent with the general STRATOS approach, our overall aim is to explain the main concepts related to each of the aforementioned analytical challenges, and provide practical recommendations regarding the choice of methods that address specific challenges and statistical software that implements these methods. We will illustrate selected analytical issues and the applications of the proposed methods by concrete examples involving real-life and/or simulated data.

The next stage of the work on more comprehensive recommendations for the analyses of observational studies will involve collaborations of TG8 with other Topic Groups, as censored survival data require more specialized methods to deal with such generic problems as the handling of missing data, modeling of the functional form for continuous variables, analysis of high-dimensional data, or choice of the study design. In addition, developments in survival analysis have major implications for building prediction models and have complex links to causal inference.



  1. Cox DR. Regression models and life tables (with discussion). J R Stat Soc (Ser B) 1972; 34(2): 187-220.
  2. Altman DG, De Stavola BL, Love SB, Stepniewska KA. Review of survival analyses published in cancer journals. Br J Cancer 1995; 72(2): 511-518.
  3. Andersen PK, Geskus RB, de Witte T, Putter H. Competing risks in epidemiology: possibilities and pitfalls. Int J Epidemiol 2012; 41(3): 861-870.
  4. Clark TG, Bradburn MJ, Love SB, Altman DG. Survival Analysis Part I: Basic concepts and first analyses. Br J Cancer 2003; 89: 232-238.
  5. Clayton D. The analysis of event history data: a review of progress and outstanding problems. Stat Med 1988; 7: 819-841.
  6. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994; 81: 515-526.
  7. Abrahamowicz M, Beauchamp M-E, Sylvestre M-P. Comparison of alternative models for linking drug exposure with adverse effects. Stat Med 2012; 31(11-12): 1014-1030.
  8. Therneau TM, Grambsch PM. Modeling Survival Data: Extending the Cox Model. Springer-Verlag: New York, 2000.
  9. Martinussen T, Scheike TH. Dynamic regression models for survival data. Springer: New York, 2006.
  10. Perme MP, Stare J, Esteve J. On estimation in relative survival. Biometrics 2012; 68(1): 113-120.
  11. Sun J. The statistical analysis of interval-censored failure time data. Springer-Verlag: New York, 2006.
  12. Joly P, Gerds TA, Qvist V, Commenges D, Keiding N. Estimating survival of dental fillings on the basis of interval-censored data and multi-state models. Stat Med 2012; 31(11-12): 1139-1149.
  13. Cook RJ, Lawless JF. The statistical analysis of recurrent events. Springer-Verlag: New York, 2007.
  14. Rizopoulos D. Joint models for longitudinal and time-to-event data: with applications in R. Chapman & Hall, 2012.
  15. Wang Y, Taylor JMG. Jointly modeling longitudinal and event time data with application to acquired immunodeficiency syndrome. J Am Stat Assoc 2001; 96(455): 895-905.
  16. Beyersmann J, Allignol A, Schumacher M. Competing risks and multistate models with R. Springer-Verlag: New York, 2012.
  17. Andersen PK, Keiding N. Multistate models for event-history analysis. Stat Meth Med Res 2002; 11(2): 91-115.