Topic Group 8: Survival analysis

Chairs:	Michal Abrahamowicz, Malka Gorfine, Terry Therneau
Members:	Federico Ambrogi, Richard Cook, Pierre Joly, Per Kragh Andersen, Torben Martinussen, Maja Pohar-Perme, Hein Putter, Michael Schell, Jeremy Taylor

In a large proportion of observational studies, including prospective or retrospective longitudinal cohort studies, the outcome of primary interest is the time to the occurrence of a specific event or endpoint, such as death or hospitalization. Because the events of interest are often observed for only some study participants, specialized analytical methods are required to deal with ‘censored observations’, i.e. those subjects who had no event by the end of their follow-up. The development of statistical methods able to handle such censored time-to-event outcomes is the main focus of survival analysis, which is increasingly applied in longitudinal studies across a broad spectrum of empirical sciences. Whereas some methods developed for other types of outcomes, such as continuous, normally distributed or binary variables, can be easily adapted to the analysis of censored data, several important conceptual and analytical challenges are specific to survival analysis. Accordingly, this is a very active but also a rather specialized area of statistical research. Indeed, the end-users (empirical researchers and data analysts) are often either unaware of new survival analytical methods, typically published in statistical journals, or unable to understand why, when and how these methods should be implemented. As a result, the vast majority of real-life applications of survival analysis use only a few very popular statistical methods, such as Kaplan-Meier curves, the log-rank test, or the Cox proportional hazards (PH) model (Cox 1972¹), which is employed in >90% of the multivariable time-to-event analyses of clinical data (Altman et al 1995²). Yet many end-users may not understand the important assumptions on which such conventional methods rely, may not recognize the impact of violations of these assumptions, and often do not know what alternative, more robust methods can be employed in such cases.
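To make the handling of censored observations concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. The function name `kaplan_meier` and the toy data are hypothetical and serve only as an illustration; the sketch is not a substitute for validated software. Censored subjects contribute to the number at risk until their last follow-up time, but never count as events.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time of each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    # Number of observed events at each distinct event time.
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t;
        # censored subjects count here until their censoring time.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data: six subjects, two of them censored (event = 0).
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t}: S(t) = {s:.4f}")
```

In this toy example the subject censored at time 3 is still counted among those at risk at time 3 but not at time 5, which is how the estimator uses partial follow-up information instead of discarding it.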

TG8 aims to improve the understanding of the analytical issues frequently encountered in real-life applications of survival analysis, and to provide practical guidance regarding validated methods and user-friendly software that can be used to address these issues. To this end, we will draw on both previously published reviews of the main issues and methods of survival analysis (e.g., Andersen et al 2012³, Clark et al 2003⁴, Clayton 1988⁵) and the expertise of the TG8 members.

We will first review the issues encountered in a classic survival analysis, which focuses on time to a single event. Many of these analytical challenges arise from the dynamic nature of the processes dealt with in survival analysis, where both (i) the ‘predictors’ (e.g., risk or prognostic factors, exposures or treatments) and (ii) the way they may affect, or be associated with, the risk of the event of interest may change during the follow-up of the cohort (Grambsch and Therneau 1994⁶). These time-related changes require extending the conventional Cox PH model to allow modeling of, respectively, (i) time-varying covariates (Abrahamowicz et al 2012⁷) and (ii) time-dependent effects (Therneau and Grambsch 2000⁸). In some applications, different multivariable regression models, such as additive hazards models (Martinussen and Scheike 2006⁹) or accelerated failure time models (popular in the causal inference literature), may also be considered as alternatives to the PH model. Further challenges, which require more specialized techniques, are related to some frequent limitations of the way the outcome (time-to-event) is measured, with unknown causes of death being dealt with through net or relative survival (Perme et al 2012¹⁰), and the lack of precision regarding the timing of the event calling for methods adapted to interval-censored data (Sun 2006¹¹, Joly et al 2012¹²). Additional issues, and more complex methods, are of relevance if the study involves analyses of recurrent events (e.g., consecutive hospitalizations or infections of the same subject) (Cook and Lawless 2007¹³) or joint modeling of time-to-event outcomes and longitudinal changes in a marker of disease progression or a surrogate outcome (Rizopoulos 2012¹⁴, Wang and Taylor 2001¹⁵). Further guidance will be provided regarding when and how the aforementioned single-event methods can be refined and extended to a more complex setting where the same subject may experience different types of events.
Competing risks analyses offer a natural approach to longitudinal studies with two or more mutually exclusive events, such as deaths due to different causes (Beyersmann et al 2012¹⁶), but many real-life studies do not exploit their potential and, instead, rely on simplified analyses that censor competing events or use heterogeneous composite endpoints. Finally, multi-state models offer a framework for extending time-to-event methods to more comprehensive analyses of longitudinal processes, such as disease progression, and are able to account for individual patients following different pathways between either mutually exclusive or consecutive health states (Andersen and Keiding 2002¹⁷).
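As an illustration of why censoring competing events can mislead, the sketch below computes a nonparametric cumulative incidence function (the Aalen-Johansen type estimator for competing risks) next to the naive "1 minus Kaplan-Meier" estimate obtained by treating competing events as censored. The function name, the coding of causes (0 = censored, 1 and 2 = event types) and the toy data are hypothetical and chosen only for this example.

```python
def cumulative_incidence(times, causes, cause):
    """Return [(t, CIF(t), naive(t))] at each distinct event time.

    times  : follow-up time of each subject
    causes : 0 if censored, otherwise an integer event type (1, 2, ...)
    cause  : the event type of interest
    """
    event_times = sorted({t for t, c in zip(times, causes) if c != 0})
    overall_surv = 1.0  # probability of being free of ANY event just before t
    naive_surv = 1.0    # Kaplan-Meier with competing events treated as censored
    cif = 0.0
    out = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, c in zip(times, causes) if u == t and c == cause)
        d_any = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        # Increment: event-free just before t, then fail from the cause of interest.
        cif += overall_surv * d_cause / at_risk
        overall_surv *= 1.0 - d_any / at_risk
        naive_surv *= 1.0 - d_cause / at_risk
        out.append((t, cif, 1.0 - naive_surv))
    return out


# Hypothetical toy data: causes 1 and 2 are mutually exclusive event types.
times = [1, 2, 3, 4, 5, 6]
causes = [1, 2, 1, 0, 2, 1]
result = cumulative_incidence(times, causes, cause=1)
for t, cif, naive in result:
    print(f"t = {t}: CIF = {cif:.4f}, naive 1-KM = {naive:.4f}")
```

In this toy example the naive estimate reaches 1 at the last event time while the cumulative incidence stays at 7/12, because subjects who experienced the competing event can no longer experience the event of interest.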

Consistent with the general STRATOS approach, our overall aim is to explain the main concepts related to each of the aforementioned analytical challenges, and to provide practical recommendations regarding the choice of methods that address specific challenges and of statistical software that implements these methods. We will illustrate selected analytical issues and the applications of the proposed methods with concrete examples involving real-life and/or simulated data.

The next stage of the work, on more comprehensive recommendations for the analysis of observational studies, will involve collaborations of TG8 with other Topic Groups, as censored survival data require more specialized methods to deal with such generic problems as the handling of missing data, modeling of the functional form for continuous variables, analysis of high-dimensional data, or choice of the study design. Conversely, developments in survival analysis have major implications for building prediction models and have complex links to causal inference.

References