STRATOS has published a series of short articles in the Biometric Bulletin since 2017. In the article from March 2023, Heinze et al. summarized STRATOS papers that give guidance for analysts with limited statistical knowledge.
Heinze, G., Boulesteix, A.-L., Dunkler, D., Gail, M., Lee, K. J., van Calster, B., Wallace, M., & Sauerbrei, W. (2023). STRengthening Analytical Thinking for Observational Studies (STRATOS): Guidance for analysts with limited statistical knowledge. Biometric Bulletin.
The following papers were cited:
Baillie, M., le Cessie, S., Schmidt, C. O., Lusa, L., & Huebner, M. (2022). Ten simple rules for initial data analysis. PLOS Computational Biology (Vol. 18, Issue 2, p. e1009819). https://doi.org/10.1371/journal.pcbi.1009819
Short summary
The paper "Ten Simple Rules for Initial Data Analysis" outlines a framework for researchers to analyze data responsibly. Initial data analysis (IDA) is the first step to check if the observed data corresponds to expectations. The phases of IDA include metadata setup, data cleaning, data screening, initial data reporting, refining and updating the research analysis plan, and documenting and reporting IDA. Researchers often do not perform IDA in a systematic way or mix IDA activities with subsequent data analysis tasks. Disciplined and systematic IDA practice can provide researchers with the necessary context about data properties and structures to avoid pitfalls. The paper provides 10 rules for adopting IDA in practice, including developing an a priori IDA plan, separating IDA from final data analysis, and publishing relevant research materials including metadata, code, and IDA reports.
The article introduces initial data analysis (IDA) as a framework for working with data responsibly. IDA has six phases: metadata setup, data cleaning, data screening, initial data reporting, refining and updating the research analysis plan, and documenting and reporting IDA. While often mistaken for exploratory data analysis (EDA), IDA aims to ensure the transparency and integrity of the preconditions for conducting appropriate statistical analyses to answer predefined research questions. The authors developed ten rules to explain IDA, applicable to all researchers who analyze data; they caution against underestimating the challenge of effective IDA but also note the return on this investment. Following these rules can help ensure transparency and integrity in data analysis and research outputs, avoid pitfalls, and enable reliable reuse of data in future research.
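To make the data-screening phase concrete, here is a minimal sketch in Python, assuming a hypothetical dataset, variable names, and expected ranges; the paper prescribes rules, not code.

```python
# Minimal IDA data-screening sketch (illustrative; the file name, variables,
# and expected ranges are hypothetical assumptions, not from the paper).
import pandas as pd

df = pd.read_csv("study_data.csv")  # hypothetical study dataset

# Screen univariate distributions and missingness before any modeling.
print(df.describe(include="all"))
print(df.isna().mean().sort_values(ascending=False))  # fraction missing per column

# Check observed values against expectations recorded in the metadata/IDA plan.
expected_ranges = {"age": (18, 100), "sbp": (70, 250)}  # hypothetical variables
for col, (lo, hi) in expected_ranges.items():
    n_out = ((~df[col].between(lo, hi)) & df[col].notna()).sum()
    print(f"{col}: {n_out} non-missing values outside [{lo}, {hi}]")
```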
Boulesteix, A.-L., Groenwold, R. H., Abrahamowicz, M., Binder, H., Briel, M., Hornung, R., Morris, T. P., Rahnenführer, J., & Sauerbrei, W. (2020). Introduction to statistical simulations in health research. BMJ Open (Vol. 10, Issue 12, p. e039921). https://doi.org/10.1136/bmjopen-2020-039921
Short summary
In health research, statistical methods are frequently used to address a wide variety of research questions. For almost every analytical challenge, different methods are available. But how do we choose between different methods and judge whether the chosen method is appropriate for our specific study? The objective of this paper is to demonstrate that simulation studies, that is, experiments investigating synthetic data with known properties, are an invaluable tool for addressing these questions. We provide an introduction to simulation studies for researchers involved in statistical analyses who (1) may rely on simulation studies published in the statistical literature to choose their statistical methods and thus need to understand the criteria for assessing the validity and relevance of simulation results and their interpretation, and/or (2) need to understand the basic principles of designing statistical simulations in order to collaborate efficiently with more experienced colleagues or start learning to conduct their own simulations.
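To give a flavor of such a study, here is a minimal sketch: synthetic data with a known true value are used to compare the bias and RMSE of two estimators. The scenario (heavy-tailed data, mean versus median) is purely illustrative and not taken from the paper.

```python
# Minimal simulation study: compare two estimators on synthetic data whose
# true location parameter is known (illustrative scenario, not from the paper).
import numpy as np

rng = np.random.default_rng(42)
true_mu, n, n_sim = 1.0, 50, 5000

est_mean = np.empty(n_sim)
est_median = np.empty(n_sim)
for i in range(n_sim):
    # Data-generating mechanism with known properties: heavy-tailed
    # t-distributed noise shifted by the true location parameter.
    x = true_mu + rng.standard_t(df=3, size=n)
    est_mean[i] = x.mean()
    est_median[i] = np.median(x)

for name, est in [("mean", est_mean), ("median", est_median)]:
    bias = est.mean() - true_mu
    rmse = np.sqrt(np.mean((est - true_mu) ** 2))
    print(f"{name}: bias={bias:+.4f}, RMSE={rmse:.4f}")
```

Because the data-generating mechanism is known, bias and RMSE can be computed against the true value, which is exactly what makes simulation studies informative about method performance.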
Gail, M. H., Altman, D. G., Cadarette, S. M., Collins, G., Evans, S. J., Sekula, P., Williamson, E., & Woodward, M. (2019). Design choices for observational studies of the effect of exposure on disease incidence. BMJ Open (Vol. 9, Issue 12, p. e031031). https://doi.org/10.1136/bmjopen-2019-031031
Short summary
This paper compares different observational study designs that can be used to measure the association between an exposure and disease incidence. The paper discusses cohort studies, sub-samples from cohorts, and population-based or hospital-based case-control studies and compares their theoretical and practical advantages and disadvantages. If the study aim is to estimate not only associations but also absolute risks, cohort studies are required. Thus, choosing an appropriate design is crucial to achieving scientific objectives. The paper highlights how certain design features can reduce threats to study validity. The goal of the paper is to help readers choose the most appropriate study design, considering the study's scientific aims and practical constraints.
Lee, K. J., Tilling, K. M., Cornish, R. P., Little, R. J. A., Bell, M. L., Goetghebeur, E., Hogan, J. W., & Carpenter, J. R. (2021). Framework for the treatment and reporting of missing data in observational studies: The Treatment And Reporting of Missing data in Observational Studies framework. Journal of Clinical Epidemiology (Vol. 134, pp. 79–88). https://doi.org/10.1016/j.jclinepi.2021.01.008
Short summary
Missing data are common in medical research, and it is crucial to handle them appropriately to obtain valid inferences. This paper presents a framework for handling and reporting missing data in observational studies. The framework has three steps: (1) developing an analysis plan that includes details on how missing data will be addressed, (2) examining the data, checking that the pre-planned methods are appropriate, and conducting the analysis, and (3) reporting the results from all analyses. The authors stress the importance of reporting all analyses, including results from sensitivity analyses regarding the missingness mechanism. The framework is illustrated using a case study from the Avon Longitudinal Study of Parents and Children. It is hoped that using this framework will encourage researchers to think systematically about missing data and to report their effect on study results transparently, thereby increasing the reproducibility of research findings.
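A small simulation, not taken from the paper, shows why a pre-planned strategy matters: when missingness depends on an observed covariate (missing at random), the complete-case mean is biased, while an estimate that adjusts for the covariate is not.

```python
# Illustrative simulation (not from the paper): under missingness that depends
# on an observed covariate x, the complete-case mean of y is biased, while a
# regression-based estimate that conditions on x recovers the true mean.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = 2.0 + 1.0 * x + rng.normal(size=n)      # true mean of y is 2.0

# y is more likely to be missing when x is large (missing at random given x).
p_miss = 1 / (1 + np.exp(-(x - 0.5)))
y_obs = np.where(rng.uniform(size=n) < p_miss, np.nan, y)

print("complete-case mean:", np.nanmean(y_obs))        # biased downward
obs = ~np.isnan(y_obs)
slope, intercept = np.polyfit(x[obs], y_obs[obs], 1)   # fit y ~ x on observed rows
print("adjusted mean:", intercept + slope * x.mean())  # close to 2.0
```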
Sauerbrei, W., Abrahamowicz, M., Altman, D. G., le Cessie, S., & Carpenter, J. (2014). STRengthening Analytical Thinking for Observational Studies: the STRATOS initiative. Statistics in Medicine (Vol. 33, Issue 30, pp. 5413–5432). https://doi.org/10.1002/sim.6265
Short summary
The STRATOS initiative is a collaborative effort aimed at improving the validity and practical utility of observational medical research by providing accessible and accurate guidance in the design and analysis of such studies. Many applied researchers lack the guidance necessary to apply more sophisticated statistical methods, leading to serious weaknesses in study design and analysis. Even "standard" analyses reported in the medical literature are often flawed. To address this problem, the STRATOS initiative brings together experts in various areas of biostatistical research to develop guidance documents that are accessible to applied statisticians and other data analysts with varying levels of statistical education, experience, and interests. This article introduces the STRATOS initiative, outlines its main aims, and provides an overview of its planned approach and progress to date. The hope is that this initiative will improve the quality and reproducibility of observational studies and ultimately benefit patients and society.
The STRATOS initiative aims to provide accessible and accurate guidance for designing and analyzing observational studies. Observational medical research depends critically on good study design, high data quality, appropriate statistical methods, and accurate interpretation of results. Unfortunately, many analyses are conducted by researchers with a relatively weak statistical background and limited experience in using statistical methodology and software, resulting in flawed analyses that cast doubt on their results and conclusions. To address this issue, STRATOS is a large collaboration of experts in biostatistical research that develops guidance documents to help the research community keep up with recent methodological developments.
Van Calster, B., McLernon, D. J., van Smeden, M., Wynants, L., & Steyerberg, E. W. (2019). Calibration: the Achilles heel of predictive analytics. BMC Medicine (Vol. 17, Issue 1). https://doi.org/10.1186/s12916-019-1466-7
Short summary
Background: The assessment of calibration performance of risk prediction models based on regression or more flexible machine learning algorithms receives little attention.
Main text: Herein, we argue that this needs to change immediately because poorly calibrated algorithms can be misleading and potentially harmful for clinical decision-making. We summarize how to avoid poor calibration at algorithm development and how to assess calibration at algorithm validation, emphasizing balance between model complexity and the available sample size. At external validation, calibration curves require sufficiently large samples. Algorithm updating should be considered for appropriate support of clinical practice.
Conclusion: Efforts are required to avoid poor calibration when developing prediction models, to evaluate calibration when validating models, and to update models when indicated. The ultimate aim is to optimize the utility of predictive analytics for shared decision-making and patient counseling.
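One standard way to quantify miscalibration at validation is via the calibration intercept and slope, obtained by regressing the outcome on the predicted logit. The sketch below uses simulated validation data and a deliberately overfitted model, so all numbers are illustrative.

```python
# Minimal sketch: calibration intercept and slope on a simulated validation
# set. The "model" here is deliberately too extreme, mimicking overfitting.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 1.0 * x))))  # true logistic model

# Overfitted predictions: the linear predictor is twice as steep as the truth.
logit_pred = -1.0 + 2.0 * x

# Calibration slope: logistic regression of the outcome on the predicted logit
# (a slope below 1 indicates predictions that are too extreme).
slope_fit = sm.GLM(y, sm.add_constant(logit_pred),
                   family=sm.families.Binomial()).fit()
print("calibration slope:", slope_fit.params[1])       # roughly 0.5 here

# Calibration intercept (calibration-in-the-large): intercept-only model with
# the predicted logit as an offset.
int_fit = sm.GLM(y, np.ones((n, 1)), offset=logit_pred,
                 family=sm.families.Binomial()).fit()
print("calibration intercept:", int_fit.params[0])
```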
Wallace, M. (2020). Analysis in an imperfect world. Significance (Vol. 17, Issue 1, pp. 14–19). https://doi.org/10.1111/j.1740-9713.2020.01353.x
Short summary
In this article, published in the general-audience magazine Significance, Michael Wallace introduces the concept of measurement error in an accessible and engaging manner. The article explains what measurement error is, how it can arise, the problems it can cause, and some of the steps that can be taken to mitigate its effects.
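As a minimal illustration of one such problem (not code from the article): classical measurement error in a covariate attenuates a regression slope toward zero by the reliability ratio var(x) / (var(x) + var(error)).

```python
# Illustrative simulation of classical measurement error: regressing on a
# noisily measured covariate attenuates the slope by the reliability ratio.
import numpy as np

rng = np.random.default_rng(7)
n = 200_000
x = rng.normal(0.0, 1.0, n)              # true exposure, variance 1
y = 1.0 * x + rng.normal(0.0, 1.0, n)    # true slope is 1
w = x + rng.normal(0.0, 1.0, n)          # mismeasured exposure, error variance 1

print("slope using true x:", np.polyfit(x, y, 1)[0])   # ~1.0
print("slope using noisy w:", np.polyfit(w, y, 1)[0])  # ~0.5 = 1/(1+1), attenuated
```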
Wallisch, C., Bach, P., Hafermann, L., Klein, N., Sauerbrei, W., Steyerberg, E. W., Heinze, G., & Rauch, G. (2022). Review of guidance papers on regression modeling in statistical series of medical journals. PLOS ONE (Vol. 17, Issue 1, p. e0262918). https://doi.org/10.1371/journal.pone.0262918
Short summary
Although regression models play a central role in the analysis of medical research projects, many misconceptions about various aspects of modeling persist, leading to faulty analyses. Many medical publications do not adequately reflect recent advances in statistical methodology and regression modeling, indicating a problem of knowledge transfer from statistical research to application. In response, some medical journals have published statistical tutorials and papers aiming to address this issue. This review assesses the current level of knowledge on regression modeling contained in such papers. The review identified 57 topic-relevant articles from 23 series and investigated 44 predefined aspects of regression modeling. Most series covered general aspects of regression modeling, and logistic regression was the most frequently described regression type. Misconceptions or misleading recommendations were rare; however, several gaps were identified, such as addressing nonlinear effects of continuous predictors, model specification, and variable selection. The study recommends developing statistical guidance to better support medical researchers in performing and interpreting regression analyses.
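One of the identified gaps, nonlinear effects of continuous predictors, is commonly handled with splines. The sketch below, on simulated data with illustrative variable names, contrasts a linear fit with a natural cubic spline using patsy's cr() inside a statsmodels formula.

```python
# Minimal sketch: modeling a nonlinear effect of a continuous predictor with a
# natural cubic spline instead of assuming linearity (simulated data;
# variable names are illustrative).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 1000
df = pd.DataFrame({"age": rng.uniform(20, 80, n)})
# The true age effect is U-shaped, so a straight line misfits.
df["sbp"] = 110 + 0.01 * (df["age"] - 50) ** 2 + rng.normal(0, 5, n)

linear = smf.ols("sbp ~ age", data=df).fit()
spline = smf.ols("sbp ~ cr(age, df=4)", data=df).fit()  # natural cubic spline
print("linear AIC:", linear.aic)
print("spline AIC:", spline.aic)  # clearly lower under this data-generating model
```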
Wynants, L., van Smeden, M., McLernon, D. J., Timmerman, D., Steyerberg, E. W., & Van Calster, B. (2019). Three myths about risk thresholds for prediction models. BMC Medicine (Vol. 17, Issue 1). https://doi.org/10.1186/s12916-019-1425-3
Short summary
Clinical prediction models estimate a patient's risk of having a disease or experiencing an event. Defining a risk threshold for intervention is challenging and often done in an ad hoc way. Three common myths about risk thresholds can lead to inappropriate patient risk stratification: assuming that risk stratification is always better than a continuous risk estimate, assuming false positives and false negatives are equally costly, and assuming there is a universally optimal risk threshold. Presenting results for multiple risk thresholds can help. Using context-dependent risk thresholds can avoid inappropriate allocation (or non-allocation) of interventions and generate better clinical outcomes.
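A decision-analytic summary consistent with these points is net benefit, which at threshold t weights false positives by t/(1-t) and is naturally reported across multiple thresholds. The sketch below uses simulated risks and outcomes, so the numbers are purely illustrative.

```python
# Illustrative sketch (not from the paper): net benefit at several risk
# thresholds t, weighting false positives by t/(1-t) rather than treating
# false positives and false negatives as equally costly.
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
risk = rng.beta(2, 8, n)        # predicted risks
y = rng.binomial(1, risk)       # outcomes consistent with those risks

for t in [0.05, 0.10, 0.20, 0.30]:
    treat = risk >= t
    tp = np.sum(treat & (y == 1)) / n   # true positives as a fraction of all patients
    fp = np.sum(treat & (y == 0)) / n   # false positives as a fraction of all patients
    nb = tp - fp * t / (1 - t)
    print(f"threshold {t:.2f}: net benefit = {nb:+.4f}")
```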