2025 DSS Bootcamp
The authors informed the journal that the merge of lab results and other survey data used in the paper resulted in an error regarding the identification codes. Results of the analyses were based on the data set in which this error occurred. Further analyses established the results reported in this manuscript and interpretation of the data are not correct.
Original conclusion: Lower levels of CSF IL-6 were associated with current depression and with future depression […].
Revised conclusion: Higher levels of CSF IL-6 and IL-8 were associated with current depression […].
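An identification-code mismatch like the one described in this retraction can often be caught by checking keys before merging. The following is a minimal sketch in plain Python, not the authors' workflow: the function `checked_merge` and the fields `id`, `il6`, and `depressed` are hypothetical names chosen for illustration.

```python
from collections import Counter


def checked_merge(left, right, key="id"):
    """Merge two lists of dicts one-to-one on `key`, failing loudly on bad keys."""
    # Refuse to merge if either table repeats an ID.
    for name, rows in (("left", left), ("right", right)):
        counts = Counter(row[key] for row in rows)
        dupes = sorted(k for k, n in counts.items() if n > 1)
        if dupes:
            raise ValueError(f"duplicate {key} values in {name}: {dupes}")

    # Refuse to merge if an ID appears in only one of the two tables.
    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}
    unmatched = sorted(left_keys ^ right_keys)
    if unmatched:
        raise ValueError(f"{key} values present in only one table: {unmatched}")

    # Join by key, never by row position, so row order cannot misalign records.
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Hypothetical lab results and survey responses for three participants.
lab = [{"id": 1, "il6": 2.1}, {"id": 2, "il6": 3.4}, {"id": 3, "il6": 1.8}]
survey = [
    {"id": 3, "depressed": False},
    {"id": 1, "depressed": True},
    {"id": 2, "depressed": False},
]

merged = checked_merge(lab, survey)
```

Because the join is done by key and the checks run first, a missing or duplicated ID stops the analysis instead of silently pairing one participant's lab values with another's survey answers. In pandas, `DataFrame.merge` offers `validate` and `indicator` arguments for similar checks.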
From the authors of Low Dose Lidocaine for Refractory Seizures in Preterm Neonates:
“The article has been retracted at the request of the authors. After carefully re-examining the data presented in the article, they identified that data of two different hospitals got terribly mixed. The published results cannot be reproduced in accordance with scientific and clinical correctness.”
The journal Heart has retracted a 2012 meta-analysis after learning that two of the six studies included in the review contained duplicated data. Those studies, it so happens, were conducted by one of the co-authors.
The Committee considered that without sight of the raw data on which the two papers containing the duplicate data were based, their reliability could not be substantiated. Following inquiries, it turns out that the raw data are no longer available having been lost as a result of computer failure.
Anil Potti was a rising star cancer researcher here at Duke in the School of Medicine. His lab’s research focused on precision oncology (genomic signatures that could be used to predict patient response to chemotherapy).
The work was published in high-profile journals and received significant funding and attention; clinical trials based on some of these results were also conducted.
Early whistleblower complaints from a medical student were ignored.
Extensive work by statisticians (Baggerly and Coombes) at MD Anderson Cancer Center showed strong evidence that data were fabricated, manipulated, and misrepresented.
Potti's downfall was triggered in part by falsifications on his CV (he claimed to have been a Rhodes Scholar).
Ultimately, 11 papers were retracted.
Shu, Mazar, Gino, Ariely, & Bazerman (2012), “Signing at the beginning makes ethics salient….” (PNAS) was a very influential paper claiming that signing a declaration of honesty at the beginning of a survey reduced dishonest self-reports. It involved multiple independent studies and was widely cited in the literature.
The paper was retracted in 2021 due to concerns about data fabrication and manipulation in one of the studies.
Subsequently, additional issues were discovered in a different study by a different author within the same paper.
We need an environment where:
data, analysis, and results are tightly connected, or better yet, inseparable,
reproducibility is built in,
all procedures are human readable and understandable.
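One way to picture data, analysis, and results being inseparable is to have the analysis return a fingerprint of its input alongside every number it reports. This is only a sketch under assumptions: the function `run_analysis`, the JSON input format, and the report fields are hypothetical, not part of any tool named in these slides.

```python
import hashlib
import json
import statistics


def run_analysis(raw_bytes):
    """Compute a summary and return it together with a fingerprint of the input."""
    values = json.loads(raw_bytes)
    return {
        # SHA-256 of the exact bytes analyzed: any change to the data changes this.
        "data_sha256": hashlib.sha256(raw_bytes).hexdigest(),
        "n": len(values),
        "mean": statistics.mean(values),
    }


# Hypothetical raw data; in practice these bytes would be read from a file.
raw = b"[2.0, 4.0, 6.0]"
report = run_analysis(raw)
print(json.dumps(report, indent=2))
```

If the data file is later edited or swapped, rerunning the script yields a different `data_sha256`, so a reported result can always be traced back to the exact input that produced it.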
“Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.”
“The practitioner of literate programming […] strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other.”
These ideas have been around for years!
Tools for putting them into practice have also been around.
But those tools have never been as accessible as they are today.
Scriptability: R / Python
Literate Programming: RMarkdown / Jupyter / Quarto
Version Control: Git / GitHub