Responsible research
and reproducibility

2025 DSS Bootcamp

Colin Rundel

Some case studies

Bad spreadsheet merge kills depression paper, quick fix resurrects it

  • The authors informed the journal that the merge of lab results and other survey data used in the paper resulted in an error regarding the identification codes, and that the results of the analyses were based on the data set in which this error occurred. Further analyses established that the results reported in the manuscript and the interpretation of the data are not correct.

  • Original conclusion: Lower levels of CSF IL-6 were associated with current depression and with future depression […].

  • Revised conclusion: Higher levels of CSF IL-6 and IL-8 were associated with current depression […].
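The failure here was a join on identification codes that silently went wrong. As an illustration (the column names and toy values below are invented, not from the actual study), pandas' `merge` can be made to fail loudly on such problems via its `validate` and `indicator` arguments:

```python
import pandas as pd

# Hypothetical stand-ins for the lab results and the survey data
lab = pd.DataFrame({"id": [1, 2, 3], "il6": [0.4, 1.2, 0.9]})
survey = pd.DataFrame({"id": [1, 2, 4], "depressed": [True, False, True]})

# validate= raises MergeError if the id keys are not one-to-one,
# and indicator= records which table(s) each row came from.
merged = lab.merge(survey, on="id", how="outer",
                   validate="one_to_one", indicator=True)

# Inspect rows that failed to match before running any analysis.
unmatched = merged[merged["_merge"] != "both"]
print(unmatched)
```

A check like this, run before the analysis, would have surfaced the mismatched identification codes instead of letting them flip the study's conclusion.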

Seizure study retracted after authors realize data got “terribly mixed”

  • From the authors of “Low Dose Lidocaine for Refractory Seizures in Preterm Neonates”:

  • “The article has been retracted at the request of the authors. After carefully re-examining the data presented in the article, they identified that data of two different hospitals got terribly mixed. The published results cannot be reproduced in accordance with scientific and clinical correctness.”

Heart pulls sodium meta-analysis over duplicated, and now missing, data

The journal Heart has retracted a 2012 meta-analysis after learning that two of the six studies included in the review contained duplicated data. Those studies, it so happens, were conducted by one of the co-authors.

The Committee considered that without sight of the raw data on which the two papers containing the duplicate data were based, their reliability could not be substantiated. Following inquiries, it turns out that the raw data are no longer available, having been lost as a result of computer failure.

Potti case

Anil Potti was a rising star cancer researcher here at Duke in the School of Medicine. His lab’s research focused on precision oncology (genomic signatures that could be used to predict patient response to chemotherapy).

  • Work was published in high-profile journals and received significant funding and attention; clinical trials using some of these results were also conducted.

  • Early whistleblower complaints from a medical student were ignored

  • Extensive work by statisticians (Baggerly and Coombes) at MD Anderson Cancer Center showed strong evidence that data were fabricated, manipulated, and misrepresented.

  • Downfall was triggered (in part) by falsifications on Potti’s CV (he claimed to have been a Rhodes Scholar)

  • Ultimately 11 papers were retracted

Clusterfake

Shu, Mazar, Gino, Ariely, & Bazerman (2012), “Signing at the beginning makes ethics salient…,” published in PNAS, was a highly influential paper claiming that signing a declaration of honesty at the beginning of a survey reduced dishonest self-reports. It comprised multiple independent studies and was widely cited in the literature.

The paper was retracted in 2021 due to concerns about data fabrication and manipulation in one of the studies.

Subsequently, additional issues were discovered in a different study within the same paper, this one contributed by a different author.

Practice

Reproducibility in practice

  • Are the tables and figures reproducible from the code and data?

  • Does the code actually do what you think it does?

  • In addition to what was done, is it clear why it was done? (e.g., how were parameter settings chosen?)

  • Can the code be used for other data, especially future updates to the current data?

  • Can you extend the code to do other things?
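One concrete habit behind several of these questions is to expose parameter choices as documented arguments rather than burying them as magic numbers in the script. A minimal sketch (the function, data, and cutoff below are invented for illustration):

```python
import statistics

def flag_outliers(values, z_cutoff=1.5):
    """Flag values more than z_cutoff standard deviations from the mean.

    The cutoff is an explicit, documented argument: record *why* a
    particular value was chosen whenever the default is overridden.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [abs(v - mean) > z_cutoff * sd for v in values]

# Reusable on future updates to the data: just pass the new values.
result = flag_outliers([1.0, 1.1, 0.9, 1.2, 8.0])
print(result)  # only the 8.0 is flagged
```

Because the choice lives in the function signature and its docstring, both the "what" and the "why" survive when the code is rerun on new data or extended.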

Reproducibility in science

Ambitious goal

We need an environment where:

  • data, analysis, and results are tightly connected, or better yet, inseparable,

  • reproducibility is built in,

    • the original data remains untouched
    • all data manipulations and analyses are inherently documented

  • all procedures are human readable and understandable.

Donald Knuth, “Literate Programming” (1984)

“Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.”

“The practitioner of literate programming […] strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other.”

  • These ideas have been around for years!

  • Tools for putting them into practice have also been around.

  • But they have never been as accessible as they are with today’s tools.
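These ideas are easiest to see in a concrete artifact. A minimal Quarto document, sketched here for illustration (the chunk contents are invented), interleaves prose explaining the *why* with code doing the *what*:

````markdown
---
title: "Example analysis"
format: html
---

We summarize the measurements with a trimmed mean because of a
handful of known sensor glitches (the *why* lives next to the code).

```{r}
x <- c(1.0, 1.1, 0.9, 1.2, 8.0)
mean(x, trim = 0.2)
```
````

Rendering the document re-executes every chunk, so the narrative, the code, and the results cannot silently drift apart.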

Reproducible data analysis stack

  • Scriptability: R / Python

  • Literate Programming: RMarkdown / Jupyter / Quarto

  • Version Control: Git / GitHub
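Version control ties the other layers together by recording exactly which code produced which results. A minimal sketch of putting an analysis under Git (the directory, file names, and messages are invented):

```shell
# Work in a throwaway directory so nothing real is touched
cd "$(mktemp -d)"
git init -q demo && cd demo

# Track the analysis script, not its generated outputs
echo 'print("hello")' > analysis.py
echo '*.html' > .gitignore

git add analysis.py .gitignore
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Add initial analysis script"

git log --oneline   # one commit recording exactly what ran
```

Every later change to the analysis becomes another commit, so any past result can be traced back to the precise version of the code that produced it.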