A key pillar of “the scientific method” is reproducibility, one way to verify another scientist’s experimental claims. If the experiment and its results can be reproduced, the validity of the work is considerably strengthened.
But scientific reproducibility is not as common or as easy as many non-scientists think. In a recent study of landmark papers in cancer research, for example, only 11 percent of the studies could be reproduced.
In another recent case, a graduate student failed to reproduce the results of a widely cited economic-policy paper – a failure which led to the exposure of significant, but unintentional, errors.
Hoping to quantify just what it takes to reproduce a scientific paper, researchers from three institutions conducted a study of a computational biology paper that analyzed tuberculosis-drug targets.
The collaborators were Philip Bourne, professor of pharmacology at the Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California San Diego, the principal investigator of the tuberculosis study and co-author of the paper; Daniel Garijo, a doctoral student from the Universidad Politecnica of Madrid; and Yolanda Gil, professor of computer science at the University of Southern California. Together they set out to quantify “the difficulties of reproducibility” – and to suggest a possible solution.
Writing in the journal PLOS ONE, Gil and Garijo reported that they had to spend “significant time” reviewing materials from Bourne’s lab, and talking to previous lab members, to satisfactorily reconstruct the computational experiments of the original paper.
“We estimated the overall time to reproduce the method at 280 hours for a novice with minimal expertise in bioinformatics,” said Garijo, “either because computer scripts were not available, or there were assumptions in the described methods that would not be obvious to a non-expert.”
Failure to reproduce a study is rarely the result of fraud, said Bourne, but “mostly lack of a complete record.” In this case, he said, “it was not that the work could not be reproduced; the problem was that it took so much time – something all new graduate students in the lab can verify as they pick up previous students’ work.”
In this day and age, said Bourne, “we should really be doing better. It’s unfortunate to say this about my own work – but how many scientists could claim to be doing better?”
One way scientists might do better, said Gil, is to do what she and Garijo did. “As part of the reconstructive work,” she said, “we encoded the computational experiment in a semantic workflow, shared as a web object with annotations of its meanings.”
These workflow systems are now reaching such a level of maturity, say the researchers, that they’re likely to be adopted more broadly. “This should greatly facilitate reproducibility,” their report asserts.
Journals and their publishers can also encourage improved reproducibility by insisting that workflows, data, and software be part of the submission-and-review process, the authors say.
Finally, they note, better reproducibility may eventually be mandated. They cite a recent administration memorandum asking all agencies to develop policies to make results of all federally funded research broadly available to scientists, industry, and the public.