The book Reproducibility: Principles, Problems, Practices, and Prospects, which contains a chapter co-authored by the late Jonathan Borwein and the present authors (Victoria Stodden and David H. Bailey), has won a 2017 PROSE Award (“Honorable Mention”) in the category “Textbook/Best in Physical Sciences and Mathematics.” These prizes are awarded annually in 53 categories by the Professional and Scholarly Publishing division of the Association of American Publishers.
This volume consists of 27 chapters, grouped into six sections, which collectively address questions of reproducibility across a broad range of scientific disciplines, including medicine, the physical sciences, the life sciences and the social sciences; there is even an article on reproducibility in literature studies. Statistical issues that arise in reproducibility studies are also discussed.
The book arose out of a growing consensus that reproducibility difficulties plague a surprisingly wide range of scientific disciplines. Here are just a few recent cases that have attracted widespread publicity:
- In 2017 the Reproducibility Project was able to replicate only two of five key studies in cancer research.
- In 2015, in a separate study by the Reproducibility Project, only 39 of 100 psychology studies could be replicated.
- In 2015, a study by the U.S. Federal Reserve was able to reproduce only 29 of 67 economics studies.
- In 2014, backtest overfitting emerged as a major problem in computational finance.
- In 2012, Amgen researchers reported that they were able to reproduce fewer than 10 of 53 cancer studies.
Our article in this volume, “Facilitating reproducibility in scientific computing: Principles and practice” (preprint available here), addresses reproducibility in the context of mathematical and scientific computing. We observe that, far from being immune to these difficulties, the mathematical and scientific computing field has some serious problems of its own:
- Very few published computational studies include full details of the algorithms, computer equipment, code and data used.
- The field has, if anything, lagged in developing an ethic of reproducibility, such as the need to carefully document all numerical experiments.
- In many cases, even the original authors of a computational study can no longer reproduce their published results, because the actual codes used are no longer available or have been changed.
- Few studies save their code and data on permanent, publicly accessible data repositories.
- Many studies employ questionable statistical methods or metrics.
- Many studies employ questionable methods to report performance.
- Numerical reproducibility has emerged as a significant issue, particularly in very large-scale computations that greatly magnify sensitivity to numerical round-off error.
Some may be disturbed by these developments, but the present authors regard efforts such as the publication of this volume as a good sign. We see it as evidence that many fields, from mathematical computing to cancer research and climate modeling, are recognizing the need for improvement and are taking positive steps to correct these problems.
As one example, the American Association for the Advancement of Science recently adopted new guidelines for papers submitted to its flagship journals, including Science. Under these guidelines, authors are required to provide more background information on their work, and papers are subjected to a review of their statistical methods in addition to the usual discipline-specific review.
More than anything else, though, such changes may well promote a fundamental culture change in the way science is done. That will be progress indeed.