A Post By: Gary Ernest Davis
Roger Peng is an Associate Professor in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health. Together with colleagues he has been a strong and articulate voice for improving the reproducibility of scientific studies.
A basic issue is: can you trust the results of a data analysis? If someone else took the same data, and used appropriate analytic techniques, would they come to a consistent conclusion?
As you might imagine, this matters not only for scientific studies but also for business, health informatics, and indeed any area where decision making depends on the analysis of data. These days, that means practically any sensible enterprise we can imagine!
An empirical educational issue is that, as more people gain training and skills in data analysis, we need to identify, through experimentation, data analytic techniques and strategies that yield reproducible and replicable results in the hands of basic to intermediate data analysts.
By replicability we mean the likelihood that an independent study aimed at the same question will give a result that is consistent with the original study, and by reproducibility we mean the capacity to re-analyze given data to obtain a consistent result.
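To make the distinction concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything taken from Peng's articles: the simulated data, the function names `draw_sample`, `analyze` and `intervals_overlap`, and the choice of "overlapping 95% intervals" as a stand-in for a "consistent" result.

```python
import random
import statistics


def draw_sample(seed, n=200, true_mean=5.0, sd=2.0):
    """Simulate one study's data (hypothetical: normal measurements)."""
    rng = random.Random(seed)
    return [rng.gauss(true_mean, sd) for _ in range(n)]


def analyze(data):
    """Return the sample mean and an approximate 95% confidence interval."""
    mean = statistics.fmean(data)
    se = statistics.stdev(data) / len(data) ** 0.5
    return mean, (mean - 1.96 * se, mean + 1.96 * se)


def intervals_overlap(a, b):
    """One crude notion of 'consistent': the two intervals share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Reproducibility: re-analyze the SAME data and compare the results.
original_data = draw_sample(seed=1)
original = analyze(original_data)
reanalysis = analyze(list(original_data))
reproduced = original == reanalysis

# Replicability: an INDEPENDENT study (new data) aimed at the same question.
replication = analyze(draw_sample(seed=2))
replicated = intervals_overlap(original[1], replication[1])

print("reproduced:", reproduced)
print("replicated:", replicated)
```

The re-analysis matches exactly because the analysis is deterministic and the data are identical. The replication can only be expected to agree up to sampling variation, which is why it needs a looser criterion for consistency.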
Evidence-based data analysis (EBDA) is an empirically based approach to identifying what works in increasing replicability and reproducibility, and then recommending, and expecting, that those methods, techniques and tools be used in data analysis.
- Peng, R. (2015). The reproducibility crisis in science: A statistical counterattack. Significance, 12(3), 30-32.
Other articles by Roger and colleagues:
- Fisher, A., Anderson, G. B., Peng, R., & Leek, J. (2014). A randomized trial in a massive online open course shows people don’t know what a statistically significant relationship looks like, but they can learn. PeerJ, 2, e589.
- Leek, J. T., & Peng, R. D. (2015). Opinion: Reproducible research can still be wrong: Adopting a prevention approach. Proceedings of the National Academy of Sciences, 112(6), 1645-1646.
- Leek, J. T., & Peng, R. D. (2015). Statistics: P values are just the tip of the iceberg. Nature, 520(7549), 612.
- Peng, R. D. (2009). Reproducible research and Biostatistics. Biostatistics, 10(3), 405-408.