Tuesday, May 26, 2026 · 9:00 AM – 10:00 AM
Reproducibility Rounds
The Stanford CTSA Program on Research Rigor & Reproducibility (SPORR), in collaboration with Columbia, Duke, Harvard, and Indiana, cordially invites you to the Reproducibility Rounds webinar "What Biomedicine Can Learn about Reproducibility from Social & Behavioral Research: The SCORE Project".
Speaker: Brian Nosek, PhD
Brian Nosek is the founder and Executive Director of the Center for Open Science (COS) and a professor at the University of Virginia. COS has long been a leader in creating infrastructure to foster open and reproducible science, as well as a pioneer in metascience scholarship, having conducted three major reproducibility projects: in Psychological Science (Science, 2015), Cancer Biology (eLife, 2021), and Social and Behavioral Science (SCORE, Nature, 2026).
Brian’s research interests are in understanding how people and systems produce values-misaligned behavior; developing, implementing, and evaluating solutions to align behavior with values; and improving research methods and culture to accelerate progress in science. For this work he has received honorary doctorates from the Universities of Ghent (2019) and Bristol (2022).
Abstract: SCORE, a collaboration of 865 researchers, has now been released as three papers in Nature, six preprints, and a large volume of data (https://cos.io/score/). SCORE examined the repeatability of findings from the social-behavioral sciences and tested whether human and automated methods could predict replicability. A representative subset of 600 claims was available for repeatability tests: reproductions (same data, same analysis), robustness tests (same data, different analyses), and replications (same question, different data). For reproducibility, we could obtain data for only 24% of the 600 papers. Of the 143 papers (551 claims) assessed, we precisely reproduced 54% and approximately reproduced 74%. We were much more likely to succeed when authors shared both data and code than when they shared only data or when we had to reconstruct the data from original sources. For robustness, 34% of reanalyses showed the same result within a narrow tolerance (±0.05 Cohen’s d), and 57% within a wider tolerance (±0.20). Considering only the statistical conclusion (whether p < .05), 74% of reanalyses reached the same conclusion, 24% observed no effect, and 2% observed an opposing effect. For replicability, we tested findings from 164 papers and successfully replicated 49% of them by the common statistical significance criterion. Original studies had an average effect size of r = 0.25, compared with r = 0.10 for replication studies. The best-performing human methods achieved about 75% accuracy in predicting replication outcomes. The talk will also discuss implications for researchers and institutions on ways to improve the credibility of research.
Event details are sourced from Stanford’s public events feed. Times shown in Pacific time.