The reproducibility crisis in Psychology rumbles on. For the uninitiated, this is the general brouhaha we’re having over how reliable published psychological research is. I wrote a piece on this in 2013, which now sounds a little complacent, and unnecessarily focussed on just one area of psychology, given the extent of the problems since uncovered in the way research is manufactured (or maybe not, see below). Anyway, in the last week or so there have been three interesting developments:
Michael Inzlicht blogged his ruminations on the state of the field of social psychology, and they’re not rosy: “We erred, and we erred badly“, he writes. It is a profound testament to the depth of the current concerns about the reliability of psychology when such a senior scientist begins to doubt the reality of some of the phenomena he has built his career investigating.
As someone who has been doing research for nearly twenty years, I now can’t help but wonder if the topics I chose to study are in fact real and robust. Have I been chasing puffs of smoke for all these years?
But not everyone is worried. A team of Harvard A-listers, including Timothy Wilson and Daniel Gilbert, have released a press release announcing a commentary on the “Reproducibility Project: Psychology”. This was an attempt to estimate the reliability of a large sample of phenomena from the psychology literature (short introduction in Nature here). The paper from this project was picked as one of the most important of 2015 by the journal Science.
The project is a huge effort, and one open to multiple interpretations. The Harvard team’s press release is headlined “No evidence of a replicability crisis in psychological science” and claims that the “reproducibility of psychological science is indistinguishable from 100%”, as well as calling for effort to be put into repairing the damage done to the reputation of psychological research.
I’d link to the press release, but it looks like, between my learning of it yesterday and coming to write about it today, the material has been pulled from the internet. The announced commentary was due to be released on March the 4th, so we wait with bated breath for the good news about why we don’t need to worry about the reliability of psychology research. Come on boys, we need some good news.
UPDATE 3rd March: The website is back! No Evidence for a Replicability Crisis in Psychological Science. Commentary here, and response
…But whatever you do, optimally weight evidence
Speaking of the Reproducibility Project, Alexander Etz produced a great Bayesian reanalysis of the data from that project (possible because it is all open access, via the Open Science Framework). This take on the project is a great example of how open science allows people to build more easily on your results, as well as being a vital complement to the original report – not least because it stops you naively accepting any simple statistical summary of what the reproducibility project ‘means’ (e.g. “30% of studies do not replicate”, etc.). Etz and Joachim Vandekerckhove have now upgraded the analysis to a paper, which is available (open access, natch) in PLoS ONE: “A Bayesian Perspective on the Reproducibility Project: Psychology“. And their interpretation of the reliability of psychology, as informed by the Reproducibility Project?
Overall, 75% of studies gave qualitatively similar results in terms of the amount of evidence provided. However, the evidence was often weak … The majority of the studies (64%) did not provide strong evidence for either the null or the alternative hypothesis in either the original or the replication … We conclude that the apparent failure of the Reproducibility Project to replicate many target effects can be adequately explained by overestimation of effect sizes (or overestimation of evidence against the null hypothesis) due to small sample sizes and publication bias in the psychological literature.
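The “weak evidence” point is worth making concrete. What follows is not the default Bayes factor analysis Etz and Vandekerckhove actually ran – just a minimal sketch, assuming a one-sample z-test with known unit variance and a standard-normal prior on the effect size under H1, with numbers invented for illustration. Both marginal likelihoods are analytic, so no special software is needed:

```python
from math import sqrt, pi, exp

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))

def bf10_z_test(xbar, n, prior_sd=1.0):
    """Bayes factor for H1 over H0, one-sample z-test with known unit variance.

    H0: mu = 0, so the sample mean is distributed N(0, 1/n).
    H1: mu ~ N(0, prior_sd^2), so the marginal distribution of the sample
        mean is N(0, prior_sd^2 + 1/n). The Bayes factor is the ratio of
        these two densities at the observed mean.
    """
    m0 = normal_pdf(xbar, 0.0, sqrt(1.0 / n))               # evidence for H0
    m1 = normal_pdf(xbar, 0.0, sqrt(prior_sd**2 + 1.0 / n)) # evidence for H1
    return m1 / m0

# An observed mean of 0.36 with n = 30 is "significant" (z ≈ 1.97, p ≈ .049),
# yet the Bayes factor is close to 1: barely any evidence either way.
print(round(bf10_z_test(xbar=0.36, n=30), 2))
```

A just-significant result here yields a Bayes factor of only about 1.2, which is the flavour of the 64% figure in the quote: a p-value that clears .05 can coexist with data that discriminate hardly at all between the null and the alternative.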
One thought on “3 salvoes in the reproducibility crisis”
“The majority of the studies (64%) did not provide strong evidence for either the null or the alternative hypothesis…”
GS: No kidding! Indeed, they provide no quantitative evidence whatsoever concerning the truth or falsity of the null since a p-value doesn’t give you p(truth of null|data) but, rather, p(data|truth of null)!
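The commenter’s distinction between p(data|null) and p(null|data) can be shown with a quick simulation. This is only a toy model with invented numbers: suppose just 20% of tested effects in a hypothetical literature are real, studies are small (n = 30, known unit variance), and everyone uses p < .05. Among the “significant” results, we can then count how often the null was in fact true:

```python
import random
from math import erfc, sqrt

random.seed(1)

def two_sided_p(z):
    """Two-sided p-value for a z statistic: P(|Z| >= |z|) under the null."""
    return erfc(abs(z) / sqrt(2))

# Invented, illustrative numbers: 20% of effects are real (mu = 0.2).
n_studies, n, mu_real, prop_real = 100_000, 30, 0.2, 0.2

sig_null = sig_real = 0
for _ in range(n_studies):
    real = random.random() < prop_real
    mu = mu_real if real else 0.0
    xbar = random.gauss(mu, 1 / sqrt(n))    # sampling dist. of the mean
    if two_sided_p(xbar * sqrt(n)) < 0.05:  # a "significant" study
        if real:
            sig_real += 1
        else:
            sig_null += 1

# P(null true | p < .05) -- the quantity a p-value does NOT give you
fdr = sig_null / (sig_null + sig_real)
print(round(fdr, 2))
```

Under these assumptions roughly half of the significant findings come from true nulls, even though every one of them had p < .05. The error rate conditional on the null (5%) and the probability of the null conditional on significance (~50%) are very different numbers, which is exactly the commenter’s point.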