Voodoo II: this time it isn’t personal

More analysis problems with brain scanning research have come to light in a new study just released in Nature Neuroscience and expertly covered by the BPS Research Digest. It demonstrates that the common practice of using the same data set to identify an area of interest and then home in on this area to test further ideas can lead to misleading results.

This usually occurs when brain activation is compared between two conditions where participants are doing different tasks. A whole brain analysis looks for statistically significant differences at every point in the brain.

It’s very thorough, but because there is so much data, and the data also contains a large amount of noise, it’s hard to find areas which you can confidently say are more active in one condition than the other.

An alternative approach is to only look at activation in one area of the brain, perhaps an area where it is most likely to occur based on what we already know about how the brain works. This is called region of interest analysis (often done with the wonderfully named ‘MarsBaR’ tool) and because the data set is much smaller, it is more likely to find a reliable difference.

However, some studies do a whole brain analysis to find likely areas, and then home in using region of interest tools to examine them ‘more closely’. This ‘magnifying glass’ metaphor seems intuitive, but because you’re using the same data set to create and test hypotheses, it can be problematic.

It’s like shooting arrows randomly into a wall and then drawing a target around the ones which landed together. Someone looking at the wall afterwards might think the archer was a good shot, but this impression is caused by the after-the-event painting of the target, and the same problem could affect these brain imaging studies.
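
For the curious, the archery problem can be sketched in a few lines of Python. This is only a toy illustration and not the simulations from the paper: the ‘brain’ here is pure random noise, and the function name, voxel count, subject count and region size are all made-up assumptions. It picks the most ‘active’ voxels and then measures their effect either on the same data (double dipping) or on subjects who were held back from the selection step.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def simulate_double_dipping(n_voxels=2000, n_subjects=20, n_select=20, seed=0):
    """Toy simulation on pure noise (all sizes are illustrative assumptions).

    Every voxel holds one condition-difference score per subject, drawn from
    a standard normal distribution, so the true effect is zero everywhere.
    Returns (double_dipped_effect, independent_effect).
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_subjects)]
            for _ in range(n_voxels)]
    half = n_subjects // 2

    # Double dipping: choose the region using all subjects, then measure
    # the effect in that region using the very same subjects.
    ranked = sorted(range(n_voxels), key=lambda v: mean(data[v]), reverse=True)
    roi = ranked[:n_select]
    dipped = mean([mean(data[v]) for v in roi])

    # Independent version: choose the region using the first half of the
    # subjects, then measure the effect only in the second half.
    ranked_half = sorted(range(n_voxels),
                         key=lambda v: mean(data[v][:half]), reverse=True)
    roi_independent = ranked_half[:n_select]
    independent = mean([mean(data[v][half:]) for v in roi_independent])

    return dipped, independent


if __name__ == "__main__":
    dipped, independent = simulate_double_dipping()
    print("effect measured on the selection data:", round(dipped, 3))
    print("effect measured on held-out data:", round(independent, 3))
```

Because nothing real is going on in the simulated data, the held-out estimate should hover around zero, while the double-dipped estimate comes out clearly positive simply because the target was painted after the arrows landed.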

After the recent furore over the ‘voodoo correlations’ study, this new study is markedly more measured in its language and doesn’t list individual offenders.

Indeed, the ‘Voodoo Correlations in Social Neuroscience’ paper was actually retitled on publication to ‘Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition’, presumably to avoid stirring the pot any further.

However, this new study takes a similar tack, demonstrating through several careful simulations that ‘double dipping’ a data set is likely to distort the results just due to statistical problems.

From the BPS Research Digest:

Nikolaus Kriegeskorte and colleagues analysed all the fMRI studies published in Nature, Science, Nature Neuroscience, Neuron and Journal of Neuroscience, in 2008, and found that 42 per cent of these 134 papers were guilty of performing at least one non-independent selective analysis – what Kriegeskorte’s team dub “double dipping”.

This is the procedure, also condemned by the Voodoo paper, in which researchers first perform an all-over analysis to find a brain region(s) that responds to the condition of interest, before going on to test their hypothesis on data collected in just that brain region. The cardinal sin is that the same data are used in both stages.

A similarly flawed approach can be seen in brain imaging studies that claim to be able to discern a presented stimulus from patterns of activity recorded in a given brain area. These are the kind of studies that lead to “mind reading” headlines in the popular press. In this case, the alleged statistical crime is to use the same data for the training phase of pattern extraction and the subsequent hypothesis testing phase.
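
The ‘mind reading’ version of the problem can be sketched the same way. Again, this is a toy under stated assumptions rather than anything from the paper: the activity patterns are random noise with arbitrary labels, the classifier is a simple nearest-centroid rule chosen for brevity, and the names and sizes are invented for illustration. The classifier is scored once on the trials it was trained on and once on fresh trials.

```python
import random


def centroid(rows):
    """Average pattern across a list of equal-length trials."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def accuracy(trials, labels, centroids):
    """Fraction of trials whose nearest centroid matches their label."""
    correct = 0
    for pattern, label in zip(trials, labels):
        guess = min(centroids,
                    key=lambda c: squared_distance(pattern, centroids[c]))
        correct += guess == label
    return correct / len(trials)


def noise_trials(rng, n_trials, n_voxels):
    """Random patterns with alternating, meaningless labels (0 and 1)."""
    trials = [[rng.gauss(0, 1) for _ in range(n_voxels)]
              for _ in range(n_trials)]
    labels = [i % 2 for i in range(n_trials)]
    return trials, labels


def simulate_decoding(n_voxels=200, n_train=40, n_test=200, seed=1):
    """Return (accuracy on training trials, accuracy on fresh trials)."""
    rng = random.Random(seed)
    train_x, train_y = noise_trials(rng, n_train, n_voxels)
    test_x, test_y = noise_trials(rng, n_test, n_voxels)
    centroids = {
        c: centroid([x for x, y in zip(train_x, train_y) if y == c])
        for c in (0, 1)
    }
    return (accuracy(train_x, train_y, centroids),
            accuracy(test_x, test_y, centroids))


if __name__ == "__main__":
    same_data, fresh_data = simulate_decoding()
    print("accuracy on the training trials:", same_data)
    print("accuracy on fresh trials:", fresh_data)
```

With many voxels and few trials, the classifier should look impressively accurate on the data it was trained on and fall back to roughly coin-flip performance on fresh trials, which is the reason the training and testing phases need separate data.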

Link to BPS Research Digest on the fMRI analysis problems.
Link to PubMed entry for study.

4 thoughts on “Voodoo II: this time it isn’t personal”

Interesting stuff, but from the point of view of a survey researcher/statistician it seems surprising that this is only now being raised as an issue.
Naive post-hoc tests, “fishing expeditions” and using the same data for generating/confirming hypotheses are all well-established problems surely?

It’s very sad to me that “pop psychology” has taken such a foothold in American society that there are now literally storefront, strip mall locations where you can get MRIs performed, then read by–get this–social workers, who will then inform anxious parents whether their child is learning disabled, hyperactive, has a learning disorder, etc. This is why the information in the above article bears repeating, and bears repeating often, and loudly. As the article points out, the selective “reading” of such MRIs by unlicensed (in the fields of neurology/radiology) “professionals” is nothing more than shooting multiple arrows against a wall, drawing a circle around the arrows which land closest together, then calling that a “meaningful result”. And, unfortunately, too many well meaning parents, and too many uninformed patients, simply don’t know enough to tell the difference between meaningful results and “voodoo radiology”.

Oops..I mentioned the phrase “learning disabled” twice in one sentence. Maybe *I’m* learning disabled :-).

@Stephen:
I think this is an error that people will keep making, because it consistently biases results in favour of finding an effect. We are all more inclined to believe convincing results, sadly. 🙂
The Kriegeskorte paper goes some way beyond this basic point, however. For me, the most interesting point made was that using ‘orthogonal’ contrasts in place of independent data is inherently risky, since orthogonal contrast vectors do not ensure that the contrast as such is orthogonal (see in particular the supplement for the NN paper). There are some very nice simulations to back this up.