Power analysis of a typical psychology experiment

Understanding statistical power is essential if you want to avoid wasting your time in psychology. The power of an experiment is its sensitivity – the likelihood that, if the effect tested for is real, your experiment will be able to detect it.

Statistical power is determined by the type of statistical test you are doing, the number of people you test and the effect size. The effect size is, in turn, determined by the reliability of the thing you are measuring, and how much it is pushed around by whatever you are manipulating.

Since it is a common test, I’ve been doing a power analysis for a two-sample (two-sided) t-test, for small, medium and large effects (as conventionally defined). The results should worry you.


This graph shows you how many people you need in each group for your test to have 80% power (a standard desirable level of power – meaning that if your effect is real you’ve an 80% chance of detecting it).

Things to note:

  • even for a large (0.8) effect you need close to 30 people (total n = 60) to have 80% power
  • for a medium effect (0.5) this is more like 70 people (total n = 140)
  • the required sample size increases dramatically as effect size drops
  • for small effects, the sample required for 80% is around 400 in each group (total n = 800).
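The figures above came from R (cribbed from Rob Kabacoff's page). As a rough cross-check, the same calculation can be sketched in Python using only the standard library – note this is a normal-approximation formula, an illustration rather than the original code, and the exact t-based answers run one or two participants higher per group:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample t-test
    at significance level alpha, desired power, and effect size d
    (normal approximation to the t distribution)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # 1.96 for alpha = .05
    z_beta = z(power)            # 0.84 for power = .80
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

for label, d in [("large", 0.8), ("medium", 0.5), ("small", 0.2)]:
    print(f"{label} effect (d = {d}): ~{n_per_group(d)} per group")
```

The formula makes the key relationship visible: required n scales with 1/d², which is why halving the effect size roughly quadruples the sample you need.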

What this means is that if you don’t have a large effect, studies with between-groups analysis and an n of less than 60 aren’t worth running. Even if you are studying a real phenomenon, you aren’t using a statistical lens with enough sensitivity to be able to tell. You’ll get to the end and won’t know whether the phenomenon you are looking for isn’t real or you just got unlucky with who you tested.

Implications for anyone planning an experiment:

  • Is your effect very strong? If so, you may rely on a smaller sample. (For illustration, the effect size of the male–female height difference is ~1.7 – large enough to detect with a small sample. But if your effect is this obvious, why do you need an experiment?)
  • You really should prefer within-sample analysis, whenever possible (power analysis of this left as an exercise)
  • You can get away with smaller samples if you make your measure more reliable, or if you make your manipulation more impactful. Both of these will increase your effect size – the first by narrowing the variance within each group, the second by increasing the distance between them.

Technical note: I did this cribbing code from Rob Kabacoff’s helpful page on power analysis. Code for the graph shown here is here. I use and recommend Rstudio.

Cross-posted from www.tomstafford.staff.shef.ac.uk where I irregularly blog things I think will be useful for undergraduate Psychology students.

Irregularities in Science

A paper in the high-profile journal Science has been alleged to be based on fraudulent data, with the PI calling for it to be retracted. The original paper purported to use survey data to show that people being asked about gay marriage changed their attitudes if they were asked the survey questions by someone who was gay themselves. That may still be true, but the work of a team that set out to replicate the original study seems to show that the data reported in that paper was never collected in the way described, and was at least partly fabricated.

The document containing these accusations is interesting for a number of reasons. It contains a detailed timeline showing how the authors were originally impressed with the study and set out to replicate it, gradually uncovering more and more elements that concerned them and led them to investigate how the original data was generated. The document also reports the exemplary way in which they shared their concerns with the authors of the original paper, and the way the senior author responded. The speed of all this is notable – the investigators only started work on this paper in January, and did most of the analysis substantiating their concerns this month.

As we examined the study’s data in planning our own studies, two features surprised us: voters’ survey responses exhibit much higher test-retest reliabilities than we have observed in any other panel survey data, and the response and reinterview rates of the panel survey were significantly higher than we expected. We set aside our doubts about the study and awaited the launch of our pilot extension to see if we could manage the same parameters. LaCour and Green were both responsive to requests for advice about design details when queried.

So on the one hand this is a triumph for open science and self-correction in scholarship. The irony is that the dishonesty that led to publication in a high-impact journal also attracted people with the desire and smarts to check whether what was reported holds up. But the tragedy is the circumstances that led the junior author of the original study, himself a graduate student at the time, to do what he did. No statement from him is available at this point, as far as I’m aware.

The original: When contact changes minds: An experiment on transmission of support for gay equality

The accusations and retraction request: Irregularities in LaCour (2014)

Sampling error’s more dangerous friend

As the UK election results roll in, one of the big shocks is the discrepancy between the pre-election polls and the results. All the pollsters agreed that it would be incredibly close, and they were all wrong. What gives?

Some essential psych 101 concepts come in useful here. Polls rely on sampling – the basic idea being that you don’t have to ask everyone to get a rough idea of how things are going to go. How rough that idea is depends on how many you ask. This is the issue of sampling error. We understand sampling error: you can estimate it, so as well as reducing this error by taking larger samples, there are principled ways of working out when you’ve asked enough people to get a reliable estimate (which is why polls of a country with a population of 70 million can still be accurate with samples in the thousands).
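To make “you can estimate it” concrete, here is a minimal Python sketch (my illustration, not anything from the pollsters) of the standard margin-of-error formula for a proportion estimated from a simple random sample:

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Half-width of the normal-approximation confidence interval
    for a proportion p estimated from a simple random sample of n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # 1.96 for 95%
    return z * sqrt(p * (1 - p) / n)

# A poll of ~1,000 people pins a 50/50 race down to about
# +/- 3 percentage points, whatever the size of the electorate.
print(round(100 * margin_of_error(0.5, 1000), 1))
```

Notice that the electorate’s size never appears in the formula – only n does – which is why a sample in the thousands works for a country of 70 million.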

But, as Tim Harford points out in this excellent article on sampling problems in big data, with every sample there are two sources of unreliability. Sampling error, as I’ve mentioned, but also sampling bias.

sampling error has a far more dangerous friend: sampling bias. Sampling error is when a randomly chosen sample doesn’t reflect the underlying population purely by chance; sampling bias is when the sample isn’t randomly chosen at all.

The problem with sampling bias is that, when you don’t know the ground truth, there is no principled way of knowing whether your sample is biased. If your sample has some systematic bias in it, you can make a reliable estimate (minimising sampling error), but you are still left with the bias – and you won’t know how big it is until you find out the truth. That’s my guess at what happened with the UK election. The polls converged, minimising the error, but the bias remained – a ‘shy Tory’ effect, where many voters were not admitting (or not aware) that they would end up voting for the Conservative party.
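The point that bias survives any sample size can be shown in a toy simulation – the vote share and ‘shy’ rate below are made-up numbers, purely for illustration:

```python
import random

TRUE_TORY = 0.38   # hypothetical true Conservative vote share
SHY_RATE = 0.20    # hypothetical: 20% of Tory voters won't say so

def biased_poll(n):
    """Poll n simulated voters; 'shy' Tory voters report something else."""
    said_tory = 0
    for _ in range(n):
        is_tory = random.random() < TRUE_TORY
        if is_tory and random.random() >= SHY_RATE:
            said_tory += 1
    return said_tory / n

random.seed(1)
for n in (1_000, 100_000):
    # Bigger samples shrink the sampling error, but the estimate
    # settles on 0.8 * 0.38 = 0.304, not the true 0.38.
    print(n, round(biased_poll(n), 3))
```

However many people you poll, the estimate converges on the biased value; only fixing how the sample is drawn (as the exit polls effectively did) removes the bias.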

The exit polls predicted the real result with surprising accuracy not because they minimised sampling error, but because they avoided the sampling bias. By asking the people who actually turned up to vote how they had actually voted, their sample lacked the bias of the pre-election polls.

Trauma is more complex than we think

I’ve got an article in The Observer about how the official definition of trauma keeps changing and how the concept is discussed as if it were entirely intuitive and clear-cut, when it’s actually much more complex.

I’ve become fascinated by how the concept of ‘trauma’ is used in public debate about mental health and the tension that arises between the clinical and rhetorical meanings of trauma.

One unresolved issue, which tests mental health professionals to this day, is whether ‘traumatic’ should be defined in terms of events or reactions.

Some of the confusion arises when we talk about “being traumatised”. Let’s take a typically horrifying experience – being caught in a war zone as a civilian. This is often described as a traumatic experience, but we know that most people who experience the horrors of war won’t develop post-traumatic stress disorder or PTSD – the diagnosis designed to capture the modern meaning of trauma. Despite the fact that these sorts of awful experiences increase the chances of acquiring a range of mental health problems – depression is actually a more common outcome than PTSD – it is still the case that most people won’t develop them. Have you experienced trauma if you have no recognisable “scar in the psyche”? This is where the concept starts to become fuzzy.

We have the official diagnosis of post-traumatic stress disorder, or PTSD, but lots of mental health problems can appear after awful events, and yet there are no ‘posttraumatic depression’ or ‘posttraumatic social phobia’ diagnoses.

To be clear, it’s not that trauma doesn’t exist but that it’s less fully developed as a concept than people think and, as a result, often over-simplified during debates.

Full article at the link below.

Link to Observer article on the shifting sands of trauma.

Radical embodied cognition: an interview with Andrew Wilson

The computational approach is the orthodoxy in psychological science. We try and understand the mind using the metaphors of information processing and the storage and retrieval of representations. These ideas are so common that it is easy to forget that there is any alternative. Andrew Wilson is on a mission to remind us that there is an alternative – a radical, non-representational, non-information processing take on what cognition is.

I sent him a few questions by email. After he answered these, and some follow up questions, we’ve both edited and agreed on the result, which you can read below.


Q1. Is it fair to say you are at odds with lots of psychology, theoretically? Can you outline why?

Psychology wants to understand what causes our behaviour. Cognitive psychology explanations are that behaviour is caused by internal states of the mind (or brain, if you like). These states are called mental representations, and they are models/simulations of the world that we use to figure out what to do and when to do it.

Cognitive psychology thinks we have representations because it assumes we have very poor sensory access to the world, e.g. vision supposedly begins with a patchy 2D image projected onto the back of the eye. We need these models to literally fill in the gaps by making an educated guess (‘inference’) about what caused those visual sensations.

My approach is called radical embodied cognitive psychology; ‘radical’ just means ‘no representations’. It is based on the work of James J Gibson. He was a perceptual psychologist who demonstrated that there is actually rich perceptual information about the world, and that we use this information. This is why perception and action are so amazingly successful most of the time, which is important because failures of perception have serious consequences for your health and wellbeing (e.g. falling on ice).

The most important consequence of this discovery is that when we have access to this information, we don’t need those internal models anymore. This then means that whatever the brain is doing, it’s not building models of the world in order to cause our behaviour. We are embedded in our environments and our behaviour is caused by the nature of that embedding (specifically, which information variables we are using for any given task).

So I ask very different questions than the typical psychologist: instead of ‘what mental model lets me solve this task?’ I ask ‘what information is there to support the observed behaviour and can I find evidence that we use it?’. When we get the right answer to the information question, we have great success in explaining and then predicting behaviour, which is actually the goal of psychology.


Q2. The idea that there are no mental representations is hard to get your head around. What about situations where behaviour seems to be based on things which aren’t there, like imagination, illusions or predictions?

First, saying that there are no mental representations is not saying that the brain is not up to something. This is a surprisingly common mistake, but I think it’s due to the fact that cognitive psychologists have come to equate ‘brain activity’ with ‘representing’, so that denying the latter sounds like denying the former (see Is Embodied Cognition a No-Brainer?).

Illusions simply reveal how important it is to perception that we can move and explore. They are all based on a trick and they almost always require an Evil Psychologist™ lurking in the background. Specifically, illusions artificially restrict access to information so that the world looks like it’s doing one thing when it is really doing another. They only work if you don’t let people do anything to reveal the trick. Most visual illusions are revealed as such by exploring them, e.g. by looking at them from a different perspective (e.g. the Ames Room).

Imagination and prediction are harder to talk about in this framework, but only because no one’s really tried. For what it’s worth, people are terrible at actively predicting things, and whatever imagination is it will be a side-effect of our ability to engage with the real world, not part of how we engage with the real world.


Q3. Is this radical approach really denying the reality of cognitive representations, or just using a different descriptive language in which they don’t figure? In other words, can you and the cognitivists both be right?

If the radical hypothesis is right, then a lot of cognitive theories will be wrong. Those theories all assume that information comes into the brain, is processed by representations and then output as behaviour. If we successfully replace representations with information, all those theories will be telling the wrong story. ‘Interacting with information’ is a completely different job description for the brain than ‘building models of the world’. This is another reason why it’s ‘radical’.


Q4. Even if I concede that you can think of the mind like this, can you convince me that I should? Why is it useful? What does this approach do for cognitive science that the conventional approach doesn’t, or can’t?

There are two reasons, I think. The first is empirical; this approach works very, very well. Whenever a researcher works through a problem using this approach, they find robust answers that stand up to extended scrutiny in the lab. These solutions then make novel predictions that also perform well – examples are topics like the outfielder problem and the A-not-B error [see below for references]. Cognitive psychology is filled with small, difficult to replicate effects; this is actually a hint that we aren’t asking the right questions. Radical embodied cognitive science tends to produce large, robust and interpretable effects which I take as a hint that our questions are closer to the mark.

The second is theoretical. The major problem with representations is that it’s not clear where they get their content from. Representations supposedly encode knowledge about the world that we use to make inferences to support perception, etc. But if we have such poor perceptual contact with the world that we need representations, how did we ever get access to the knowledge we needed to encode? This grounding problem is a disaster. Radical embodiment solves it by never creating it in the first place – we are in excellent perceptual contact with our environments, so there are no gaps for representations to fill, therefore no representations that need content.


Q5. Who should we be reading to get an idea of this approach?

‘Beyond the Brain’ by Louise Barrett. It’s accessible and full of great stuff.

‘Radical Embodied Cognitive Science’ by Tony Chemero. It’s clear and well written but it’s pitched at trained scientists more than the generally interested lay person.

‘Embodied Cognition’ by Lawrence Shapiro clearly lays out all the various flavours of ‘embodied cognition’. My work is the ‘replacement’ hypothesis.

‘The Ecological Approach to Visual Perception’ by James J Gibson is an absolute masterpiece and the culmination of all his empirical and theoretical work.

I run a blog at http://psychsciencenotes.blogspot.co.uk/ with Sabrina Golonka where we discuss all this a lot, and we tweet @PsychScientists. We’ve also published a few papers on this, the most relevant of which is ‘Embodied Cognition is Not What You Think It Is’.


Q6. And finally, can you point us to a few blog posts you’re proudest of which illustrate this way of looking at the world?

What Else Could It Be? (where Sabrina looks at the question, what if the brain is not a computer?)

Mirror neurons, or, What’s the matter with neuroscience? (how the traditional model can get you into trouble)

Prospective Control – The Outfielder problem (an example of the kind of research questions we ask)

The scientist as problem solver

Start the week with one of the founding fathers of cognitive science: in ‘The scientist as problem solver‘, Herb Simon (1916-2001) gives a short retrospective of his scientific career.

To tell the story of the research he has done, he advances a thesis: “The Scientist is a problem solver. If the thesis is true, then we can dispense with a theory of scientific discovery – the processes of discovery are just applications of the processes of problem solving.” Quite aside from the usefulness of this perspective, the paper is a reminder of the intoxicating possibility of integration across the physical, biological and social sciences: Simon worked on economics, management theory, complex systems and artificial intelligence, as well as what we would now call cognitive psychology.

He uses his own work on designing problem solving algorithms to reflect on how he – and other scientists – can and should make scientific progress. Towards the end he expresses what would be regarded as heresy in many experimentally orientated psychology departments. He suggests that many of his most productive investigations lacked a contrast between experimental and control conditions. Did this mean they were worthless, he asks. No:

…You can test theoretical models without contrasting an experimental with a control condition. And apart from testing models, you can often make surprising observations that give you ideas for new or improved models…

Perhaps it is not our methodology that needs revising so much as the standard textbook methodology, which perversely warns us against running an experiment until precise hypotheses have been formulated and experimental and control conditions defined. How do such experiments ever create surprise – not just the all-too-common surprise of having our hypotheses refuted by facts, but the delight-provoking surprise of encountering a wholly unexpected phenomenon? Perhaps we need to add to the textbooks a chapter, or several chapters, describing how basic scientific discoveries can be made by observing the world intently, in the laboratory or outside it, with controls or without them, heavy with hypotheses or innocent of them.

Simon, H. A. (1989). The scientist as problem solver. Complex information processing: The impact of Herbert A. Simon, 375-398.

You can’t play 20 questions with nature and win

You can’t play 20 questions with nature and win” is the title of Allen Newell‘s 1973 paper, a classic in cognitive science. In the paper he confesses that although he sees many excellent psychology experiments, all making undeniable scientific contributions, he can’t imagine them cohering into progress for the field as a whole. He describes the state of psychology as focussed on individual phenomena – mental rotation, chunking in memory, subitizing, etc – studied in a way to resolve binary questions – issues such as nature vs nurture, conscious vs unconscious, serial vs parallel processing.

There is, I submit, a view of the scientific endeavor that is implicit (and sometimes explicit) in the picture I have presented above. Science advances by playing twenty questions with nature. The proper tactic is to frame a general question, hopefully binary, that can be attacked experimentally. Having settled that bits-worth, one can proceed to the next. The policy appears optimal – one never risks much, there is feedback from nature at every step, and progress is inevitable. Unfortunately, the questions never seem to be really answered, the strategy does not seem to work.

As I considered the issues raised (single code versus multiple code, continuous versus discrete representation, etc.) I found myself conjuring up this model of the current scientific process in psychology- of phenomena to be explored and their explanation by essentially oppositional concepts. And I couldn’t convince myself that it would add up, even in thirty more years of trying, even if one had another 300 papers of similar, excellent ilk.

His diagnosis of one reason that phenomena can generate an endless stream of excellent papers without endless progress is that people can do the same task in different ways. Lots of experiments dissect how people are doing the task without sufficiently constraining the things Newell says are essential for predicting behaviour (the person’s goals and the structure of the task environment), and thus provide no insight into the ultimate target of investigation: the invariant structure of the mind’s processing mechanisms. As a minimum, he concludes, we must know the method participants are using, and never average over different methods. But this may not be enough:

That the same human subject can adopt many (radically different) methods for the same basic task, depending on goal, background knowledge, and minor details of payoff structure and task texture — all this — implies that the “normal” means of science may not suffice.

As a prognosis for how to make real progress in understanding the mind he proposes three possible courses of action:

  1. Develop complete processing models – i.e. simulations which are competent to perform the task and include a specification of the way in which different subfunctions (called ‘methods’ by Newell) are deployed.
  2. Analyse a complex task, completely, ‘to force studies into intimate relation with each other’, the idea being that giving a full account of a single task, any task, will force contradictions between theories of different aspects of the task into the open.
  3. ‘One program for many tasks’ – construct a general purpose system which can perform all mental tasks, in other words an artificial intelligence.

It was this last strategy which preoccupied a lot of Newell’s subsequent attention. He developed a general problem solving architecture he called SOAR, which he presented as a unified theory of cognition, and which he worked on until his death in 1992.

The paper is over forty years old, but still full of useful thoughts for anyone interested in the sciences of the mind.

Reference and link:
Newell, A. (1973). You can’t play 20 questions with nature and win: Projective comments on the papers of this symposium. In W. G. Chase (Ed.), Visual Information Processing: Proceedings of the Eighth Annual Carnegie Symposium on Cognition, held at Carnegie-Mellon University, Pittsburgh, Pennsylvania, May 19, 1972. Academic Press.

See a nice picture of Newell from the Computer History Museum