How the magic of cinema unlocked one man’s coma-bound world

An Alfred Hitchcock film helped to prove one patient had been conscious while in a coma-like state for 16 years. The discovery shows that neuroscience may still have lots to learn from the ancient art of storytelling, says Tom Stafford.

If brain injury steals your consciousness then you are in a coma: we all know that. What is less well known is that there exist neighbouring states to the coma, in which victims keep their eyes open, but show no signs of consciousness. The vegetative state, or ‘unresponsive wakefulness syndrome’, is one in which the patient may appear to be awake, and even goes to sleep at times, but otherwise shows no reaction to the world. Patients who do respond inconsistently, such as by flinching when their name is called, or following a bright object with their eyes, are classified as being in a ‘minimally conscious state’. Both categories of patients show no signs of deliberate action, or sustained reaction to the environment, and until recently there was no way for anyone to discern their true inner level of consciousness.

The fear is that, like the ‘locked-in syndrome’ that can occur after strokes, these patients may be conscious, but are just unable to show it. The opposite possibility is that these patients are as unconscious as someone in the deepest coma, with only circuitry peripheral to consciousness keeping their eyes open and producing minimal responses automatically.

In the last 10 years, research spearheaded by cognitive neuroscientist Adrian Owen has transformed our understanding of these shadowlands of consciousness. There is now evidence, obtained using brain scans, that some patients (around one in five) in these ‘wakeful coma’ states have conscious awareness. If asked to imagine playing tennis, the brain areas specifically controlling movement become active. If asked to imagine finding their way around their house, the brain regions involved in navigation become active. Using these signals a small minority of patients have even communicated with the outside world, with the brain scanner helping observers to mind-read their answers to questions.

The practical and ethical implications of these findings are huge, not least for the treatment of the hundreds of thousands of people who are in hospitals around the world in these conditions right now.

But the meaning of the research is still hotly debated. One issue is that the mind reading uses neural responses to questions or commands, and careful controls are needed to ensure that patients’ brains aren’t just responding automatically, without any actual conscious involvement. A second issue, and one that cannot be controlled away, is that the method may tell us that these patients are capable of responding, but it doesn’t tell us much about the quality of conscious experience they are having. How alert, aware and focused they are is hard to discern.

In a relatively new study, Lorina Naci, a post-doctoral fellow in Owen’s lab, has used cinema to show just how sophisticated conscious awareness can be in a ‘minimally conscious’ patient.

The trick they used involved an 8-minute edit of “Bang! You’re Dead”, a 1961 episode of “Alfred Hitchcock Presents”. In the film, a young boy with a toy-gun obsession wanders around aiming and firing at people. Unbeknownst to him, and the adults he aims at, on this day he has found a real gun, and it has a live bullet in the chamber.

The film works because of this hidden knowledge we, the viewers, have. Because we know about the bullet, the small boy’s mundane antics become high drama, as he unwittingly involves unsuspecting people in round after deadly round of Russian roulette.

Naci showed the film to healthy participants. To a separate group she showed a scrambled version involving rearranged one-second segments. This ‘control’ version was important because it contained many of the same features as the original: the same visual patterns, the same objects, the same actions. But it lacked the crucial narrative coherence – the knowledge of the bullet – which generated the suspense.

Using brain scanning, and the comparison of the two versions of the film, Naci and colleagues were able to show that the unscrambled, suspenseful version activated nearly every part of the cortex. Everything from primary sensory areas to motor areas to areas involved in memory and anticipation was engaged (as you might hope from a film by one of the masters of storytelling). The researchers were particularly interested in a network of activity that rose and fell in synchrony across ‘executive’ areas of the brain – those known to be involved in planning, anticipation, and integrating information from different sources. This network, they found, responded to the moments of highest suspense in the film: the moments when the boy was about to fire, for example. These were the moments you could only find so dramatic if you were following the plot.

Next the researchers showed the film to two patients in wakeful comas. In one, the auditory cortex became activated, but nothing beyond this primary sensory region. The patient’s brain was responding to sounds, perhaps automatically, but there was no evidence of more complex processing. But in a second patient, who had been hospitalised and non-responsive for 16 years, the brain response matched that of the healthy controls who’d seen the film. Like them, activity across his cortex rose and fell with the action of the film, indicating an inner consciousness rich enough to follow the plot.
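The analysis logic can be sketched in a few lines. This is not Naci and colleagues’ actual pipeline – just a toy illustration in which every signal and number is invented – but it shows the core idea: if activity in a patient’s ‘executive’ regions rises and falls in step with the average of healthy viewers, that synchrony is the evidence of plot-following.

    # Toy sketch of the inter-subject synchrony logic (not Naci et al.'s
    # pipeline): every signal here is a simulated stand-in.
    import numpy as np

    rng = np.random.default_rng(0)
    suspense = np.sin(np.linspace(0, 6 * np.pi, 240)) ** 2     # invented "plot suspense" curve
    controls = suspense + rng.normal(0, 0.3, size=(10, 240))   # ten healthy viewers' signals
    patient = suspense + rng.normal(0, 0.3, size=240)          # a hypothetical responsive patient

    r = np.corrcoef(patient, controls.mean(axis=0))[0, 1]
    print(round(r, 2))  # well above zero: the patient's activity tracks the film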

The astounding result should make us think carefully about how we treat such patients, and it adds to the arsenal of techniques we can use to connect to the inner lives of non-responsive patients. It also shows how cognitive neuroscience can benefit from the use of more complex stimuli, such as movies, rather than the typically boring visual patterns and simple button-press responses that scientists usually use to probe the mysteries of the brain.

The genius of this research is that to test for the rich consciousness of a patient who appears unresponsive you need to use rich stimuli. The Hitchcock film was perfect because it creates drama through what we believe and expect, not through what we merely see.

My BBC Future column from last week. The original is here. The original paper is: Naci, L., Cusack, R., Anello, M., & Owen, A. M. (2014). A common neural code for similar conscious experiences in different individuals. PNAS, 111(39), 14277–82.

Images of ultra-thin models need your attention to make you feel bad

I have a guest post over at the BPS Research Digest, covering research on the psychological effects of pictures of ultra-thin fashion models.

A crucial question is whether the effect of these thin-ideal images is automatic. Does the comparison to the models, which is thought to be the key driver of their negative effects, happen without our intention, without our attention, or both? Knowing the answer will tell us just how much power these images have, and also how best we might protect ourselves from them.

It’s a great study from the lab of Stephen Want (Ryerson University). For the full details of the research, head over: Images of ultra-thin models need your attention to make you feel bad

Update: Download the preprint of the paper, and the original data here

The reproducibility of psychological science

The Reproducibility Project – a massive, collaborative, ‘Open Science’ attempt to replicate 100 psychology experiments published in leading psychology journals – has just had its results published in Science. The results are sure to be widely debated, the headline finding being that many published results did not replicate. There’s an article in the New York Times about the study here: Many Psychology Findings Not as Strong as Claimed, Study Says

This is a landmark in meta-science: researchers collaborating to inspect how psychological science is carried out, how reliable it is, and what that means for how we should change what we do in the future. But it is also an illustration of the process of Open Science. All the materials from the project, including the raw data and analysis code, can be downloaded from the OSF webpage. That means that if you have a question about the results, you can check it for yourself. So, by way of example, here’s a quick analysis I ran this morning: does the number of citations of a paper predict the effect size of its replication in the Reproducibility Project? Answer: not so much

[Figure: citations of the original paper plotted against the effect size of its replication]

That horizontal string of dots along the bottom is replications with close to zero effect size, and high citations for the original paper (nearly all of which reported non-zero and statistically significant effects). Draw your own conclusions!
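My code for the real analysis is linked below; as a sketch of how little is needed once the project’s summary spreadsheet is downloaded from the OSF page, it boils down to something like this (the filename and column names are placeholders, not the project’s actual headers):

    # Sketch of the citations-vs-replication-effect-size check; the
    # filename and column names are placeholders for the real OSF data.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("rpp_summary.csv")  # downloaded from the OSF project page
    plt.scatter(df["citations_original"], df["effect_size_replication"])
    plt.xlabel("Citations of the original paper")
    plt.ylabel("Replication effect size")
    plt.savefig("cites_vs_effect.png")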

Link: Reproducibility OSF project page

Link: my code for making this graph (in python)

Intuitions about free will and the brain

Libet’s classic experiment on the neuroscience of free will tells us more about our intuitions than about our actual freedom

It is perhaps the most famous experiment in neuroscience. In 1983, Benjamin Libet sparked controversy with his demonstration that our sense of free will may be an illusion, a controversy that has only increased ever since.

Libet’s experiment has three vital components: a choice, a measure of brain activity and a clock.

The choice is to move either your left or right arm. In the original version of the experiment this is by flicking your wrist; in some versions of the experiment it is to raise your left or right finger. Libet’s participants were instructed to “let the urge [to move] appear on its own at any time without any pre-planning or concentration on when to act”. The precise time at which you move is recorded from the muscles of your arm.

The measure of brain activity is taken via electrodes on the scalp. When the electrodes are placed over the motor cortex (roughly along the middle of the head), the electrical signal differs between the right and left side as you plan and execute a movement with either your left or right hand.

The clock is specially designed to allow participants to discern sub-second changes. It has a single dot, which travels around the clock face every 2.56 seconds. This means that by reporting position you are reporting time. If we assume you can report the dot’s position to within a 5-degree angle, you can use this clock to report time to within 36 milliseconds – that’s 36 thousandths of a second.
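The arithmetic behind that figure:

    # Timing resolution of Libet's clock: one revolution every 2.56 s,
    # read to the nearest 5 degrees of arc.
    seconds_per_degree = 2.56 / 360
    print(seconds_per_degree * 5 * 1000)  # ~35.6 ms, roughly 36 thousandths of a second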

Putting these ingredients together, Libet took one extra vital measurement. He asked participants to report, using the clock, exactly the point when they made the decision to move.

Physiologists had known for decades that the electrical signals in your brain change a fraction of a second before you actually move. So it was in Libet’s experiment: a fraction of a second before participants moved, a reliable change could be recorded using the electrodes. But the explosive result was when participants reported deciding to move. This occurred in between the electrical change in the brain and the actual movement. This means, as surely as effect follows cause, that the feeling of deciding couldn’t be a timely report of whatever was causing the movement. The electrode recording showed that the decision had – in some sense – already been made before the participants were aware of having taken it. The brain signals were changing before the subjective experience of taking a decision occurred.

Had participants’ brains already made the decision? Was the feeling of choosing just an illusion? Controversy has raged ever since. There is far more to the discussion about neuroscience and free will than this one experiment, but its simplicity has allowed it to capture the imagination of many who think our status as biological creatures limits our free will, as well as those who argue that free will survives the challenge of our minds being firmly grounded in our biological brains.

Part of the appeal of the Libet experiment is due to two pervasive intuitions we have about the mind. Without these intuitions the experiment doesn’t seem so surprising.

The first intuition is the feeling that our minds are a separate thing from our physical selves – a natural dualism that pushes us to believe that the mind is a pure, abstract place, free from biological constraints. A moment’s thought about the last time you were grumpy because you were hungry shatters this illusion, but I’d argue that it is still a persistent theme in our thinking. Why else would we be in the least surprised that it is possible to find neural correlates of mental events? If we really believed, in our heart of hearts, that the mind is based in the brain, then we would know that every mental change must have a corresponding change in the brain.

The second pervasive intuition, which makes us surprised by the Libet experiment, is the belief that we know our own minds. This is the belief that our subjective experience of making decisions is an accurate report of how that decision is made. The mind is like a machine – as long as it runs right, we are happily ignorant of how it works. It is only when mistakes or contradictions arise that we’re drawn to look under the hood: Why didn’t I notice that exit? How could I forget that person’s name? Why does the feeling of deciding come after the brain changes associated with decision making?

There’s no reason to think that we are reliable reporters of every aspect of our minds. Psychology, in fact, gives us lots of examples of where we often get things wrong. The feeling of deciding in the Libet experiment may be a complete illusion – maybe the real decision really is made ‘by our brains’ somehow – or maybe it is just that the feeling of deciding is delayed from our actual deciding. Just because we erroneously report the timing of the decision doesn’t mean we weren’t intimately involved in it, in whatever meaningful sense that can be.

More is written about the Libet experiment every year. It has spawned an academic industry investigating the neuroscience of free will. There are many criticisms and rebuttals, with debate raging about how, and whether, the experiment is relevant to the freedom of our everyday choices. Even supporters of Libet have to admit that the situation used in the experiment may be too artificial to be a direct model of real everyday choices. But the basic experiment continues to inspire discussion and provoke new thoughts about the way our freedom is rooted in our brains. And that, I’d argue, is due to the way it helps us confront our intuitions about the way the mind works, and to see that things are more complex than we instinctively imagine.

This is my latest column for BBC Future. The original is here. You may also enjoy this recent post on mindhacks.com Critical strategies for free will experiments

Critical strategies for free will experiments

Benjamin Libet’s experiment on the neuroscience of free will needs little introduction. (If you do need an introduction, it’s the topic of my latest column for BBC Future). His report that the subjective feeling of making a choice comes only after the brain signals indicating that a choice has been made is famous, and has produced controversy ever since it was published in the 1980s.

For a simple experiment, Libet’s paradigm admits of a large number of interpretations, which I think is an important lesson. Here are some common, and less common, critiques of the experiment:

The Disconnect Criticism

The choice required from Libet’s participants was trivial and inconsequential. Moreover, they were specifically told to make the choice without any reason (“let the urge [to move] appear on its own at any time without any pre-planning or concentration on when to act”). A common criticism is that this kind of choice has little to tell us about everyday choices which are considered, consequential, or which we actively try to involve ourselves in.

The timing criticism(s)

Dennett discusses how the original interpretation of the experiment assumes that the choosing self exists at a particular point and at a particular time – in some central ‘Cartesian Theatre’, say, in which information from the motor cortex and the visual cortex comes together, but which, crucially, does not have direct report of (say) the information about timing gathered by the visual cortex. Even for a freely choosing self, there will be timing delays as information on the clock time is ‘connected up’ with information on when the movement decision was made. These delays, Dennett argues, could produce the result Libet saw without indicating a fatal compromise of free choice.

My spin on this is that the Libet result shows, minimally, that we don’t accurately know the timing of our decisions, but inaccurate judgements about the timing of decisions don’t mean that we don’t actually make the consequential decisions themselves.

Spontaneous activity

Aaron Schurger and colleagues have a nice paper in which they argue that Libet’s results can be explained by variations in spontaneous activity before actions are taken. They argue that the movement system is constantly experiencing sub-threshold variation in activity, so that at any particular point in time you are more or less close to performing any particular act. Participants in the Libet paradigm, asked to make a spontaneous act, take advantage of this variability – effectively lowering their threshold for action and waiting until the covert fluctuations are large enough to trigger a movement. Importantly, this reading weakens the link between the ‘onset’ of movements and the delayed subjective experience of making a movement. If the movement is triggered by random fluctuations (observable in the rise of the electrode signal) then there isn’t a distinct ‘decision to act’ in the motor system, so we can’t say that the subjective decision to act reliably comes afterwards.
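The gist of that idea fits in a few lines of simulation. This is a toy leaky accumulator with invented parameters, not Schurger and colleagues’ fitted model:

    # Toy leaky stochastic accumulator: activity drifts noisily towards a
    # sub-threshold equilibrium (drift/leak = 0.2), and a "movement" fires
    # whenever a random fluctuation pushes it over threshold.
    import numpy as np

    rng = np.random.default_rng(0)
    dt, leak, drift, noise_sd, threshold = 0.001, 0.5, 0.1, 0.1, 0.3

    def time_to_movement(max_s=60.0):
        x = 0.0
        for step in range(1, int(max_s / dt) + 1):
            x += dt * (drift - leak * x) + noise_sd * np.sqrt(dt) * rng.standard_normal()
            if x >= threshold:
                return step * dt
        return None

    # Identical settings, widely scattered "decision" times: the timing
    # comes from the noise, not from a discrete decision event.
    print([round(time_to_movement(), 2) for _ in range(5)])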

The ‘only deterministic on average’ criticism

The specific electrode signal which is used to time the decision to move in the brain is called the readiness potential (RP). Electrode readings are highly variable, so the onset of the RP is a statistical artefact, produced by averaging over many trials (40 in Libet’s case). This means we lose the ability to detect, trial by trial, the relation between the brain activity related to movement and the subjective experience. Libet reports this in his original paper [1] (‘only the average RP for the whole series could be meaningfully recorded’, p634). On occasion the subjective decision time (which Libet calls W) comes before the time of even the average RP, not after (p635: “instances in which individual W time preceded onset time of averaged RP numbered zero in 26 series [out of 36]” – which means that 28% of series saw at least one instance of W occurring before the RP).
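The averaging point is easy to demonstrate with simulated data (an illustration, not Libet’s recordings): a ramp buried in noise is undetectable on any single trial, yet shows a clean onset once 40 trials are averaged.

    # Onset-by-averaging, illustrated with fake data: a ramp starting
    # ~550 ms before movement is invisible in single noisy trials but
    # emerges in the average of 40 trials, as in Libet's recordings.
    import numpy as np

    rng = np.random.default_rng(1)
    t = np.linspace(-2.0, 0.0, 400)                      # 2 s leading up to movement
    ramp = np.where(t > -0.55, (t + 0.55) * 10, 0.0)     # the "true" readiness potential
    trials = ramp + rng.normal(0, 5, size=(40, t.size))  # 40 noisy trials

    print(round(np.corrcoef(trials[0], ramp)[0, 1], 2))            # single trial: noise dominates
    print(round(np.corrcoef(trials.mean(axis=0), ramp)[0, 1], 2))  # average: the ramp is clear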

The experiment showed strong reliability, but not complete reliability (the difference is described by Libet as ‘generally’ occurring and as being ‘fairly consistent’, p636). What happened next to Libet’s result illustrates a common trick of psychologists: a statistical pattern is discovered, and then reality is described as if the pattern were the complete description: “The brain change occurs before the choice”.

Although such generalities are very useful, they are misleading if we forget that they are only true on average, not always true. I don’t think Libet’s experiment would have the same imaginative hold if the result were summarised as “The brain change usually occurs before the choice”.

A consistent, but not universal, pattern in the brain before a choice has the flavour of a prediction, rather than a compulsion. Sure, before we make a choice there are antecedents in the brain – it would be weird if there weren’t – but if these don’t have any necessary consequence for what we choose, so what?

To my mind, the demonstration that you can use fMRI to reproduce the Libet effect but with brain signals changing up to 10 seconds before the movement (and with above-chance accuracy at predicting the movement made) only reinforces this point. We all believe that the mind has something to do with the brain, so finding patterns in the brain at one point which predict actions in the mind at a later point isn’t surprising. The fMRI result, and perhaps Libet’s experiment, rely as much on our false intuitions about dualism as on conclusively demonstrating anything new about free will.

Link: my column Why do we intuitively believe we have free will?

Laughter as a window on the infant mind

What makes a baby laugh? The answer might reveal a lot about the making of our minds, says Tom Stafford.

What makes babies laugh? It sounds like one of the most fun questions a researcher could investigate, but there’s a serious scientific reason why Caspar Addyman wants to find out.

He’s not the first to ask this question. Darwin studied laughter in his infant son, and Freud formed a theory that our tendency to laugh originates in a sense of superiority. So we take pleasure in seeing another’s suffering – slapstick-style pratfalls and accidents being good examples – because it isn’t us.

The great psychologist of human development, Jean Piaget, thought that babies’ laughter could be used to see into their minds. If you laugh, you must ‘get the joke’ to some degree – a good joke is balanced between being completely unexpected and confusing, and being predictable and boring. Studying when babies laugh might therefore be a great way of gaining insight into how they understand the world, he reasoned. But although he proposed this in the 1940s, the idea remains to be properly tested. Despite the fact that some very famous investigators have studied the topic, it has been neglected by modern psychology.

Addyman, of Birkbeck, University of London, is out to change that. He believes we can use laughter to get at exactly how infants understand the world. He’s completed the world’s largest and most comprehensive survey of what makes babies laugh, presenting his initial results at the International Conference on Infant Studies, Berlin, last year. Via his website he surveyed more than 1000 parents from around the world, asking them questions about when, where and why their babies laugh.

The results are – like the research topic – heart-warming. A baby’s first smile comes at about six weeks, their first laugh at about three and a half months (although some took three times as long to laugh, so don’t worry if your baby hasn’t cracked its first cackle just yet). Peekaboo is a sure-fire favourite for making babies laugh (for a variety of reasons I’ve written about here), but tickling is the single most reported reason that babies laugh.

Importantly, from the very first chuckle, the survey responses show that babies are laughing with other people, and at what they do. The mere physical sensation of something being ticklish isn’t enough. Nor is it enough to see something disappear or appear suddenly. It’s only funny when an adult makes these things happen for the baby. This shows that way before babies walk or talk, they – and their laughter – are social. If you tickle a baby, they apparently laugh because you are tickling them, not just because they are being tickled.

What’s more, babies don’t tend to laugh at people falling over. They are far more likely to laugh when they themselves fall over rather than when someone else does, and when other people are happy rather than when they are sad or unpleasantly surprised. From these results, Freud’s theory (which, in any case, was developed from clinical interviews with adults, rather than any rigorous formal study of actual children) looks dead wrong.

Although parents report that boy babies laugh slightly more than girl babies, both genders find mummy and daddy equally funny.

Addyman continues to collect data, and hopes that as the results become clearer he’ll be able to use his analysis to show how laughter tracks babies’ developing understanding of the world – how surprise gives way to anticipation, for example, as their ability to remember objects comes online.

Despite the scientific potential, baby laughter is, as a research topic, “strangely neglected”, according to Addyman. Part of the reason is the difficulty of making babies laugh reliably in the lab, although he plans to tackle this in the next phase of the project. But partly the topic has been neglected, he says, because it isn’t viewed as a subject for ‘proper’ science to look into. This is a prejudice Addyman hopes to overturn – for him, the study of laughter is certainly no joke.

This is my BBC Future column from Tuesday. The original is here. If you are a parent you can contribute to the science of how babies develop at Dr Addyman’s babylaughter.net (specialising in laughter) or at babylovesscience.com (which covers humour as well as other topics).

Are online experiment participants paying attention?

Online testing is sure to play a large part in the future of psychology. Using Mechanical Turk or other crowdsourcing sites for research, psychologists can quickly and easily gather data for any study where the responses can be provided online. One concern, however, is that online samples may be less motivated to pay attention to the tasks they are participating in. Not only is nobody watching how they do these online experiments, the whole experience is framed as a work-for-cash gig, so there is pressure to complete any activity as quickly and with as little effort as possible. To the extent that online participants are satisficing or skimping on their attention, can we trust the data?

A newly submitted paper uses data from the Many Labs 3 project, which recruited over 3000 participants from both online and university campus samples, to test the idea that online samples are different from the traditional offline samples used by academic psychologists:

The findings strike a note of optimism, if you’re into online testing (perhaps less so if you use traditional university samples):

Mechanical Turk workers report paying more attention and exerting more effort than undergraduate students. Mechanical Turk workers were also more likely to pass an instructional manipulation check than undergraduate students. Based on these results, it appears that concerns over participant inattentiveness may be more applicable to samples recruited from traditional university participant pools than from Mechanical Turk

This fits with previous reports showing high consistency when classic effects are tested online, and with reports that satisficing may have been very high in offline samples all along – we just weren’t testing for it.

However, an issue I haven’t seen discussed is whether, because of the relatively small pool of participants taking experiments on MTurk, online participants have an opportunity to get familiar with typical instructional manipulation checks (AKA ‘catch questions’, which are designed to check if you are paying attention). If online participants adapt to our manipulation checks, then the very experiments which set out to test if they are paying more attention may not be reliable.

Link: new paper – Graduating from Undergrads: Are Mechanical Turk Workers More Attentive than Undergraduate Participants?

This paper provides a useful overview: Conducting perception research over the internet: a tutorial review

Conspiracy theory as character flaw

Philosophy professor Quassim Cassam has a piece in Aeon arguing that conspiracy theorists should be understood in terms of the intellectual vices. It is a dead-end, he says, to try to understand the reasons someone gives for believing a conspiracy theory. Consider someone called Oliver who believes that 9/11 was an inside job:

Usually, when philosophers try to explain why someone believes things (weird or otherwise), they focus on that person’s reasons rather than their character traits. On this view, the way to explain why Oliver believes that 9/11 was an inside job is to identify his reasons for believing this, and the person who is in the best position to tell you his reasons is Oliver. When you explain Oliver’s belief by giving his reasons, you are giving a ‘rationalising explanation’ of his belief.

The problem with this is that rationalising explanations take you only so far. If you ask Oliver why he believes 9/11 was an inside job he will, of course, be only too pleased to give you his reasons: it had to be an inside job, he insists, because aircraft impacts couldn’t have brought down the towers. He is wrong about that, but at any rate that’s his story and he is sticking to it. What he has done, in effect, is to explain one of his questionable beliefs by reference to another no less questionable belief.

So the problem is not their beliefs as such, but why the person came to have the whole set of (misguided) beliefs in the first place. The way to understand conspiracists is in terms of their intellectual character, Cassam argues – the vices and virtues that guide us as thinking beings.

A problem with this account is that – looking at the current evidence – character flaws don’t seem that strong a predictor of conspiracist beliefs. The contrast is with factors that have a demonstrable influence on people’s unusual beliefs: we know, for example, that social influence and common cognitive biases have a large, and measurable, effect on what we believe. The evidence isn’t so good on how intellectual character traits such as closed/open-mindedness or skepticism/gullibility are constituted and might affect conspiracist beliefs. That could be because the personality/character trait approach is inherently limited, or just that there is more work to do. One thing is certain: whatever the intellectual vices are that lead to conspiracy theory beliefs, they are not uncommon. One study suggested that 50% of the public endorse at least one conspiracy theory.

Link : Bad Thinkers by Quassim Cassam

Paper on personality and conspiracy theories: Unanswered questions: A preliminary investigation of personality and individual difference predictors of 9/11 conspiracist beliefs

Paper on widespread endorsement of conspiracy theories: Conspiracy Theories and the Paranoid Style(s) of Mass Opinion

Previously on Mindhacks.com That’s what they want you to believe

And a side note: this view, that the problem with conspiracy theorists isn’t the beliefs themselves, helps explain why throwing facts at them doesn’t help – better to highlight the fallacies in how they are thinking.

For argument’s sake

I have (self) published an ebook, For argument’s sake: evidence that reason can change minds. It is a collection of two essays that were originally published on Contributoria and The Conversation. I have revised and expanded these, and added a guide to further reading on the topic. There are bespoke illustrations (of owls) inspired by Goya, and I’ve added an introduction about why I think psychologists and journalists both love stories claiming we’re irrational creatures incapable of responding to reasoned argument. Here’s something from the book description:

Are we irrational creatures, swayed by emotion and entrenched biases? Modern psychology and neuroscience are often reported as showing that we can’t overcome our prejudices and selfish motivations. Challenging this view, cognitive scientist Tom Stafford looks at the actual evidence. Re-analysing classic experiments on persuasion, as well as summarising more recent research into how arguments change minds, he shows why persuasion by reason alone can be a powerful force.

All in, it’s close to 7,000 words, and available from Amazon and Smashwords now.

Phantasmagoric neural net visions

A startling gallery of phantasmagoric images generated by a neural network technique has been released. The images were made by some computer scientists associated with Google who had been using neural networks to classify objects in images. They discovered that by using the neural networks “in reverse” they could elicit visualisations of the representations that the networks had developed over training.

These pictures are freaky because they look sort of like the things the network had been trained to classify, but without the coherence of real-world scenes. In fact, the researchers impose a local coherence on the images (so that neighbouring pixels do similar work in the image) but put no constraint on what is globally represented.

The obvious parallel is to images from dreams or other altered states – situations where ‘low level’ constraints in our vision are obviously still operating, but the high-level constraints – the kind of thing that tries to impose an abstract and unitary coherence on what we see – are loosened. In these situations we get to observe something that reflects our own processes as much as what is out there in the world.
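For the technically curious, the core trick – running a classifier ‘in reverse’ by adjusting pixels to increase a chosen unit’s activation – can be sketched briefly. This is not the Google team’s code; it is a generic activation-maximisation sketch, assuming PyTorch is installed and using an off-the-shelf pretrained model as a stand-in for their network:

    # Generic activation-maximisation sketch (not the researchers' code):
    # start from noise and ascend the gradient of one class score.
    import torch
    import torchvision.models as models

    model = models.resnet18(weights="DEFAULT").eval()
    for p in model.parameters():
        p.requires_grad_(False)

    img = torch.randn(1, 3, 224, 224, requires_grad=True)  # start from noise
    target_class = 207  # an arbitrary ImageNet class index

    optimizer = torch.optim.Adam([img], lr=0.05)
    for step in range(200):
        optimizer.zero_grad()
        score = model(img)[0, target_class]
        # Maximise the class score; the small penalty keeps pixels bounded -
        # a crude stand-in for the local-coherence constraints described above.
        loss = -score + 1e-4 * img.pow(2).sum()
        loss.backward()
        optimizer.step()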

Link: The researchers talk about their ‘dreaming neural networks’
Gallery: Inceptionism: Going deeper into Neural Networks

Power analysis of a typical psychology experiment

Understanding statistical power is essential if you want to avoid wasting your time in psychology. The power of an experiment is its sensitivity – the likelihood that, if the effect tested for is real, your experiment will be able to detect it.

Statistical power is determined by the type of statistical test you are doing, the number of people you test and the effect size. The effect size is, in turn, determined by the reliability of the thing you are measuring, and how much it is pushed around by whatever you are manipulating.

Since it is a common test, I’ve been doing a power analysis for a two-sample (two-sided) t-test, for small, medium and large effects (as conventionally defined). The results should worry you.

[Figure: sample size per group required for 80% power, plotted against effect size]

This graph shows you how many people you need in each group for your test to have 80% power (a standard desirable level of power – meaning that if your effect is real you’ve an 80% chance of detecting it).

Things to note:

  • even for a large (0.8) effect you need close to 30 people (total n = 60) to have 80% power
  • for a medium effect (0.5) this is more like 70 people (total n = 140)
  • the required sample size increases dramatically as effect size drops
  • for small effects, the sample required for 80% is around 400 in each group (total n = 800).

What this means is that if you don’t have a large effect, studies with between-groups analysis and an n of less than 60 aren’t worth running. Even if you are studying a real phenomenon, you aren’t using a statistical lens with enough sensitivity to be able to tell. You’ll get to the end and won’t know if the phenomenon you are looking for isn’t real or if you just got unlucky with who you tested.

Implications for anyone planning an experiment:

  • Is your effect very strong? If so, you may rely on a smaller sample (for illustrative purposes, the effect size of the male-female height difference is ~1.7, so large enough to detect with a small sample. But if your effect is this obvious, why do you need an experiment?)
  • You really should prefer within-sample analysis, whenever possible (power analysis of this left as an exercise)
  • You can get away with smaller samples if you make your measure more reliable, or if you make your manipulation more impactful. Both of these will increase your effect size, the first by narrowing the variance within each group, the second by increasing the distance between them

Technical note: I did this cribbing code from Rob Kabacoff’s helpful page on power analysis. Code for the graph shown here is here. I use and recommend Rstudio.
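If you prefer Python to R, a near-equivalent calculation is below – a sketch assuming the statsmodels package is installed; the rounding differs slightly from the figures quoted above.

    # Required group sizes for 80% power in a two-sample, two-sided t-test.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for d in (0.2, 0.5, 0.8):  # conventional small, medium, large effects
        n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.8,
                                 alternative='two-sided')
        print(f"d = {d}: ~{n:.0f} per group")  # roughly 394, 64 and 26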

Cross-posted from www.tomstafford.staff.shef.ac.uk where I irregularly blog things I think will be useful for undergraduate Psychology students.

Irregularities in Science

A paper in the high-profile journal Science has been alleged to be based on fraudulent data, with the PI calling for it to be retracted. The original paper purported to use survey data to show that people being asked about gay marriage changed their attitudes if they were asked the survey questions by someone who was gay themselves. That may still be true, but the work of a team that set out to replicate the original study seems to show that the data reported in that paper was never collected in the way described, and was at least partly fabricated.

The document containing these accusations is interesting for a number of reasons. It contains a detailed timeline showing how the authors were originally impressed with the study and set out to replicate it, gradually uncovering more and more elements that concerned them and led them to investigate how the original data was generated. The document also reports the exemplary way in which they shared their concerns with the authors of the original paper, and the way the senior author responded. The speed of all this is notable – the investigators only started work on this in January, and did most of the analysis substantiating their concerns this month.

As we examined the study’s data in planning our own studies, two features surprised us: voters’ survey responses exhibit much higher test-retest reliabilities than we have observed in any other panel survey data, and the response and reinterview rates of the panel survey were significantly higher than we expected. We set aside our doubts about the study and awaited the launch of our pilot extension to see if we could manage the same parameters. LaCour and Green were both responsive to requests for advice about design details when queried.

So on the one hand this is a triumph for open science, and for self-correction in scholarship. The irony is that the dishonesty that led to publication in a high-impact journal also attracted people with the desire and smarts to check whether what was reported holds up. But the tragedy is the circumstances that led the junior author of the original study, himself a graduate student at the time, to do what he did. No statement from him is available at this point, as far as I’m aware.

The original: When contact changes minds: An experiment on transmission of support for gay equality

The accusations and retraction request: Irregularities in LaCour (2014)

Sampling error’s more dangerous friend

As the UK election results roll in, one of the big shocks is the discrepancy between the pre-election polls and the results. All the pollsters agreed that it would be incredibly close, and they were all wrong. What gives?

Some essential psych 101 concepts come in useful here. Polls rely on sampling – the basic idea being that you don’t have to ask everyone to get a rough idea of how things are going to go. How rough that idea is depends on how many you ask. This is the issue of sampling error. We understand sampling error – you can estimate it, so as well as reducing this error by taking larger samples there are also principled ways of working out when you’ve asked enough people to get a reliable estimate (which is why polls of a country with a population of 70 million can still be accurate with samples in the thousands).

But, as Tim Harford points out in this excellent article on sampling problems in big data, with every sample there are two sources of unreliability. Sampling error, as I’ve mentioned, but also sampling bias.

sampling error has a far more dangerous friend: sampling bias. Sampling error is when a randomly chosen sample doesn’t reflect the underlying population purely by chance; sampling bias is when the sample isn’t randomly chosen at all.

The problem with sample bias is that, when you don’t know the ground truth, there is no principled way of knowing if your sample is biased. If your sample has some systematic bias in it, you can make a reliable estimate (minimising sampling error), but you are still left with the sample bias – and you don’t know how big that bias is until you find out the truth. That’s my guess at what happened with the UK election. The polls converged, minimising the error, but the bias remained – a ‘shy Tory’ effect, where many voters were not admitting (or not aware) that they would end up voting for the Conservative party.
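A toy simulation makes the point – all the numbers here are invented, not real polling data: when one party’s supporters are less likely to answer the pollster, a bigger sample just shrinks the error bars around the wrong answer.

    # Sample size cures sampling error, not sampling bias (invented numbers).
    import numpy as np

    rng = np.random.default_rng(0)
    true_support = 0.37  # actual share supporting party X

    for n in (100, 1_000, 100_000):
        supports_x = rng.random(n) < true_support
        # X's supporters answer the pollster 40% of the time, others 80%
        answers = rng.random(n) < np.where(supports_x, 0.4, 0.8)
        print(n, round(supports_x[answers].mean(), 3))  # settles near 0.23, not 0.37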

The exit polls predicted the real result with surprising accuracy not because they minimised sampling error, but because they avoided the sample bias. By asking the people who actually turned up to vote how they actually voted, their sample lacked the bias of the pre-election polls.

When society isn’t judging, women’s sex drive rivals men’s

Men just want sex more than women. I’m sure you’ve heard that one. Stephen Fry even went as far as suggesting in 2010 that straight women only went to bed with men because sex was “the price they are willing to pay for a relationship”.

Or perhaps you’ve even heard some of the evidence. In 1978 two psychologists, Russell Clark and Elaine Hatfield, did what became a famous experiment on the topic – not least because it demonstrated how much fun you can have as a social psychologist. Clark and Hatfield had student volunteers at Florida State University approach people on campus and deliver a pick-up line.

The volunteers always began the same way: “I’ve noticed you around campus. I find you to be very attractive”, they said. They then varied what they said next according to one of three randomly chosen options: either “would you go out tonight?”, “will you come over to my apartment?”, or “would you go to bed with me?” (If these phrases sound familiar, it may be because they form the chorus of Touch and Go’s 1990s jazz-pop hit “Would You Go To Bed With Me” – probably the only pop song whose lyrics are lifted entirely from the methods section of a research paper.)

In Clark and Hatfield’s research, both men and women were approached (always by volunteers of the opposite sex). The crucial measure was whether they said yes or no. And you can probably guess the results: although men and women were equally likely to accept the offer of a date (about half said yes and half said no), the two sexes differed dramatically in how they responded to the offer of casual sex. None of the women approached took up the offer of sex with a complete stranger. Three-quarters of the men did (yes, more than were willing to just go on a date with a complete stranger).

A matter of interpretation

But since this experiment, controversy has raged about how it should be interpreted. One school of thought is that men and women make different choices because of different sex drives – sex drives which differ for deeply seated biological reasons to do with the logic of evolution. Because, this logic goes, there is a hard limit on how many children a woman can have, she should be focused on quality in her sexual partners – she wants them to invest in parenting, or at the very least make a high-grade genetic contribution. If she has a child with the wrong partner, she uses up one of a very limited number of opportunities to reproduce. So she should be choosy.

A man, on the other hand, shouldn’t be so concerned about quality. There’s no real limit on the number of children he can have if he has them with different women, so he should grab every sexual opportunity he can, regardless of the partner. The costs are low; there are only benefits.

This evolutionary logic, relentlessly focused as it is on reproduction and survival, does provide a consistent explanation for the differences Clark and Hatfield observed, but it isn’t the only explanation.

The problem is that the participants in this experiment aren’t abstract representatives of all human men and women. They are particular men and women from a particular place and time, who exist in a particular social context – university students in American society at the end of the 20th century. And our society treats men and women very differently. So how about this alternative take: maybe men and women’s sex drives are pretty similar, but the experiment just measures behaviour which is shaped as much by society as by biology.

Taking out the social factor

This month, new research published in the journal Archives of Sexual Behavior gives a vital handle on the question of whether women really don’t want sex as much as men do.

Two German researchers, Andreas Baranowski and Heiko Hecht, replicated the original Clark and Hatfield study, but with some vital changes. First they showed that the original result still held, even among German university students in the 21st century – and they showed that it still held if you asked people in a nightclub rather than on campus. But the pair reasoned that one factor in how women respond to invitations to sex may be fear – fear of reputational damage in a culture which judges women’s sexual activity differently from men’s, and fear of physical harm from an encounter with a male stranger. They cite one study which found that 45% of US women have experienced sexual violence of some kind.

So, in order to find out if women in these experiments were held back by fear, they constructed an elaborate cover scenario designed to make the participants believe they could accept offers of sex without fear of anyone finding out, or of physical danger. Participants were invited into a lab under the ruse that they would be helping a dating company evaluate its compatibility-rating algorithm. They were presented with ten pictures of members of the opposite sex and led to believe that all ten had already agreed to meet up with them (either for a date, or for sex). With these, and a few other convincing details, the experimenters hoped that participants would reveal their true attitudes to dating, or hooking up for sex with, total strangers, unimpeded by fear of what might happen to them if they said yes.

The results were dramatic. Now there was no difference between the dating and the casual sex scenarios: large proportions of both men and women leapt at the chance to meet up with a stranger with the potential for sex – 100% of the men and 97% of the women in the study chose to meet up for a date or sex with at least one partner. The women who thought they had the chance to meet up with men for sex chose an average of slightly fewer than three men who they would like to have an encounter with. The men chose an average of slightly more than three women who they would like to have an encounter with.

Men are from Earth – and so are women

The study strongly suggests that the image of women as sexually choosy and conservative needs some dramatic qualification. In the right experimental circumstances, women’s drive for casual sex looks similar to men’s. Previous experiments had leapt to a conclusion about biology, when they’d actually done experiments on behaviour which is part-determined by society. It’s an important general lesson for anyone who wants to draw conclusions about gender differences, in whatever area of behaviour.

There was still a gender difference in this new experiment – men chose more partners out of ten to meet up with, but still we can’t say that the effect of our culture was washed out. All the people in the experiment were brought up to expect different attitudes to their sexual behaviour based on their gender and to expect different risks of saying yes to sexual encounters (or of saying yes and then changing their minds).

Even with something as biological as sex, when studying human nature it isn’t easy to separate out the effect of society on how we think, feel and act. This new study gives an important update to an old research story which too many have interpreted as saying something about unalterable differences between men and women. The real moral may be about the importance of completely alterable differences in the way society treats men and women.


This article was originally published on The Conversation.
Read the original article.

An instinct for fairness lurking within even the most competitive

It stings when life’s not fair – but what happens if it means we profit? As Tom Stafford writes, some people may perform unexpected self-sabotage.

Frans de Waal, a professor of primate behaviour at Emory University, is the unlikely star of a viral video. His academic’s physique, grey jumper and glasses aren’t the usual stuff of a YouTube sensation. But de Waal’s research with monkeys, and its implications for human nature, caught the imagination of millions of people.

It began with a TED talk in which de Waal showed the results of one experiment that involved paying two monkeys unequally. Capuchin monkeys that lived together were taken to neighbouring cages and trained to hand over small stones in return for food rewards. The researchers found that a typical monkey would happily hand over stone after stone when it was rewarded for each exchange with a slice of cucumber.

But capuchin monkeys prefer grapes to cucumber slices. If the researchers paid one of the monkeys in grapes instead, the monkey in the neighbouring cage – previously happy to work for cucumber – became agitated and refused to accept payment in cucumber slices. What had once been acceptable soon became unacceptable when it was clear a neighbour was getting a better reward for the same effort.

The highlight of the video is when the poorly paid monkey throws the cucumber back at the lab assistant trying to offer it as a reward.

You don’t have to be a psychologist to know that humans can feel very much like the poorly paid monkey. Injustice stings. These results and others like them, argues de Waal, show that moral sentiments are part of our biological inheritance, a consequence of an ancestral life that was dominated by egalitarian group living – and the need for harmony between members of the group.

That’s a theory, and de Waal’s result definitely shows that our evolutionary cousins, the monkeys, are strongly influenced by social comparisons. But the experiment doesn’t really provide strong evidence that monkeys want justice. The underpaid monkey gets angry, but we’ve no evidence that the better-paid monkey is unhappy about the situation. In humans, by comparison, we can find stronger evidence that an instinct for fairness can lurk inside the psyche of even the most competitive of us.

The players in the National Basketball Association (NBA) in the USA rank as some of the highest-earning sportspeople in the world. In the 2007-08 season the best paid of them received salaries in excess of $20 million (£13.5 million), and more than 50 members of the league had salaries of $10 million (£6.7 million) or more.

The 2007-08 season is interesting because that is when psychologists Graeme Haynes and Thomas Gilovich reviewed recordings of more than 100 NBA games, looking for occasions when fouls were called by the referees but it was clear to the players that no foul had actually been committed. Whenever a foul is called, the wronged player gets a number of free throws – chances to score points for their team. Haynes and Gilovich were interested in how these ultra-competitive, highly paid sportsmen reacted to being awarded free throws when they knew that they didn’t really deserve them.

Missed shot

These guys had every incentive to make the most of the free throws, however unfairly gained: after all, they make their living from winning, and the points gained from free throws could settle a match. Yet Haynes and Gilovich found that players’ accuracy from unfairly awarded free throws was unusually low. It was down compared to the free throw league average, and down compared to the individual players’ free throw personal averages. Accuracy on unfairly awarded free throws was lowest when the player’s team was ahead and didn’t need the points so much. But tellingly, it was also lower than average when the team was behind and in need of points – whether honestly or dishonestly gained.

If players in one of the most competitive and best-paid sports can apparently be put off by guilt, it suggests to me that an instinct for fairness can survive even the most ruthless environments.

At the end of the monkey clip, de Waal jokes that the behaviour parallels the way people have staged protests against Wall Street, and the greed they see there. And he’s right that our discomfort with unequal pay may be as deeply set as the monkey’s.

Yet perhaps these feelings run even deeper. The analysis of the basketball players suggests that when we stand to benefit from injustices – even if they can help justify multi-million dollar salaries – some part of us is uncomfortable with the situation, and may even work to undermine that advantage.

So don’t give up on the bankers and the multi-millionaire athletes just yet.

This is my latest column for BBC Future. The original is here.

Mind Hacks excerpts x 2

This month, Business Insider have republished a couple of chapters from Mind Hacks the book (in case you missed it, back before the blog, Mind Hacks was a book of 101 do-it-at-home psychology experiences). The excerpts are:

1. Why one of these puzzles is easy and the other is hard – which is about the Wason Selection Task, a famous example of how our ability to reason logically can be confounded (and unconfounded if you find the right format to present a problem in).

2. Why this sentence is hard to understand – which shows you how to improve your writing with a bit of elementary psychology (hint: it is about reducing working memory load). Steven Pinker covers the same advice in his new book The Sense of Style (2014).

Both excerpts show off some of the neat illustrations done for the book, as well as being a personal nostalgia trip for yours truly (it’s been ten years!)

Links: Why this sentence is hard to understand + Why one of these puzzles is easy and the other is hard