Scientific Credibility and The Kardashian Index

 

The Kardashian index is a semi-humorous metric invented to reveal how much trust you should put in a scientist with a public image.

In ‘The Kardashian index: a measure of discrepant social media profile for scientists’, the author writes:

I am concerned that phenomena similar to that of Kim Kardashian may also exist in the scientific community. I think it is possible that there are individuals who are famous for being famous

and

a high K-index is a warning to the community that researcher X may have built their public profile on shaky foundations, while a very low K-index suggests that a scientist is being undervalued. Here, I propose that those people whose K-index is greater than 5 can be considered ‘Science Kardashians’

Figure 1 from Hall, N. (2014). The Kardashian index: a measure of discrepant social media profile for scientists. Genome biology, 15(7), 424.

Your Kardashian index is calculated from your number of Twitter followers and the number of citations your scholarly papers have. You can use the ‘Kardashian Index Calculator’ to find out your own Kardashian index, if you have a Twitter account and a Google Scholar profile.
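As a sketch of the arithmetic (assuming the relationship reported in Hall's paper, where the number of followers ‘expected’ for a scientist with C citations is fitted as F(C) = 43.3 × C^0.32):

```python
def kardashian_index(followers: float, citations: float) -> float:
    """K-index: actual Twitter followers divided by the followers
    'expected' from citation count, F(C) = 43.3 * C ** 0.32
    (the relationship fitted in Hall, 2014)."""
    expected_followers = 43.3 * citations ** 0.32
    return followers / expected_followers

# A scientist with 20,000 followers but only 100 citations:
# expected followers ~ 189, so K ~ 106 - a 'Science Kardashian' (K > 5).
print(kardashian_index(20000, 100))
```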

The implication of the Kardashian index is that the foundation of someone’s contribution to public debate about science is their academic publishing. But public debate and scholarly debate are rightfully different things, even if related. To think that only scientists should be listened to in public debate is to think that other forms of skill and expertise aren’t relevant, including the skill of translating between different domains of expertise.

Communicating scientific topics, explaining and interpreting new findings, and understanding the relevance of science to people’s lives and of people’s lives to science are skills in themselves. The Kardashian index ignores that, and so undervalues it.

Full disclosure: My Kardashian Index is 25.

Open Science Essentials: The Open Science Framework

Open science essentials in 2 minutes, part 2

The Open Science Framework (osf.io) is a website designed for the complete life-cycle of your research project – designing projects; collaborating; collecting, storing and sharing data; sharing analysis scripts, stimuli and results; and publishing findings.

You can read more about the rationale for the site here.

Open Science is fast becoming the new standard for science. As I see it, there are two major drivers of this:

1. Distributing your results via a slim journal article is a format that dates from the 17th century, when constraints on the timing, speed and volume of scholarly communication applied. Those constraints no longer exist. In short, there is now no reason not to share your full materials, data, and analysis scripts.

2. The replicability crisis means that how people interpret research is changing. Obviously sharing your work doesn’t automatically make it reliable, but since it is a costly signal, it is a good sign that you take the reliability of your work seriously.

You could share aspects of your work in many ways, but the OSF has many benefits:

  • the OSF is backed by serious money and institutional support, so the online side of your project will stay live for many years after you publish the link
  • it integrates with various other platforms (GitHub, Dropbox, the PsyArXiv preprint server)
  • it is totally free, run for scientists by scientists as a non-profit

All this, and the OSF also makes things like version control and pre-registration easy.

Good science is open science. And the fringe benefit is that making materials open forces you to properly document everything, which makes you a better collaborator with your number one research partner – your future self.

Cross-posted at tomstafford.staff.shef.ac.uk.  Part of a series aimed at graduate students in psychology. Part 1: pre-registration.

 

Open Science Essentials: pre-registration

Open Science essentials in 2 minutes, part 1

The Problem

As a scholarly community we allowed ourselves to forget the distinction between exploratory and confirmatory research, presenting exploratory results as confirmatory, and presenting post-hoc rationales as predictions. As well as being dishonest, this makes for unreliable science.

Flexibility in how you analyse your data (“researcher degrees of freedom”) can invalidate statistical inferences.

Importantly, you can employ questionable research practices like this (“p-hacking”) without knowing you are doing it. Decide to stop an analysis because the results are significant? Measure three dependent variables and use the one that “works”? Exclude participants who don’t respond to your manipulation? All are justified in exploratory research, but they mean you are exploring a garden of forking paths in the space of possible analyses – when you arrive at a significant result, you won’t be sure whether you got there because of the data or because of your choices.
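A toy simulation makes the “three dependent variables” example concrete: on pure noise, testing a single DV gives roughly the nominal 5% false-positive rate, but reporting whichever of three DVs “works” roughly triples it. (A sketch using z-tests on simulated null data; the numbers, not the code, are the point.)

```python
import math
import random

random.seed(1)

def significant(n=30, z_crit=1.96):
    """One 'study' on null data: two-sided z-test on the mean of
    n standard normal observations (nominal alpha = .05)."""
    z = sum(random.gauss(0, 1) for _ in range(n)) / math.sqrt(n)
    return abs(z) > z_crit

trials = 10000
# Honest analysis: one pre-specified dependent variable.
one_dv = sum(significant() for _ in range(trials)) / trials
# p-hacked analysis: measure three DVs, report any that 'works'.
best_of_three = sum(any(significant() for _ in range(3))
                    for _ in range(trials)) / trials
print(one_dv, best_of_three)  # roughly 0.05 vs roughly 0.14
```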

The solution

There is a solution – pre-registration. Declare in advance the details of your method and your analysis: sample size, exclusion conditions, dependent variables, directional predictions.

You can do this

Pre-registration is easy. There is no single, universally accepted, way to do it.

  • you could write your data collection and analysis plan down and post it on your blog.
  • you can use the Open Science Framework to timestamp and archive a pre-registration, so you can prove you made a prediction ahead of time.
  • you can visit AsPredicted.org, which provides a form to complete that will help you structure your pre-registration (making sure you include all relevant information).
  • “Registered Reports”: more and more journals are committing to publishing pre-registered studies. They review the method and analysis plan before data collection and agree to publish once the results are in (however they turn out).

You should do this

Why do this?

  • credibility – other researchers (and journals) will know you predicted the results before you got them.
  • you can still do exploratory analysis – pre-registration just makes it clear which is which.
  • forces you to think about the analysis before collecting the data (a great benefit).
  • more confidence in your results.

Further reading

 

Addendum 14/11/17

As luck would have it, I stumbled across a bunch of useful extra resources in the days after publishing this post.

Cross-posted at tomstafford.staff.shef.ac.uk.  Part of a series aimed at graduate students in psychology. Part 2: The Open Science Framework

Why we need to get better at critiquing psychiatric diagnosis

This piece is based on my talk to the UCL conference ‘The Role of Diagnosis in Clinical Psychology’. It was aimed at an audience of clinical psychologists but should be of interest more widely.

I’ve been a long-term critic of psychiatric diagnoses but I’ve become increasingly frustrated by the myths and over-generalisations that get repeated and recycled in the diagnosis debate.

So, in this post, I want to tackle some of these before going on to suggest how we can critique diagnosis more effectively. I’m going to be referencing the DSM-5 but the examples I mention apply more widely.

“There are no biological tests for psychiatric diagnoses”

“The failure of decades of basic science research to reveal any specific biological or psychological marker that identifies a psychiatric diagnosis is well recognised” wrote Sami Timimi in the International Journal of Clinical and Health Psychology. “Scientists have not identified a biological cause of, or even a reliable biomarker for, any mental disorder” claimed Brett Deacon in Clinical Psychology Review. “Indeed”, he continued, “not one biological test appears as a diagnostic criterion in the current DSM-IV-TR or in the proposed criteria sets for the forthcoming DSM-5”. Jay Watts, writing in The Guardian, states that “These categories cannot be verified with objective tests”.

Actually there are very few DSM diagnoses for which biological tests are entirely irrelevant. Most use medical tests for differential diagnosis (excluding other causes), some DSM diagnoses require them as one of a number of criteria, and a handful are entirely based on biological tests. You can see this for yourself if you take the radical scientific step of opening the DSM-5 and reading what it actually says.

There are some DSM diagnoses (the minority) for which biological tests are entirely irrelevant. Body dysmorphic disorder (p242), for example – a diagnosis describing people who become overwhelmed with the idea that a part of their body is misshapen or unattractive – is purely based on reported experiences and behaviour. No other criteria are required or relevant.

For most common DSM diagnoses, biological tests are relevant but for the purpose of excluding other causes. For example, in many DSM diagnoses there is a general exclusion that the symptoms must not be attributable to the physiological effects of a substance or another medical condition (this appears in schizophrenia, OCD, generalized anxiety disorder and many, many others). On occasion, very specific biological tests are mentioned. For example, to make a confident diagnosis of panic disorder (p208), the DSM-5 recommends testing serum calcium levels to exclude hyperparathyroidism – which can produce similar symptoms.

Additionally, there are a range of DSM diagnoses for which biomedical tests make up one or more of the formally listed criteria but aren’t essential to make the diagnosis. The DSM diagnosis of narcolepsy (p372) is one example; it has two such criteria: “Hypocretin deficiency, as measured by cerebrospinal fluid (CSF) hypocretin-1 immunoreactivity values of one-third or less of those obtained in healthy subjects using the same assay, or 110 pg/mL or less” and polysomnography showing REM sleep latency of 15 minutes or less. Several other diagnoses work along these lines – where biomedical test results are listed but are not necessary to make the diagnosis: the substance/medication-induced mental disorders, delirium, neuroleptic malignant syndrome, neurocognitive disorders, and so on.

There are also a range of DSM diagnoses that are not solely based on biomedical tests but for which positive test results are necessary for the diagnosis. Anorexia nervosa (p338) is the most obvious, which requires the person to have a BMI of less than 17, but this applies to various sleep disorders (e.g. REM sleep disorder which requires a positive polysomnography or actigraphy finding) and some disorders due to other medical conditions. For example, neurocognitive disorder due to prion disease (p634) requires a brain scan or blood test.

There are some DSM diagnoses which are based exclusively on biological test results. These are a number of sleep disorders (obstructive sleep apnea hypopnea, central sleep apnea and sleep-related hypoventilation, all diagnosed with polysomnography).

“Psychiatric diagnoses ‘label distress'”

The DSM, wrote Peter Kinderman and colleagues in Evidence-Based Mental Health is a “franchise for the classification and diagnosis of human distress”. The “ICD is based on exactly the same principles as the DSM” argued Lucy Johnstone, “Both systems are about describing people’s distress in terms of medical diagnosis”

In reality, some psychiatric diagnoses do classify distress, some don’t.

Here is a common criterion in many DSM diagnoses: “The symptoms cause clinically significant distress or impairment in social, occupational or other important areas of functioning”

The theory behind this is that some experiences or behaviours are not considered of medical interest unless they cause you problems, which is defined as distress or impairment. Note, however, that it is one or the other: it is still possible to be diagnosed if you’re not distressed but find these experiences or behaviours get in the way of everyday life.

However, there are a whole range of DSM diagnoses for which distress plays no part in making the diagnosis.

Here is a non-exhaustive list: Schizophrenia, Tic Disorders, Delusional Disorder, Developmental Coordination Disorder, Brief Psychotic Disorder, Schizophreniform Disorder, Manic Episode, Hypomanic Episode, Schizoid Personality Disorder, Antisocial Personality Disorder, and so on. There are many more.

Does the DSM ‘label distress’? Sometimes. Do all psychiatric diagnoses? No they don’t.

“Psychiatric diagnoses are not reliable”

The graph below shows the inter-rater reliability results from the DSM-5 field trial study. They use a statistical test called Cohen’s kappa to measure how well two independent psychiatrists, assessing the same individual through an open interview, agree on a particular diagnosis. A score above 0.8 is usually considered the gold standard; the field trials rated anything above 0.6 as being in the acceptable range.

The results are atrocious. This graph is often touted as evidence that psychiatric diagnoses can’t be made reliably.

However, here are the results from a study that tested diagnostic agreement on a range of DSM-5 diagnoses when psychiatrists used a structured interview assessment. Look down the ‘κ’ column for the reliability results. Suddenly they are much better and are all within the acceptable to excellent range.

This is well-known in mental health and medicine as a whole. If you want consistency, you have to use a structured assessment method.
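For reference, Cohen’s kappa is just the raw agreement between two raters corrected for the agreement they would reach by chance, given how often each uses each label. A minimal sketch (the diagnosis labels are invented for illustration):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    # Observed agreement: proportion of cases where the raters concur.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement expected from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Two clinicians diagnosing the same four patients (hypothetical labels):
print(cohens_kappa(["MDD", "MDD", "GAD", "GAD"],
                   ["MDD", "MDD", "GAD", "MDD"]))  # 0.5
```

Raw agreement here is 75%, but because both raters use ‘MDD’ often they would agree half the time by chance, so kappa is only 0.5 – which is why kappa, not percentage agreement, is reported in reliability studies.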

While we’re here, let’s tackle an implicit assumption that underlies many of these critiques: supposedly, psychiatric diagnoses are fuzzy and unreliable, whereas the rest of medicine makes cut-and-dried diagnoses based on unequivocal medical test results.

This is a myth based on ignorance about how medical diagnoses are made – almost all involve human judgement. Just look at the between-doctor agreement results for some diagnoses in the rest of medicine (which include the use of biomedical tests):

Diagnosis of infection at the site of surgery (0.44), features of spinal tumours (0.19 – 0.59), bone fractures in children (0.71), rectal bleeding (0.42), paediatric stroke (0.61), osteoarthritis in the hand (0.60 – 0.82). There are many more examples in the medical literature which you can see for yourself.

The reliability of DSM-5 diagnoses is typically poor for ‘off the top of the head’ diagnosis but this can be markedly improved by using a formal diagnostic assessment. This doesn’t seem to be any different from the rest of medicine.

“Psychiatric diagnoses are not valid because they are decided by a committee”

I’m sorry to break it to you, but all medical diagnoses are decided by committee.

These committees shift the boundaries, revise, reject and resurrect diagnoses across medicine. The European Society of Cardiology revise the diagnostic criteria for heart failure and related problems on a yearly basis. The International League Against Epilepsy revise their diagnoses of different epilepsies frequently – they just published their revised manual earlier this year. In 2014 they broadened the diagnostic criteria for epilepsy meaning more people are now classified as having epilepsy. Nothing changed in people’s brains, they just made a group decision.

In fact, if you look at the medical literature, it’s abuzz with committees deciding, revising and rejecting diagnostic criteria for medical problems across the board.

Humans are not cut-and-dried. Neither are most illnesses, diseases and injuries, and decisions about what a particular diagnosis should include are always a trade-off between measurement accuracy, suffering, outcome, and the potential benefits of intervention. This gets revised by a committee, who examine the best evidence and come to a consensus on what should count as a medically-relevant problem.

These committees aren’t perfect. They sometimes suffer from fads and groupthink, and pharmaceutical industry conflicts of interest are a constant concern. I would argue that psychiatry is more prone to fads and pressure from pharmaceutical company interests than some other areas of medicine, although it’s probably not the worst (surgery is notoriously bad in this regard). However, having a diagnosis decided by committee doesn’t make it invalid. Actually, on balance, it’s probably the least worst way of doing it.

“Psychiatric diagnoses are not valid because they’re based on experience, behaviour or value judgements”

We’ve discussed above how DSM diagnoses rely on medical tests to varying degrees. But the flip side of this is that there are many non-psychiatric diagnoses which are also based only on classifying experience and/or behaviour. If you think this makes a diagnosis invalid or ‘not a real illness’ I look forward to your forthcoming campaigning to remove the diagnoses of tinnitus, sensory loss, many pain syndromes, headache, vertigo and the primary dystonias, for example.

To complicate things further, we know some diseases have a clear basis in terms of tissue damage but the diagnosis is purely based on experience and/or behaviour. The diagnosis of Parkinson’s disease, for example, is made this way and there are no biomedical tests that confirm the condition, despite the fact that studies have shown it occurs due to a breakdown of dopamine neurons in the nigrostriatal pathway of the brain.

At this point, someone usually says “but no one doubts that HIV or tuberculosis are diseases, whereas psychiatric diagnosis involves arbitrary decisions about what is considered pathological”. Cranks aside, the first part is true. It’s widely accepted – rightly so – that HIV and tuberculosis are diseases. However, it’s interesting how many critics of psychiatric diagnosis seem to have infectious diseases as their comparison for what constitutes a ‘genuine medical condition’ when infectious diseases are only a small minority of the diagnoses in medicine.

Even here though, subjectivity still plays a part. Rather than focusing on a single viral or bacterial infection, think of all viruses and bacteria. Now ask, which should be classified as diseases? This is not as cut-and-dried as you might think because humans are awash with viruses and bacteria, some helpful, some unhelpful, some irrelevant to our well-being. Ed Yong’s book I Contain Multitudes is brilliant on this if you want to know more about the massive complexity of our microbiome and how it relates to our well-being.

So the question for infectious disease experts is: at what point does an unhelpful virus or bacterium become a disease? This involves making judgements about what should be considered a ‘negative effect’. Some are easy calls to make – mortality statistics are a fairly good yardstick. No one’s argued over the status of Ebola as a disease. But some cases are not so clear. In fact, the criteria for what constitutes a disease, formally discussed as how to classify the pathogenicity of microorganisms, can be found as a lively debate in the medical literature.

So all diagnoses in medicine involve a consensus judgement about what counts as ‘bad for us’. There is no biological test that can answer this question in all cases. Value judgements are certainly more common in psychiatry than in infectious diseases, though probably less so than in plastic surgery – but no diagnosis is value-free.

“Psychiatric diagnosis isn’t valid because of the following reasons…”

Debating the validity of diagnoses is a good thing. In fact, it’s essential we do it. Lots of DSM diagnoses, as I’ve argued before, poorly predict outcome, and sometimes barely hang together conceptually. But there is no general criticism that applies to all psychiatric diagnoses. Rather than going through all the diagnoses in detail, look at the following list of DSM-5 diagnoses and ask yourself whether the same commonly made criticisms about ‘psychiatric diagnosis’ could be applied to them all:

Tourette’s syndrome, Insomnia, Erectile Disorder, Schizophrenia, Bipolar, Autism, Dyslexia, Stuttering, Enuresis, Catatonia, PTSD, Pica, Sleep Apnea, Pyromania, Medication-Induced Acute Dystonia, Intermittent Explosive Disorder

Does psychiatric diagnosis medicalise distress arising from social hardship? Hard to see how this applies to stuttering and Tourette’s syndrome. Is psychiatric diagnosis used to oppress people who behave differently? If this applies to sleep apnea, I must have missed the protests. Does psychiatric diagnosis privilege biomedical explanations? I’m not sure this applies to PTSD.

There are many good critiques of the validity of specific psychiatric diagnoses, but it’s impossible to see how they apply to all diagnoses.

How can we criticise psychiatric diagnosis better?

I want to make clear here that I’m not a ‘defender’ of psychiatric diagnosis. On a personal basis, I’m happy for people to use whatever framework they find useful to understand their own experiences. On a scientific basis, some diagnoses seem reasonable but many are a really poor guide to human nature and its challenges. For example, I would agree with other psychosis researchers that the days of schizophrenia being a useful diagnosis are numbered. By the way, this is not a particularly radical position – it has been one of the major pillars of the science of cognitive neuropsychiatry since it was founded.

However, I would like to think I am a defender of actually engaging with what you’re criticising. So here’s how I think we could move the diagnosis debate on.

Firstly, RTFM. Read the fucking manual. I’m sorry, but I’ve got no time for criticisms that can be refuted simply by looking at the thing you’re criticising. Saying there are no biological tests for DSM diagnoses is embarrassing when some are listed in the manual. Saying the DSM is about ‘labelling distress’ when many DSM diagnoses do not will get nothing more than an eye roll from me.

Secondly, we need to be explicit about what we’re criticising. If someone is criticising ‘psychiatric diagnosis’ as a whole, they’re almost certainly talking nonsense because it’s a massively diverse field. Our criticisms about medicalisation, poor predictive validity and biomedical privilege may apply very well to schizophrenia, but they make little sense when we’re talking about sleep apnea or stuttering. Diagnosis can really only be coherently criticised on a case-by-case basis, or where you have demonstrated that a particular group of diagnoses share particular characteristics – but you have to establish this first.

As an aside, restricting our criticisms to ‘functional psychiatric diagnosis’ will not suddenly make these arguments coherent. ‘Functional psychiatric diagnoses’ include Tourette’s syndrome, stuttering, dyslexia, erectile disorder, enuresis, pica and insomnia to name but a few. Throwing them in front of the same critical cross-hairs as borderline personality disorder makes no sense. I did a whole talk on this if you want to check it out.

Thirdly, let’s stop pretending this isn’t about power and inter-professional rivalries. Many people have written very lucidly about how diagnosis is one of the supporting pillars in the power structure of psychiatry. This is true. The whole point of structural analysis is that concept, practice and power are intertwined. When we criticise diagnosis, we are attacking the social power of psychiatry. This is not a reason to avoid it, and doesn’t mean this is the primary motivation, but we need to be aware of what we’re doing. Pretending we’re criticising diagnosis but not taking a swing at psychiatry is like calling someone ugly but saying it’s nothing against them personally. We should be working for a better and more equitable approach to mental health – and that includes respectful and conscious awareness of the wider implications of our actions.

Also, let’s not pretend psychology isn’t full of classifications. Just because they’re not published by the APA, doesn’t mean they’re any more valid or have the potential to be any more damaging (or indeed, the potential to be any more liberating). And if you are really against classifying experience and behaviour in any way, I recommend you stop using language, because it relies on exactly this.

Most importantly though, this really isn’t about us as professionals. The people most affected by these debates are ultimately people with mental health problems, often with the least power to make a difference to what’s happening. This needs to change and we need to respect and include a diversity of opinion and lived experience concerning the value of diagnosis. Some people say that having a psychiatric diagnosis is like someone holding their head below water, others say it’s the only thing that keeps their head above water. We need a system that supports everyone.

Finally, I think we’d be better off if we treated diagnoses more like tools, and less like ideologies. They may be more or less helpful in different situations, and at different times, and for different people, and we should strive to ensure a range of options are available to people who need them, both diagnostic and non-diagnostic. Each tested and refined with science, meaning, lived experience, and ethics.

Should we stop saying ‘commit’ suicide?

There is a movement in mental health to avoid the phrase ‘commit suicide’. It is claimed that the word ‘commit’ refers to a crime and this increases the stigma for what’s often an act of desperation that deserves compassion, rather than condemnation.

The Samaritans’ media guidelines discourage using the phrase, advising: “Avoid labelling a death as someone having ‘committed suicide’. The word ‘commit’ in the context of suicide is factually incorrect because it is no longer illegal”. An article in the Australian Psychological Society’s InPsych magazine recommended against it because the word ‘commit’ signifies not only a crime but a religious sin. There are many more such claims.

However, on the surface level, claims that the word ‘commit’ necessarily indicates a crime are clearly wrong. We can ‘commit money’ or ‘commit errors’, for instance, where no crime is implied. The dictionary entry for ‘commit’ (e.g. see the definition at the OED) has entries related to ‘committing a crime’ as only a few of its many meanings.

But we can probably do a little better when considering the potentially stigmatising effects of language than simply comparing examples.

One approach is to see how the word is actually used by examining a corpus of the English language – a database of written and transcribed spoken language – and using a technique called collocation analysis that looks at which words appear together.
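In miniature, the right-hand side of such an analysis is just counting which word immediately follows the node word across the corpus (real corpus tools also use wider windows and significance scores, but the idea is the same; the toy ‘corpus’ below is invented):

```python
from collections import Counter

def collocates_after(tokens, node="commit"):
    """Count the words that immediately follow the node word."""
    return Counter(nxt for word, nxt in zip(tokens, tokens[1:])
                   if word == node)

corpus = ("they commit crimes while others commit errors "
          "and we commit money to commit crimes").split()
print(collocates_after(corpus).most_common())
# [('crimes', 2), ('errors', 1), ('money', 1)]
```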

I’ve used the Corpus of Contemporary American English collocation analysis for the results below and you can do the analysis yourself if you want to see what it looks like.

So here are the top 30 words that follow the word ‘commit’, in order of frequency in the corpus.

Some of the words are clearly parts of phrases (‘commit ourselves…’) rather than directly referring to actions, but you can see that the most common two-word phrase is ‘commit suicide’ by a very large margin.

If we take this example, the argument for not using ‘commit suicide’ gets a bit circular, but if we look at the other named actions as a whole, they’re all crimes or potential crimes. Essentially, they’re all fairly nasty.

If you do the analysis yourself (and you’ll have to go to the website and type in the details, you can’t link directly) you’ll see that non-criminal actions don’t appear until fairly low down the list, way past the 30 listed here.

So ‘commit’ typically refers to antisocial and criminal acts. Saying ‘commit suicide’ probably brings some of that baggage with it and we’re likely to be better off moving away from it.

It’s worth saying, I’m not a fan of prohibitions on words or phrases, as it tends to silence people who have only colloquial language at their disposal to advocate for themselves.

As this probably includes most people with mental health problems, only a minority of which will be plugged into debates around language, perhaps we are better off thinking about moving language forward rather than punishing the non-conforming.

Don’t speculate on others’ mental health from afar

In The Guardian, Nick Davis makes a clear and timely case for affirming The Goldwater Rule. The Rule, which binds members of the American Psychiatric Association, forbids giving an opinion on the mental state of someone you have not examined.

The US president’s behaviour has brought the rule back into the public eye, but Davis argues that we shouldn’t lose sight of the importance of the Rule, and how it protects us all from speculation about our mental health – speculation which is often flavoured by simple prejudice.

Read the article here: The Goldwater rule: why commenting on mental health from a distance is unhelpful 

The Enigma of Reason (review)

The Enigma of Reason: A New Theory of Human Understanding, by Hugo Mercier and Dan Sperber was published in April, and I have a review in this week’s Times Higher Education.

The book follows on from and expands on their landmark ‘Why do humans reason? Arguments for an argumentative theory’, published in 2011 in Behavioral and Brain Sciences.

The core of the argumentative theory is this (quoting my review):

reasoning is primarily a social, rather than an individual, tool. Here the purpose of reasoning is not inquisitive, but instead justificatory – we provide reasons to other people, and we evaluate the reasons provided by other people. The niche of reasoning is in the highly social world of human cooperative groups, a niche where it is highly advantageous to be able to transfer information and trust between individuals who are not kin

You can read the full review on the THE site, but I highly recommend checking out the book. It’s a fantastic example of a book which has both theoretical depth and reach, connecting fundamental theoretical perspectives across cognitive science to give a provocative and satisfying account of the nature of human reasoning.

You can also check out Hugo Mercier’s pages about the argumentative theory, which has links to experiments suggested by the theory (which have by and large confirmed predictions it makes).

Not the psychology of Joe average terrorist

News reports have been covering a fascinating study on the moral reasoning of ‘terrorists’ published in Nature Human Behaviour but it’s worth being aware of the wider context to understand what it means.

Firstly, it’s important to highlight how impressive this study is. The researchers, led by Sandra Baez, managed to complete the remarkably difficult task of getting access to, and recruiting, 66 jailed paramilitary fighters from the Colombian armed conflict to participate in the study.

They compared this group to 66 matched ‘civilians’ with no criminal background and 13 jailed murderers with no paramilitary connections, on a moral reasoning task.

The task involved 24 scenarios that varied in two important ways: harm and no harm, and intended and unintended actions. Meaning the researchers could compare across four situations – no harm, accidental harm, unsuccessfully attempted harm, and successfully attempted harm.

A consistent finding was that paramilitary participants judged accidental harm as less acceptable, and intentional harm as more acceptable, than the other groups did, indicating a distortion in moral reasoning.

They also measured cognitive function, emotion recognition and aggressive tendencies and found that when these measures were included in the analysis, they couldn’t account for the results.

One slightly curious thing in the paper though, and something the media has run with, is that the authors describe the background of the paramilitary participants and then discuss the implications for understanding ‘terrorists’ throughout.

But some context on the Colombian armed conflict is needed here.

The participants were right-wing paramilitaries who took part in the demobilisation agreement of 2003. This makes them members of the Autodefensas Unidas de Colombia or AUC – a now defunct organisation initially formed by drug traffickers and land owners to combat the extortion and kidnapping of the left-wing Marxist paramilitary organisations – most notably the FARC.

The organisation was paramilitary in the traditional sense – with uniforms, a command structure, local and regional divisions, national commanders, and written statutes. It involved itself in drug trafficking, extortion, torture, massacres, targeted killings, and ‘social cleansing’ of civilians assumed to be undesirable (homeless people, people with HIV, drug users etc) and killings of people thought to support left-wing causes. Fighters were paid and most signed up for economic reasons.

It was indeed designated a terrorist organisation by the US and EU, although within Colombia they enjoyed significant support from mainstream politicians (the reverberations of which are still being felt) and there is widespread evidence of collusion with the Colombian security forces of the time.

Also, considering that a great deal of military and paramilitary training is about re-aligning moral judgements, it’s not clear how well you can generalise these results to terrorists in general.

It is probably unlikely that the moral reasoning of people who participated in this study is akin to, for example, the jihadi terrorists who have mounted semi-regular attacks in Europe over the last few years. Or alternatively, it is not clear how ‘acceptable harm’ moral reasoning applies across different contexts in different groups.

Even within Colombia you can see how the terrorist label is not a reliable classification of a particular group’s actions and culture. Los Urabeños are the biggest drug trafficking organisation in Colombia at the moment. They are essentially the Centauros Bloc of the AUC, who didn’t demobilise and just changed their name. They are involved in very similar activities.

Importantly, they are not classified as a terrorist organisation, despite being virtually the same organisation from which members were recruited into this study.

I would guess these results are probably more directly relevant in understanding paramilitary criminal organisations, like the Sinaloa Cartel in Mexico, than more ideologically-oriented groups that claim political or religious motivations, although it would be fascinating if they did generalise.

So what this study provides is a massively useful step forward in understanding moral reasoning in this particular paramilitary group, and the extent to which this applies to other terrorist, paramilitary or criminal groups is an open question.
 

Link to open access study in Nature Human Behaviour.

What triggers that feeling of being watched?

You feel somebody is looking at you, but you don’t know why. The explanation lies in some intriguing neuroscience and the study of a strange form of brain injury.

Something makes you turn and see someone watching you. Perhaps on a busy train, or at night, or when you’re strolling through the park. How did you know you were being watched? It can feel like an intuition which is separate from your senses, but really it demonstrates that your senses – particularly vision – can work in mysterious ways.

Intuitively, many of us might imagine that when you look at something with your eyes, signals travel to your visual cortex and then you have the conscious experience of seeing it, but the reality is far weirder.

Once information leaves our eyes it travels to at least 10 distinct brain areas, each with their own specialised functions. Many of us have heard of the visual cortex, a large region at the back of the brain which gets most attention from neuroscientists. The visual cortex supports our conscious vision, processing colour and fine detail to help produce the rich impression of the world we enjoy. But other parts of our brain are also processing different pieces of information, and these can be working away even when we don’t – or can’t – consciously perceive something.

The survivors of neural injury can cast some light on these mechanisms. When an accident damages the visual cortex, your vision is affected. If you lose all of your visual cortex you will lose all conscious vision, becoming what neurologists call ‘cortically blind’. But, unlike losing your eyes, being cortically blind is only mostly blind – the non-cortical visual areas can still operate. Although you can’t have the subjective impression of seeing anything without a visual cortex, you can respond to things captured by your eyes that are processed by these other brain areas.

In 1974 a researcher called Larry Weiskrantz coined the term ‘blindsight’ for the phenomenon of patients who were still able to respond to visual stimuli despite losing all conscious vision due to destruction of the visual cortex. Patients like this can’t read or watch films or anything requiring processing of detail, but they are – if asked to guess – able to locate bright lights in front of them better than mere chance. Although they don’t feel like they can see anything, their ‘guesses’ have a surprising accuracy. Other visual brain areas are able to detect the light and provide information on the location, despite the lack of a visual cortex. Other studies show that people with this condition can detect emotions on faces and looming movements.

More recently, a dramatic study with a blindsight patient has shown how we might be able to feel that we are being looked at without consciously seeing the watcher’s face. Alan J Pegna at Geneva University Hospital, Switzerland, and team worked with a man called TD (patients are always referred to by initials only in scientific studies, to preserve anonymity). TD is a doctor who suffered a stroke which destroyed his visual cortex, leaving him cortically blind.

People with this condition are rare, so TD has taken part in a string of studies to investigate exactly what someone can and can’t do without a visual cortex. The study involved looking at pictures of faces which had their eyes directed forward, looking directly at the viewer, or which had their eyes averted to the side, looking away from the viewer. TD did this task in an fMRI scanner which measured brain activity during the task, and also tried to guess which kind of face he was seeing. Obviously for anyone with normal vision, this task would be trivial – you would have a clear conscious visual impression of the face you were looking at at any one time, but recall that TD has no conscious visual impression. He feels blind.

The scanning results showed that our brains can be sensitive to what our conscious awareness isn’t. An area called the amygdala, thought to be responsible for processing emotions and information about faces, was more active when TD was looking at the faces with direct, rather than averted, gaze. When TD was being watched, his amygdala responded, even though he didn’t know it. (Interestingly, TD’s guesses as to whether he was being watched weren’t above chance, and the researchers put this down to his reluctance to guess.)

Cortical, conscious vision is still king. If you want to recognise individuals, watch films or read articles like this, you are relying on your visual cortex. But research like this shows that certain functions are simpler and maybe more fundamental to survival, and exist separately from our conscious visual awareness.

Specifically, this study showed that we can detect that people are looking at us within our field of view – perhaps in the corner of our eye – even if we haven’t consciously noticed. It shows the brain basis for that subtle feeling that tells us we are being watched.

So when you’re walking that dark road and turn and notice someone standing there, or look up on the train to see someone staring at you, it may be your nonconscious visual system monitoring your environment while your conscious attention was on something else. It may not be supernatural, but it certainly shows the brain works in mysterious ways.

This is my BBC Future column from last week. The original is here.

This map shows what white Europeans associate with race – and it makes for uncomfortable reading

This new map shows how easily white Europeans associate black faces with negative ideas.

Since 2002, hundreds of thousands of people around the world have logged onto a website run by Harvard University called Project Implicit and taken an “implicit association test” (IAT), a rapid-response task which measures how easily you can pair items from different categories.

To create this new map, we used data from a version of the test which presents white or black faces and positive or negative words. The result shows how easily our minds automatically make the link between the categories – what psychologists call an “implicit racial attitude”.

Each country on the map is coloured according to the average score of test takers from that country. Redder countries show higher average bias, bluer countries show lower average bias, as the scale on the top of the map shows.

Like a similar map which had been made for US states, our map shows variation in the extent of racial bias – but all European countries are racially biased when comparing blacks versus whites.

In every country in Europe, people are slower to associate blackness with positive words such as “good” or “nice” and faster to associate blackness with negative concepts such as “bad” or “evil”. But they are quicker to make the link between blackness and negative concepts in the Czech Republic or Lithuania than they are in Slovenia, the UK or Ireland.

No country had an average score below zero, which would reflect positive associations with blackness. In fact, none had an average score that was even close to zero, which would reflect neither positive nor negative racial associations.

A screenshot from the online IAT test.
IAT, Project Implicit

Implicit bias

Overall, we have scores for 288,076 white Europeans, collected between 2002 and 2015, with sample sizes for each country shown on the left-hand side.

Because of the design of the test it is very difficult to deliberately control your score. Many people, including those who sincerely hold non-racist or even anti-racist beliefs, demonstrate positive implicit bias on the test. The exact meaning of implicit attitudes, and the IAT, are controversial, but we believe they reflect the automatic associations we hold in our minds, associations that develop over years of immersion in the social world.

Although we, as individuals, may not hold racist beliefs, the ideas we associate with race may be constructed by a culture which describes people of different ethnicities in consistent ways, and ways which are consistently more or less positive. Looked at like this, the IAT – which at best is a weak measure of individual psychology – may be most useful if individuals’ scores are aggregated to provide a reflection on the collective social world we inhabit.
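As a toy illustration of that aggregation step (the country names are real, but the scores below are invented for illustration – they are not Project Implicit data), each country's colour on the map is driven by a simple mean of individual test scores:

```python
from statistics import mean

# Invented individual IAT D-scores (higher = stronger anti-black association),
# keyed by the test taker's country. The real map uses Project Implicit data.
scores = {
    "UK": [0.25, 0.31, 0.18, 0.40],
    "Czech Republic": [0.52, 0.47, 0.61, 0.44],
}

# One aggregate number per country drives the map's colour scale.
country_means = {country: mean(s) for country, s in scores.items()}
print(country_means)
```

Aggregating this way turns a noisy individual measure into a description of the collective social environment, which is the sense in which we argue the IAT is most informative.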

The results shown in this map give detail to what we already expected – that across Europe racial attitudes are not neutral. Blackness has negative associations for white Europeans, and there are some interesting patterns in how the strength of these negative associations varies across the continent.

North and west Europe, on average, have less strong anti-black associations, although they still have anti-black associations on average. As you move south and east the strength of negative associations tends to increase – but not everywhere. The Balkans look like an exception, compared to surrounding countries. Is this because of some quirk about how people in the Balkans heard about Project Implicit, or because their prejudices aren’t orientated around a white-black axis? For now, we can only speculate.

Open questions

When interpreting the map there are at least two important qualifications to bear in mind.

The first is that the scores only reflect racial attitudes in one dimension: pairing white/black with goodness/badness. Our feelings about ethnicity have many more dimensions which aren’t captured by this measure.

The second is that the data come from Europeans who visit the US Project Implicit website, which is in English. We can be certain that the sample reflects a subset of the European population which is more internet-savvy than is typical. They are probably also younger, and more cosmopolitan. These factors are likely to underweight the extent of implicit racism in each country, so the true levels of implicit racism are probably higher than shown on this map.

This new map is possible because Project Implicit release their data via the Open Science Framework. This site allows scientists to share the raw materials and data from their experiments, allowing anyone to check their working, or re-analyse the data, as we have done here. I believe that open tools and publishing methods like these are necessary to make science better and more reliable.

This article was originally published on The Conversation. Read the original article.

Edit 4/5/17. The colour scale chosen for this map emphasises the differences between countries. While that’s most important for working out what drives IAT scores, the main take-away from the map is that all of Europe is far from neutral. That conclusion is supported by a continuous colour scale, as used in this version of the map here.

An alternative beauty in parenthood

Vela has an amazing essay by a mother of a child with a rare chromosomal deletion. Put aside all your expectations about what this article will be like: it is about the hopes and reality of having a child, but it’s also about so much more.

It’s an insightful commentary on the social expectations foisted upon pregnant women.

It’s about the clash of folk understanding of wellness and the reality of genetic disorders.

It’s about being with your child as they develop in ways that are surprising and sometimes troubling and finding an alternative beauty in parenthood.
 

Link to Vela article SuperBabies Don’t Cry.

neurotransmitter fashion

A graph of scientific articles published per year which mention four major neurotransmitters in their title:

What I take from this is

  • Dopamine is king! And with great popularity, comes great misrepresentation.
  • What happened to glutamate research in the mid 1990s?
  • The recent hype about oxytocin doesn’t seem to be driven by a spike in the primary literature.
  • Nor does the hype about serotonin. Yes, publications increase on this neurotransmitter, but not compared to glutamate. And most people haven’t heard about glutamate, despite it being more abundant.

Technical note: I scraped the data from Google Scholar using scholar.py by Christian Kreibich

Update: here’s the raw data, should you want it
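Should you want to replicate the comparison once you have the scraped counts, here is a minimal sketch of the aggregation step. The numbers below are made up for illustration – they are not the raw data linked above:

```python
from collections import defaultdict

# Hypothetical scraped rows: (year, search term, number of articles with that
# term in the title). Real values would come from the Google Scholar scrape.
rows = [
    (1995, "dopamine", 820), (1995, "serotonin", 610),
    (1995, "glutamate", 700), (1995, "oxytocin", 90),
    (2005, "dopamine", 1400), (2005, "serotonin", 900),
    (2005, "glutamate", 950), (2005, "oxytocin", 160),
]

# Pivot into term -> {year: count} for plotting or inspection.
by_term = defaultdict(dict)
for year, term, count in rows:
    by_term[term][year] = count

# Which term grew fastest between the two years?
growth = {term: counts[2005] / counts[1995] for term, counts in by_term.items()}
fastest = max(growth, key=growth.get)
print(fastest)
```

Relative growth like this is what distinguishes a genuine spike in the primary literature from hype that outruns it.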

hormones, brain and behaviour, a not-so-simple story

There’s a simple story about sex differences in cognition, which traces these back to sex differences in early brain development, which are in turn due to hormone differences. Diagrammatically, it looks something like this:

Cordelia Fine’s “Delusions of Gender” (2010) accuses both scientists and popularisers of science of being too ready to believe overly simple, and biologically fixed, accounts of sex differences in cognition.

There is an undeniable sex difference in foetal testosterone in humans at around 6-8 weeks after conception. In Chapter 9 of her book, Fine introduces Simon Baron-Cohen, who seems to claim that this surge in male hormones is the reason why men are more likely to be Autistic, and why no woman had ever won the Fields Medal. So, diagrammatically:

This account may appear, at first, compelling, perhaps because of its simplicity. But Fine presents us with an antidote for this initial intuition, in the form of the neurodevelopmental story of the spinal nucleus of the bulbocavernosus (SNB), a nucleus in the lumbar spinal cord which controls muscles at the base of the penis.

Even here, the route between hormone, brain difference and behaviour is not so simple, as shown by neat experiments with rats by Celia Moore, described by Fine (p.105 in my edition). Moore showed that male rat pups are licked more by their mothers, and that this licking is due to excess testosterone in their urine. Mothers which couldn’t smell licked male and female pups equally, and female pups injected with testosterone were licked as much as male pups. This licking had an extra developmental effect on the SNB, which could be mimicked by manual brushing of a pup’s perineum. Separate work showed that testosterone doesn’t act directly on the neurons of the SNB, but instead prevents cell death in the SNB by preserving the muscles which it connects to (in males). So, diagrammatically:

One review, summarising what is known about the development of the SNB, writes ‘[There is] a life-long plasticity in even this simple system [and] evidence that adult androgens interact with social experience in order to affect the SNB system’. Not so simple!

What I love about this story is the complexity of developmental causes. Even in the rat, not the human! Even in the subcortex, not the cortex! Even in a nucleus which directly controls a penis reflex. Fine’s implicit question for Baron-Cohen seems to be: if evolution creates this level of complexity for something as important for reproductive function, what is likely for the brain areas responsible for something as selectively irrelevant as winning prizes in mathematics?

Notice also the variety of interactions, not just the number: hormones -> body, body -> sensation in mother’s brain, brain -> behaviour, mother’s behaviour -> pup’s sensation, sensation -> cell growth. This is a developmental story which happens across hormones, brain, body, behaviour and individuals.

Against this example, sex differences in cognition due to early hormone differences look far from inevitable, and the simple hormone-brain-behaviour model looks like a crude summary at best. Whether you take it to mean that sex differences in hormones have multiple routes to generate sex differences in cognition (a ‘small differences add up’ model) or that sex differences in hormones will cancel each other out may depend on your other assumptions about development. At a minimum, the story of the SNB shows that those assumptions are worth checking.

Previously: gender brain blogging

Paper: Moore, C. L., Dou, H., & Juraska, J. M. (1992). Maternal stimulation affects the number of motor neurons in a sexually dimorphic nucleus of the lumbar spinal cord. Brain research, 572(1), 52-56.

Source for the 2009 claim by Baron-Cohen that foetal hormones explain why no woman has won the Fields Medal: Autism test ‘could hit maths skills’.

In 2014 Maryam Mirzakhani won the Fields Medal.

Diagrams made with draw.io

A neuroscientist podcaster explains…

There’s a great ongoing podcast series called A Neuroscientist Explains that looks at some of the most important points of contact between neuroscience and the wider world.

It’s a project of The Guardian Science Weekly podcast and is hosted by brain scientist Daniel Glaser who has an interesting profile – having been a cognitive neuroscientist for many years before moving into the world of art and public engagement.

Glaser takes inspiration from culture and current affairs – which often throws up discussion about the mind or brain – and then looks at these ideas in depth, typically with one of the leading researchers in the field.

Recent episodes on empathy and music have been particularly good (although skip the first episode in the series – unusually, there’s a few clangers in it) and they manage to strike a great balance between outlining the fundamentals while debating the latest ideas and findings.

It seems you can’t link directly to individual episodes, but you can pick them out on the page linked below.
 

Link to ‘A Neuroscientist Explains’

Why women don’t report sexual harassment

Julie A. Woodzicka (Washington and Lee University) and Marianne LaFrance (Yale) report an experiment reminiscent of Milgram’s famous studies of obedience to authority. Reminiscent both because it highlights the gap between how we imagine we’ll respond under pressure and how we actually do respond, and because it’s hard to imagine an ethics review board allowing it.

The study, reported in the Journal of Social Issues in 2001, involved the following (in their own words):

we devised a job interview in which a male interviewer asked female job applicants sexually harassing questions interspersed with more typical questions asked in such contexts.

The three sexually harassing questions were (1) Do you have a boyfriend? (2) Do people find you desirable? and (3) Do you think it is important for women to wear bras to work?

Participants, all women, average age 22, did not know they were in an experiment and were recruited through posters and newspaper adverts for a research assistant position.

The results illuminated what targets of harassment do not do. First, no one refused to answer: Interviewees answered every question irrespective of whether it was harassing or nonharassing. Second, among those asked the harassing questions, few responded with any form of confrontation or repudiation. Nonetheless, the responses revealed a variety of ways that respondents attempted to circumvent the situation posed by harassing questions.

Just as with the Milgram experiment, these results contrast with how participants from a companion study imagined they would respond when the scenario was described to them:

The majority (62%) anticipated that they would either ask the interviewer why he had asked the question or tell him that it was inappropriate. Further, over one quarter of the participants (28%) indicated that they would take more drastic measures by either leaving the interview or rudely confronting the interviewer. Notably, a large number of respondents (68%) indicated that they would refuse to answer at least one of the three harassing questions.

Part of the difference, the researchers argue, is that women imagining the harassing situation over-estimate the anger they will feel. When confronted with actual harassment, fear replaces anger, they claim. Women asked the harassing questions reported significantly higher rates of fear than women asked the merely surprising questions. Coding of facial expressions during the (secretly videoed) interviews revealed that the harassed women also smiled more – fake (non-Duchenne) smiles, presumably aimed at appeasing a harasser that they felt afraid of.

The research report doesn’t indicate what, if any, ethical review process the experiment was subject to.

Obviously it is an important topic, with disturbing and plausible findings. The researchers note that courts have previously interpreted inaction following harassment as indicative of some level of consent. But, despite the real-world relevance, is the topic important enough to justify employing a man to sexually harass unsuspecting women?

Reference: Woodzicka, J. A., & LaFrance, M. (2001). Real versus imagined gender harassment. Journal of Social Issues, 57(1), 15-30.

Previously: a series of Gender Brain Blogging

Much more previously: an essay I wrote arguing that moral failures are often defined by failures of imagination, not of reason: The Narrative Escape

The Social Priming Studies in “Thinking Fast and Slow” are not very replicable

In Daniel Kahneman’s “Thinking Fast and Slow” he introduces research on social priming – the idea that subtle cues in the environment may have significant, reliable effects on behaviour. In that book, published in 2011, Kahneman writes “disbelief is not an option” about these results. Since then, the evidence against the reliability of social priming research has been mounting.

In a new analysis, ‘Reconstruction of a Train Wreck: How Priming Research Went off the Rails‘, Ulrich Schimmack, Moritz Heene, and Kamini Kesavan review chapter 4 of Thinking Fast and Slow, picking out the references which provide evidence for social priming and calculating how statistically reliable they are:

Their conclusion:

The results are eye-opening and jaw-dropping.  The chapter cites 12 articles and 11 of the 12 articles have an R-Index below 50.  The combined analysis of 31 studies reported in the 12 articles shows 100% significant results with average (median) observed power of 57% and an inflation rate of 43%.  …readers of… “Thinking Fast and Slow” should not consider the presented studies as scientific evidence that subtle cues in their environment can have strong effects on their behavior outside their awareness.

The argument is that a pattern of 100% significant results is next to impossible, even if the effects were genuine, given the weak statistical power of the studies to detect true effects.
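A back-of-envelope calculation shows why. If each of the 31 studies independently had roughly the median observed power (57%) as its chance of reaching significance, the probability that all 31 would do so is vanishingly small:

```python
# Probability that all 31 studies reach significance if each has a 0.57
# chance (the median observed power reported in the analysis) of doing so.
p_all_significant = 0.57 ** 31
print(f"{p_all_significant:.1e}")  # roughly 3 in 100 million
```

This independence assumption is a simplification, but it conveys the scale of the problem: a literature that is 100% significant at 57% power has almost certainly been filtered somewhere along the way.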

Remarkably, Kahneman responds in the comments:

What the blog gets absolutely right is that I placed too much faith in underpowered studies. …I have changed my views about the size of behavioral priming effects – they cannot be as large and as robust as my chapter suggested.

The original analysis and Kahneman’s response are worth reading in full. Together they give a potted history of the replication crisis and a summary of some of its prime causes (e.g. file drawer effects), as well as showing how mature psychological scientists can make, and respond to, critique.

Original analysis: ‘Reconstruction of a Train Wreck: How Priming Research Went off the Rails‘, Ulrich Schimmack, Moritz Heene, and Kamini Kesavan. (Is it a paper? Is it a blogpost? Who knows?!)

Kahneman’s response