Open Science Essentials: Reproducibility

Open science essentials in 2 minutes, part 3

Let’s define it this way: reproducibility is when your experiment or data analysis can be reliably repeated. It isn’t replicability, which we can define as repeating an experiment and its analysis, and getting qualitatively similar results with the new data. (These aren’t universally accepted definitions, but they are common, and enough to get us started.)

Reproducibility is a bedrock of science – we all know that our methods section should contain enough detail to allow an independent researcher to repeat our experiment. With the increasing use of computational methods in psychology, there’s increasing need – and increasing ability – for us to share more than just a description of our experiment or analysis.

Reproducible methods

Using sites like the Open Science Framework you can share stimuli and other materials. If you use open-source experiment software like PsychoPy or Tatool, you can easily share the full scripts which run your experiment, so that people on different platforms – and without your software licenses – can still run it.

Reproducible analysis

Equally important is making your analysis reproducible. You’d think that with the same data, another person – or even you in the future – would get the same results. Not so! Most analyses include thousands of small choices. A mis-step in any of them – lost participants, copy/paste errors, mislabeled cases, unclear exclusion criteria – can derail an analysis, meaning you get different results each time (and different results from what you’ve published).

Fortunately a solution is at hand: use analysis software that allows you to write a script converting your raw data into your final output. That means no more Excel sheets (no history of what you’ve done = very bad – don’t be these guys) and no more point-and-click SPSS analysis.
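As a minimal sketch of what this looks like (the file names, column names and exclusion criterion here are invented for illustration), the whole pipeline from raw data to final output lives in one script:

```python
# minimal_analysis.py – a sketch of a scripted analysis pipeline.
# File names, columns and thresholds are hypothetical examples.
import pandas as pd

RAW_FILE = "data/raw_responses.csv"   # raw data: never edited by hand
MIN_ACCURACY = 0.55                   # exclusion criterion, documented here

def main():
    raw = pd.read_csv(RAW_FILE)

    # Every exclusion is explicit and re-runnable, not a one-off edit.
    clean = raw[raw["accuracy"] >= MIN_ACCURACY]
    print(f"Excluded {len(raw) - len(clean)} of {len(raw)} participants")

    # The final output is regenerated from the raw data on every run,
    # so there is a complete record of how you got your numbers.
    summary = clean.groupby("condition")["rt"].agg(["mean", "sem"])
    summary.to_csv("output/summary.csv")

if __name__ == "__main__":
    main()
```

Run the script and you get the same summary table every time; change an exclusion criterion and the change is recorded in the script itself.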

Bottom line: You must script your analysis – trust me on this one

Open data + code

You need to share and document your data and your analysis code. All this is harder work than just writing down the final result of an analysis once you’ve managed to obtain it, but it makes for more robust analysis, and allows someone else to reproduce your analysis easily in the future.
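What does “document” mean in practice? Here is a minimal sketch of one possible project layout (the names and conventions are illustrative, not a standard):

```
project/
├── README.md        # what the project is, how to rerun the analysis
├── data/
│   ├── raw/         # never modified after collection
│   └── codebook.md  # one line per variable: name, units, coding
├── analysis/
│   └── minimal_analysis.py  # regenerates every figure and table
└── output/          # disposable – everything here is recomputed
```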

The most likely beneficiary is you – your most likely collaborator in the future is Past You, and Past You doesn’t answer email. Every analysis I’ve ever done I’ve had to repeat, sometimes years later. It saves time in the long run to invest in making a reproducible analysis first time around.

Further Reading

Nick Barnes: Publish your computer code: it is good enough

British Ecological Society: Guide to Reproducible Code

Gael Varoquaux: Computational practices for reproducible science

Advanced

Reproducible Computational Workflows with Continuous Analysis

Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research

Part of a series for graduate students in psychology.
Part 1: pre-registration.
Part 2: the Open Science Framework.
Part 3: reproducibility (this post).

The Human Advantage

In ‘The Human Advantage: How Our Brains Became Remarkable’, Suzana Herculano-Houzel weaves together two stories: the story of her scientific career, based on her invention of a new technique for counting the number of brain cells in an entire brain, and the story of human brain evolution.

Previously, counts of neurons in the brains of humans and other animals relied on sampling: counting the cells in a slice of tissue and multiplying up to get an estimate. Because cell types and numbers differ across brain regions, these estimates are uncertain. Herculano-Houzel’s technique involves liquidizing a whole brain or brain region, so that a sample of this homogeneous mass can yield reliable estimates of the total cell count. Herculano-Houzel calls it “brain soup”.
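To make the logic of the estimate concrete, here is a toy version of the arithmetic (all numbers invented for illustration):

```python
# Toy illustration of the "brain soup" estimation logic.
# All numbers are invented for illustration.
sample_volume_ml = 0.1       # small aliquot drawn from the homogenate
nuclei_in_sample = 85_000    # cell nuclei counted in that aliquot
total_volume_ml = 100.0      # volume of the whole dissolved structure

density = nuclei_in_sample / sample_volume_ml  # nuclei per ml
total_cells = density * total_volume_ml
print(f"Estimated total: {total_cells:,.0f} cells")

# Because the homogenate is uniform, any aliquot gives (nearly) the same
# density – unlike a tissue slice, where density varies by region.
```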

The Human Advantage is the story of her discovery and the collaborations that led her to apply the technique to rodent, primate and human brains, and eventually to everything from giraffes to elephants.

Along the way she made various discoveries that contradict received wisdom in neuroscience:
– most species (including rodents and primates) have about 80% of their neurons in the cerebellum
– humans have about 86 billion neurons (16.3 billion in the cerebral cortex) – 14 billion fewer than the conventional estimate of 100 billion
– you can’t use brain size as a proxy for neuron count. Because cell volume changes with body size, some species with bigger brains have fewer neurons, and species with the same size brains can have vastly different neuron counts.

Example 1
* The capybara (a rodent) has a cerebral cortex weighing 48.2 g, containing 306 million neurons
* The bonnet monkey (a primate) has a cerebral cortex weighing 48.3 g, containing 1.7 billion neurons

Example 2
* African elephant, body mass 5000 kg, brain mass 4619g, 5.6 billion cerebral cortex neurons
* Human, body mass 70 kg, brain mass 1509g, 16.3 billion cerebral cortex neurons

(Fun fact: 98% of elephant neurons are in the cerebellum – possibly because of the evolution of the trunk.)

A lot of the book is concerned with relative as well as absolute numbers of brain cells. A frequent assumption is that humans must have more cortex relative to the rest of their brain, or more prefrontal cortex relative to the rest of the cortex. Not so, says Herculano-Houzel’s research. The exception in nature is primates, who show a greater density of neurons per gram of brain mass, and more energetically efficient neurons in terms of metabolic requirement per neuron. Humans are no exception to the scaling laws that govern primates, but we are a particularly large primate (a caveat is the great apes, who have larger bodies than us but smaller brains, departing from the body–brain scaling law that governs humans and other primates). Our cognitive exceptionalism is based on the raw number of brain cells in the cortex – that’s the human advantage.

This is a book which blends a deep look into comparative neuroanatomy and the evolutionary story of the brain with the specific research programme of one scientist. It shows how much progress in science depends on technological innovation, hard work, a bit of luck, social connections and thoughtful integration of the ideas of others. A great book – mindhacks.com recommends!

Conspiracy theories as maladaptive coping

A review called ‘The Psychology of Conspiracy Theories’ sets out a theory of why individuals end up believing Elvis is alive, NASA faked the moon landings or 9/11 was an inside job. Karen Douglas and colleagues suggest:

Belief in conspiracy theories appears to be driven by motives that can be characterized as epistemic (understanding one’s environment), existential (being safe and in control of one’s environment), and social (maintaining a positive image of the self and the social group).

In their review they cover evidence showing that factors like uncertainty about the world, lack of control or social exclusion (factors affecting epistemic, existential and social motives respectively) are all associated with increased susceptibility to conspiracy theory beliefs.

But also they show, paradoxically, that exposure to conspiracy theories doesn’t salve these needs. People presented with pro-conspiracy theory information about vaccines or climate change felt a reduced sense of control and increased disillusion with politics and distrust of government. Douglas’ argument is that although individuals might find conspiracy theories attractive because they promise to make sense of the world, they actually increase uncertainty and decrease the chance people will take effective collective action.

My take would be that, viewed like this, conspiracy theories are a form of maladaptive coping. The account makes sense of why we are all vulnerable to conspiracy theories – and we are all vulnerable; many individual conspiracy theories have very widespread subscription – for example half of Americans believe Lee Harvey Oswald did not act alone in the assassination of JFK. Of course polling about individual beliefs must underestimate the proportion of individuals who subscribe to at least one conspiracy theory. The account also makes sense of why some people are more susceptible than others – people who have less education, are more excluded or powerless and have a heightened need to see patterns which aren’t necessarily there.

There are a few areas where this account isn’t fully satisfying.
– it doesn’t really offer a psychologically grounded definition of conspiracy theories. Douglas’s working definition is ‘explanations for important events that involve secret plots by powerful and malevolent groups’, which seems to include some cases of conspiracy beliefs which aren’t ‘conspiracy theories’ (sometimes it is reasonable to believe in secret plots by the powerful; sometimes the powerful are involved in secret plots), and it seems to miss some cases of conspiracy-theory type reasoning (for example paranoid beliefs about other people in your immediate social world).
– one aspect of conspiracy theories is that they are hard to disprove: for example, people presenting contrary evidence are seen as confirming the existence of the conspiracy. But the common psychological tendency to resist persuasion is well known. Are conspiracy theories especially hard to shift, any more than other beliefs (or the beliefs of non-conspiracy theorists)? Would it be easier to persuade you that the earth is flat than it would be to persuade a flat-earther that the earth is round? If not, then the identifying mark of conspiracy theories may be the factors that lead you to get into them, rather than their dynamics once you’ve got them.
– and how you get into them seems crucially unaddressed by the experimental psychology methods Douglas and colleagues deploy. We have correlational data on the kinds of people who subscribe to conspiracy theories, and experimental data on presenting people with conspiracy theories, but no rich ethnographic account of how individuals find themselves pulled into the world of a conspiracy theory (or how they eventually get out of it).

Further research is, as they say, needed.

Reference: Douglas, K., Sutton, R. M., & Cichocka, A. (2017). The psychology of conspiracy theories. Current Directions in Psychological Science, 26 (6), 538-542.

Karen Douglas’ homepage

Previously on mindhacks.com: Conspiracy theory as character flaw, That’s what they want you to believe. Conspiracy theory page on mindhacks wiki.

I saw Karen Douglas present this work at a talk to Sheffield Skeptics in the Pub. Thanks to them for organising.

Cyberselves: How Immersive Technologies Will Impact Our Future Selves

We’re happy to announce the re-launch of our project ‘Cyberselves: How Immersive Technologies Will Impact Our Future Selves’. Straight out of Sheffield Robotics, the project aims to explore the effects of technology like robot avatars, virtual reality, AI servants and other tech which alters your perception or ability to act. We’re interested in work, play and how our sense of ourselves and our bodies is going to change as this technology becomes more and more widespread.

We’re funded by the AHRC to run workshops and bring our roadshow of hands-on cyber-experiences to places across the UK in the coming year. From the website:

Cyberselves will examine the transforming impact of immersive technologies on our societies and cultures. Our project will bring an immersive, entertaining experience to people in unconventional locations, a Cyberselves Roadshow, that will give participants the chance to transport themselves into the body of a humanoid robot, and to experience the world from that mechanical body. Visitors to the Roadshow will also get a chance to have hands-on experiences with other social robots, coding and virtual/augmented reality demonstrations, while chatting to Sheffield Robotics’ knowledgeable researchers.

The project is a follow-up to our earlier AHRC project, ‘Cyberselves in Immersive Technologies‘, which brought together robotics engineers, philosophers, psychologists, scholars of literature, and neuroscientists.

We’re running a workshop on the effects of teleoperation and telepresence, in Oxford in February (Link).

Call for papers: symposium on AI, robots and public engagement at 2018 AISB Convention (April 2018).

Project updates on twitter, via Dreaming Robots (‘Looking at robots in the news, films, literature and the popular imagination’).

Full disclosure: This is a work gig, so I’m effectively being paid to write this.

Scientific Credibility and The Kardashian Index


The Kardashian index is a semi-humorous metric invented to reveal how much trust you should put in a scientist with a public image.

In ‘The Kardashian index: a measure of discrepant social media profile for scientists‘, the author writes:

I am concerned that phenomena similar to that of Kim Kardashian may also exist in the scientific community. I think it is possible that there are individuals who are famous for being famous

and

a high K-index is a warning to the community that researcher X may have built their public profile on shaky foundations, while a very low K-index suggests that a scientist is being undervalued. Here, I propose that those people whose K-index is greater than 5 can be considered ‘Science Kardashians’

[Figure 1 from Hall, N. (2014). The Kardashian index: a measure of discrepant social media profile for scientists. Genome Biology, 15(7), 424.]

Your Kardashian index is calculated from your number of Twitter followers and the number of citations your scholarly papers have. You can use the ‘Kardashian Index Calculator‘ to find out your own Kardashian Index, if you have a Twitter account and a Google Scholar profile.
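The calculation itself is simple. A sketch, using the follower–citation relation Hall fitted in the paper (expected followers = 43.3 × citations^0.32):

```python
# K-index per Hall (2014): actual Twitter followers divided by the
# number "expected" from your citation count, via the paper's fitted curve.
def kardashian_index(followers: int, citations: int) -> float:
    expected_followers = 43.3 * citations ** 0.32  # Hall's empirical fit
    return followers / expected_followers

# For example, 5,000 followers on 1,000 citations:
print(round(kardashian_index(5_000, 1_000), 1))  # ~12.7 – well over the
# threshold of 5 that Hall proposes for a 'Science Kardashian'
```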

The implication of the Kardashian index is that the foundation of someone’s contribution to public debate about science should be their academic publishing. But public debate and scholarly debate are rightfully different things, even if related. To think that only scientists should be listened to in public debate is to think that other forms of skill and expertise aren’t relevant, including the skill of translating between different domains of expertise.

Communicating scientific topics, explaining and interpreting new findings, and understanding the relevance of science to people’s lives (and of people’s lives to science) are skills in themselves. The Kardashian Index ignores them, and so undervalues them.

Full disclosure: My Kardashian Index is 25.

Open Science Essentials: The Open Science Framework

Open science essentials in 2 minutes, part 2

The Open Science Framework (osf.io) is a website designed for the complete life-cycle of your research project: designing projects; collaborating; collecting, storing and sharing data; sharing analysis scripts, stimuli and results; and publishing your findings.

You can read more about the rationale for the site here.

Open Science is fast becoming the new standard for science. As I see it, there are two major drivers of this:

1. Distributing your results via a slim journal article dates from the 17th century. Constraints on the timing, speed and volume of scholarly communication no longer apply. In short, now there is no reason not to share your full materials, data, and analysis scripts.

2. The replicability crisis means that how people interpret research is changing. Obviously sharing your work doesn’t automatically make it reliable, but since sharing is a costly signal, it is a good sign that you take the reliability of your work seriously.

You could share aspects of your work in many ways, but the OSF has many benefits:

  • the OSF is backed by serious money and institutional support, so the online side of your project will still be live many years after you publish the link
  • it integrates with various other platforms (GitHub, Dropbox, the PsyArXiv preprint server)
  • it is totally free, run for scientists by scientists as a non-profit

All this, and the OSF also makes things like version control and pre-registration easy.

Good science is open science. And the fringe benefit is that making materials open forces you to properly document everything, which makes you a better collaborator with your number one research partner – your future self.

Cross-posted at tomstafford.staff.shef.ac.uk.  Part of a series aimed at graduate students in psychology. Part 1: pre-registration.


Open Science Essentials: pre-registration

Open Science essentials in 2 minutes, part 1

The Problem

As a scholarly community we allowed ourselves to forget the distinction between exploratory and confirmatory research, presenting exploratory results as confirmatory and post-hoc rationales as predictions. As well as being dishonest, this makes for unreliable science.

Flexibility in how you analyse your data (“researcher degrees of freedom”) can invalidate statistical inferences.

Importantly, you can employ questionable research practices like this (“p-hacking”) without knowing you are doing it. Decide to stop collecting data because the results so far are significant? Measure 3 dependent variables and use the one that “works”? Exclude participants who don’t respond to your manipulation? All justifiable in exploratory research, but they mean you are exploring a garden of forking paths in the space of possible analyses – when you arrive at a significant result, you won’t be sure whether you got there because of the data or because of your choices.
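How much do choices like this matter? A toy simulation of just one of them – measuring three dependent variables and reporting whichever “works” – shows the false-positive rate climbing well past the nominal 5% (the design here is invented purely for illustration):

```python
# Toy simulation: three dependent variables, all pure noise, but we
# claim a "finding" if ANY of them reaches p < .05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_sims, n_per_group, n_dvs = 10_000, 30, 3
hits = 0

for _ in range(n_sims):
    p_values = []
    for _ in range(n_dvs):
        control = rng.standard_normal(n_per_group)
        treatment = rng.standard_normal(n_per_group)  # no true effect
        p_values.append(stats.ttest_ind(control, treatment).pvalue)
    if min(p_values) < 0.05:
        hits += 1

print(f"False-positive rate: {hits / n_sims:.1%}")  # roughly 14%, not 5%
```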

The solution

There is a solution – pre-registration. Declare in advance the details of your method and your analysis: sample size, exclusion criteria, dependent variables, directional predictions.

You can do this

Pre-registration is easy. There is no single, universally accepted way to do it.

  • you could write your data collection and analysis plan down and post it on your blog.
  • you can use the Open Science Framework to timestamp and archive a pre-registration, so you can prove you made a prediction ahead of time.
  • you can visit AsPredicted.org, which provides a form to complete that will help you structure your pre-registration (making sure you include all relevant information).
  • “Registered Reports”: more and more journals are committing to publishing pre-registered studies. They review the method and analysis plan before data collection and agree to publish once the results are in (however they turn out).

You should do this

Why do this?

  • credibility – other researchers (and journals) will know you predicted the results before you got them.
  • you can still do exploratory analysis; pre-registration just makes it clear which is which.
  • forces you to think about the analysis before collecting the data (a great benefit).
  • more confidence in your results.

Further reading


Addendum 14/11/17

As luck would have it, I stumbled across a bunch of useful extra resources in the days after publishing this post.

Cross-posted at tomstafford.staff.shef.ac.uk. Part of a series aimed at graduate students in psychology. Part 2: The Open Science Framework.