16 October 2017

NOT-SO-ORIGINAL SINS

Chris Chambers, The Seven Deadly Sins of Psychology: A Manifesto for Reforming the Culture of Scientific Practice, Princeton University Press.

Chris Chambers is a professor of cognitive neuroscience at Cardiff University who has, over the course of his 15-year career, become increasingly disillusioned with the culture that prevails in the psychological sciences. This book is his summation of all that’s wrong with psychology, and what needs to be done to fix it, using the seven deadly sins as a metaphor. 
These are the ‘cultural sins’ that ‘pose an existential threat to the discipline itself’, and he devotes a chapter to each (although the metaphor is a bit contrived, as his sin of ‘unreliability’ covers a number of distinct transgressions, including some of the other six).

For me, the book got off to a shaky start, as Chambers’ headline example of how unscientific psychology has become is Daryl Bem’s 2011 Journal of Personality and Social Psychology paper that presented evidence for short-term precognition, asking ‘how could such a bizarre conclusion find a home in a reputable science journal?’ and ‘if accepted practices could generate such nonsensical findings, how can any published findings be trusted?’

For Chambers, the significance of Bem’s publication is that, because it meets all the normal standards of psychological research, it has finally forced those within the profession who know that precognition isn’t real to question those standards, presenting him with the perfect hook on which to hang the message he has spent years banging his head against the wall trying to get across: ‘History may look back on 2011 as the year that changed psychology forever.’

It’s not really what his book is about, so I’ll just say that, by playing on the prejudice against psi, Chambers is, in my view, being unfair and, ironically in light of what is to come, selective in his reporting. Bem’s study wasn’t a one-off, but the latest in a two-decade series of experiments – carried out by scientists in various fields, including physics, with similar positive results – into an effect sometimes dubbed ‘presponse’.

But that aside, Chambers’ survey makes for sobering reading, exposing as it does psychology’s staggering lack of scientific rigour and the shameful practices it routinely, and blatantly, employs.

Chambers’ first sin is bias, meaning chiefly publication bias on the part of the psychological journals, which favour papers that present new, headline-grabbing and above all positive discoveries. Given the ‘publish or perish’ culture that is, according to Chambers, more prevalent in psychology than other sciences, this drives many of the other sins, as researchers play up to the journals’ biases: ‘psychology has embraced a tabloid culture where novelty and interest-value are paramount and the truth is left begging.’

Among the many harmful consequences of that culture is the discouragement of the direct replication of previously published research – something that, given that most findings in psychology are based on probabilities, should be vital. Instead, follow-up research relies on ‘conceptual replication’, in which previous findings are put to the test using different, new methods. This, Chambers argues, isn’t replication at all since it assumes, rather than tests, the truth of the original conclusions.

Chambers gives the example of research by psi-sceptic Chris French that failed to find the ‘presponse’ effect, but which was turned down for publication on the grounds that it was a direct replication of Bem’s original experiment. (However, elsewhere Chambers notes that ‘failing to replicate an effect does not necessarily mean the original finding was in error’.)

Replication is the ‘immune system of science’ that weeds out false and even fabricated results, but it is ‘largely ignored or distorted in psychology’, which even displays a ‘contempt’ for the practice. A 2012 survey found that, staggeringly, only two in every thousand papers published in psychological journals were direct replications of a previous experiment – and half of those were carried out by the same team that did the original work.

Psychology’s attitude to replication was illustrated by 2014’s ‘Repligate’: an exercise to directly reproduce a number of experiments, the results of which had stood since the 1950s, failed to confirm many of the original findings. Astonishingly, but tellingly, some in the psychological community rounded on the team responsible, labelling them ‘replication police’ and even ‘Nazis’.

Another consequence of publication bias is the dubious but widespread practice of ‘HARKing’ – Hypothesising After Results are Known – by which, if the results of an experiment don’t come out as predicted by the original hypothesis, the experimenter devises a new hypothesis that does fit and presents the research as if that was the idea all along. Studies have estimated that anything between 40 and 90 percent of published papers have been HARKed.

A similarly dodgy practice driven by publication bias, ‘p-hacking’, lies at the heart of Chambers’ second sin, ‘hidden flexibility’. The gold standard of psychological research is p, strictly the probability of obtaining results at least as extreme as those observed if chance alone were at work. Since the 1920s the threshold for significance has been set – entirely arbitrarily – at 5 percent (in statistical terminology p = .05), meaning that results so extreme would be expected by chance less than one time in 20 before they are accepted as showing a real effect. However, if results don’t clear that bar, they are p-hacked, exploiting what are known euphemistically as ‘researcher degrees of freedom’ – trying different analyses, or excluding chunks of data – until the magic .05 is achieved. Chambers writes of ‘watching colleagues analysing their data a hundred different ways, praying like gamblers at a roulette wheel for signs of statistical significance.’
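
To see what those ‘degrees of freedom’ buy, here is a minimal simulation – my own illustration, not Chambers’ – of a p-hacker who tries up to 20 analyses on data containing no real effect and stops at the first ‘significant’ result. All the numbers are assumptions chosen for illustration.

```python
# Sketch of p-hacking on pure noise: trying many analyses and stopping
# at the first p < .05 makes 'significance' the likely outcome even
# when there is nothing to find. All parameters are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies = 1000    # simulated experiments with NO real effect
n_analyses = 20     # 'researcher degrees of freedom' tried per study

false_positives = 0
for _ in range(n_studies):
    for _ in range(n_analyses):
        # two groups of 30 drawn from the same distribution
        a = rng.normal(size=30)
        b = rng.normal(size=30)
        if stats.ttest_ind(a, b).pvalue < 0.05:
            false_positives += 1
            break   # the p-hacker stops at the first 'hit'

print(f"Studies reporting p < .05 on pure noise: {false_positives / n_studies:.0%}")
# With 20 independent tries, expect roughly 1 - 0.95**20, about 64% -
# against the nominal 5% the p = .05 threshold is supposed to guarantee.
```

Real p-hacking reanalyses the same data, so the tries are not independent as they are in this sketch, but the lesson holds: given enough flexibility, ‘significance’ is nearly guaranteed.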

The practice also compounds the conceptual replication problem: ‘A p-hacked conceptual replication of a p-hacked study tells us very little about reality apart from our ability to deceive ourselves.’

Chambers’ third sin is the umbrella one of ‘unreliability’, which includes more on psychology’s contempt for replication, among other sub-sins. A particularly gob-smacking one is the lack of statistical power – the chance that a study will detect an effect that really is there – in much psychological research. Samples are often too small for proper statistical analysis, making the conclusions drawn from them unsound: not only are positive findings falsely reported, but genuine discoveries are often missed.
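
The scale of the problem is easy to demonstrate. The sketch below – again my own illustration, with made-up parameters rather than figures from the book – simulates studies of a genuine, medium-sized effect using the small samples common in the field, and counts how often they actually find it.

```python
# Sketch of low statistical power: a real effect exists, but with
# small samples most studies fail to detect it. Parameters are
# illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect = 0.5     # a genuine, medium-sized group difference
n_per_group = 15      # a small sample of the kind Chambers criticises
n_studies = 10_000

hits = 0
for _ in range(n_studies):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(true_effect, 1.0, n_per_group)
    if stats.ttest_ind(treated, control).pvalue < 0.05:
        hits += 1

print(f"Power (chance of detecting the real effect): {hits / n_studies:.0%}")
# Roughly 25-30%: about three in four such studies would 'fail'
# even though the effect is genuinely there.
```

Worse, on the occasions when an underpowered study does reach significance, it tends to overestimate the size of the effect – exactly the kind of inflated finding that publication bias then rewards.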

Unreliability is concealed by sin no. 4, ‘data hoarding’. Unlike most other sciences, psychology doesn’t abide by the convention that the raw data from an experiment or study is made available for independent scrutiny, usually by being deposited in a public database. Instead, it is jealously guarded, with psychologists routinely refusing requests to share it (73 percent of the time, according to one survey) – and when they do share, they often impose gag orders on how the data can be used and reported. This is despite data sharing being a condition of publication in most psychological journals and part of the code of conduct of professional bodies such as the American Psychological Association: ‘Few psychologists, and least of all the APA, seem to care whether psychologists share their data or not.’

The lack of rigour and scrutiny generated by the previous sins facilitates the biggest sin of all, corruptibility – outright fraud through fabricating data. Although there have been several high-profile exposures – such as that in 2011 of Dutch social psychologist Diederik Stapel, who perpetrated one of science’s largest ever frauds, building a high-flying career on made-up data – it’s impossible to tell how common fraud is in psychology, given the weak controls and the fact that many institutions cover up any cheating that does come to light: often it’s the whistle-blowers whose careers suffer. Chambers summarises: ‘Falsifying data offers a low-risk, high-reward career strategy for scientists who, for whatever reason, lose their moral compass and sense of purpose.’

Sin no. 6 is ‘internment’, by which Chambers means the ‘culture of concealment’ that restricts information to those within the profession, for example through the astronomical subscription fees demanded by journals, which make them available only to the richest institutions, rendering them ‘telegraph lines between the windows of the ivory tower’. Although this doesn’t only apply to psychology, Chambers argues that it is more invidious here because of psychology’s public role: ‘Psychological discoveries generate substantial public interest, are relevant to policy making, and are hugely dependent on public funding.’

Also shared with many other sciences is the final sin of bean counting, the ‘growing push toward weighing up the worth of individual academics and their research contributions based on various “metrics,” and then to use those metrics to award jobs and funding.’ Metrics include the number of papers published and the number of citations, a system that rewards researchers for producing many low-quality papers rather than fewer of high quality or significance.

In the final chapter Chambers looks at ways to solve these problems – chiefly through the pre-registration of studies (setting out the hypothesis and analysis in advance to eliminate HARKing and p-hacking), full sharing of data, and measures to protect whistle-blowers – while also recounting his own personal journey. He also sets out steps that individual researchers can take to improve their practices and to be aware of ‘our unconscious biases, fragile egos, and propensity to cut corners’.

Seven Deadly Sins gives a candid account of a profession that Chambers clearly cares deeply about, seeing the important contribution it can and should make to society. His ultimate message is stark: ‘If we continue as we are then psychology will diminish as a reputable science and could very well disappear.’

Although aimed principally at the psychology profession, Chambers writes that the book is for ‘anyone who is interested in the practice and culture of science’. For the most part, he successfully balances writing for members of his profession with keeping it accessible to outsiders. The only parts where he wobbles are those dealing with statistics, which assume a familiarity with concepts and methods that, while part of psychologists’ workaday skills (or perhaps not, as Chambers shows that many within the profession don’t understand what some of the figures mean in real terms), are rather esoteric for the general reader.

As one of those general readers, the message that I took away from the book is, quite simply, that none of psychology’s findings, as frequently reported in the media, can be trusted. As described here psychology is, if not quite (yet) a pseudoscience, then at least a rogue science.

This is particularly alarming given the way its pronouncements about individual and collective behaviour are used for political and public policy purposes. As Chambers notes, ‘Applications of psychology in public policy are many and varied, ranging from tackling challenges like obesity and climate change through to the design of traffic signs, persuading citizens to vote in elections, and encouraging people to join organ donor registries.’ He gives the example of the Behavioural Insights Team set up by the Cameron government in 2010 to apply psychological science to public policy, which, like many other official bodies, simply accepts the validity of the published research.

With my Magonian hat on, I was naturally interested in Chambers’ study from the perspective of the methodological and statistical criticisms customarily levelled at parapsychology. It turns out that a huge amount of ‘straight’ psychological research suffers from exactly the same faults – a clear case of double standards. Imagine, for example, the outcry if a parapsychologist were found to have p-hacked their results.

Given the lengths to which parapsychologists go to forestall such criticisms, wouldn’t it be ironic if their research, including the likes of Bem’s, turned out to be more reliable than the norm? 
  •  Clive Prince.

1 comment:

Terry the Censor said...

> ‘HARKing’ – Hypothesising After Results are Known

Something similar to what Bem has done. Failing to get the desired results, he sifts the data for patterns, forms an explanation that fits the "discovered" pattern, then declares this all is proof of something (something not tested for originally).

Journals SHOULD be ashamed of publishing such fraud.