Certifiable insanity


My generation of scholars has grown up with the idea of pre-prints. Back in ye olden days (say, 1990), if you had an idea, you wrote it down (presumably in, I don’t know, cuneiform or something) and maybe a year or three later, it would be somewhere in the Journal of Something or Other. But then along came the internet, and much more rapid research, and people started putting out their content before peer review, and now we have arXiv, medRxiv, bioRxiv, many other similar spinoffs, as well as viXra, which is wonderful if you want to read what is politely called ‘fringe science’ and less politely, ‘bullshit’.

In line with just how it should be, arXiv, medRxiv and the rest of the reputable preprint servers have put some pretty visible notices onto their papers. This is what it says above one of my recent papers (and tons of others):

This article is a preprint and has not been certified by peer review [what does this mean?]. It reports new medical research that has yet to be evaluated and so should not be used to guide clinical practice.

I am on board with all of this except the word certified. In line with my ability to grow tedious over the semantics of a single word, hard earned through years of study, I will devote the rest of this post to my objections to the word and the idea.

I’m all for peer review. I’ve been on both sides of the process, and people who use the shortcomings of peer review as an argument against the whole concept are throwing the baby out with the proverbial bathwater. On the other hand, I find that in a statement directed at a non-academic audience (because who in academia does not know a preprint is not peer reviewed?), the notion of ‘certification’ is odd at best, misleading at worst.

Storytime: how not to fail your driving practicals a third time

I earned my driver’s license in Europe, where the system is a little different from that of the United States and much of the rest of the world. You take classes with a certified driving instructor, then you take a computerised theory test. Once you pass, you take a given number of hours of driving with an instructor, and finally, the practical exam. The latter is administered by external examiners and is in many ways similar to an FAA checkride, but with a much lower chance of killing yourself and/or your examiner.1) Fail three times and you have to do a set number of hours of practice.

To cut a long story short, I flat out failed my first two attempts, and while failing once was generally regarded as normal (and a way to keep examiners in business, who were paid the princely sum of 5,000 Ft, about $15, for 45 minutes’ work), failing twice meant that it was rather advisable not to fail a third time. So I got some ‘advice’, which is something you should never do. True to the borderline failed state I was living in at the time, the ‘advice’ inevitably involved bribery of some sort, either of the examiner himself (a 20,000 Ft bank note, worth about $60, was considered the going rate for a pass), or of the examination coordinators, who could allocate you a more ‘lenient’ examiner.

Now, having never bribed anyone before, I was rather disinclined to start then. I did what was uniformly regarded as the ‘lawful stupid’ alternative, and just hoped I’d pass the third time around. My examiner turned out to be a kindly old man who fell asleep in the August heat, about ten minutes into the examination, and had to be gently jolted awake by the instructor sitting in the rear seat. Perhaps a little sheepish from the whole affair, he let me pass. Or maybe I did drive okay. I don’t know.

The point of the story is that peer review can be a lot like driving practicals in mildly corrupt Eastern Europe.2) Some earn it, some pay their way into it, some get lucky, some experience a combination of the foregoing.

Lessons in disaster

There are some absolutely fabulous things that have passed peer review, and by fabulous, I mean ‘completely made up’. We’ve had public policy and public sentiment misled for years by a fraudulent study that took over a decade to retract, giving rise to the vaccines/autism meme in the process and earning its key authors considerable consulting fees. A paper listing Emmanuel Macron’s dog as a co-author (and not even in a very subtle way!) passed peer review without any of the peer reviewers going ‘WTF?’. And outside the hard sciences, the Sokal affair has proven things aren’t much better, if at all. Sometimes, reviewers (in this case, of a journal on diabetes care) are so specialised that they don’t notice that ‘Tai’s method’, which the author so modestly named after her good self, has reinvented integration, a good two centuries after Leibniz and Newton.3) Predatory publishers have added to the picture, and not in a good way, but the problem is hardly limited to them. IOP Publishing, which isn’t exactly Beall’s List material, fell victim to a number of papers generated by SciGen, a CS paper generator based on a context-free grammar, following in the footsteps of Springer and IEEE a few years earlier. All of these are actual, reputable publishers.
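For anyone wondering how little there is to reinvent here, the trapezoidal method fits in a dozen lines. What follows is my own minimal sketch of the general rule, not the code or notation of the paper in question; the function name `trapezoid_area` and the sample readings are invented purely for illustration.

```python
def trapezoid_area(xs, ys):
    """Approximate the area under a curve sampled at points (xs[i], ys[i]).

    Each pair of neighbouring samples forms a trapezoid whose area is
    width * (left height + right height) / 2; the total is their sum.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two matching (x, y) samples")
    total = 0.0
    for i in range(1, len(xs)):
        width = xs[i] - xs[i - 1]
        total += width * (ys[i - 1] + ys[i]) / 2.0
    return total


# Made-up readings taken every 30 minutes (illustrative numbers only).
minutes = [0, 30, 60, 90, 120]
readings = [5.0, 8.0, 7.0, 6.0, 5.0]

print(trapezoid_area(minutes, readings))  # prints 780.0
```

The sample points do not need to be evenly spaced, since each strip uses its own width.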

There is an illusion that peer review is a binary variable — something is either peer reviewed, hence trustworthy, or not peer reviewed, hence lacking the imprimatur common to scientific works. The true problem is that peer review is a highly complex variable, and definitely not binary. There’s no use in pretending that a paper peer reviewed and published in, say, JAMA or the New England Journal of Medicine has the same authority as a paper (allegedly) peer reviewed by one of the thousands of predatory publications that are only after the author’s money in the form of APCs (Article Processing Charges). And even there, the process may be quite indeterminate.

The luck of the draw factors into it, as do academic and personal sympathies — they shouldn’t, but they do. The best safeguard against bias is increasing sample size, but to do so is expensive and often quite difficult. There are, to put it simply, more journals around than there are academics who have the time and the skills to do a good job as peer reviewers. Equally, many early-career academics approach peer review with inappropriate amounts of deference and restraint. “What the bleep is this bleep?” is — or ought to be — a perfectly valid response to a paper. So is “you, sir/ma’am, are barking mad”. Science would greatly benefit if those phrases appeared in peer reviews rather more often.

The end effect remains, however, this: peer review is not a monolith. Journal A’s peer review might be radically different from Journal B’s, to the point of being quite different in quality. Yet another journal’s definition of peer review might well consist of little more than the receipt of a sum currently hovering around $2,000-$2,500. To treat peer review, thus, as some binary variable borders on delusion.

What else, then?

As I have said before, I do believe in peer review. I believe it can (still) be fixed, and it should be fixed, rather than discarded. There are those who believe peer review serves to enforce a rigid ideological orthodoxy — at least in the fields I am familiar with, this is abject nonsense. There are also those who wish to get rid of peer review to allow fringe theories the same podium as good science gets. That, too, is wrong. You can always start a blog to explain why time is a seven-dimensional hypercube.

At the same time, peer review will never be fixed if we continue to delude ourselves that it is a unitary standard. You absolutely shouldn’t make clinical decisions based on a preprint, but neither should you do so based on what’s in a journal that published a paper by the French president’s dog and a pangolin. Of course, this is no real help for people who just want to know what science says. Science is a process, and last week’s science may be quite different from this week’s. The cure for misreported science is not perpetuating the illusion that peer review is a cure-all, but getting people to report on science who can read a paper, understand it and critically examine it.

Peer review ‘certifies’, if anything, a process: one that can be done well or badly. Even reputable publishers mess up from time to time (Retraction Watch recently documented a journal published by Elsevier, one of the world’s most reputable publishing companies, flagging over 400 articles for various reasons, including the involvement of ‘paper mills’). Some peer reviewed literature is sound science, and a (hopefully small) minority is pure bloody nonsense. Using the language of ‘certification’ gives the misleading impression of a substantive, rather than formal, guarantee of correctness.

And that would, of course, be certifiable insanity.

References
1 Contrary to rumour, fatal accidents during initial checkrides are pretty much nonexistent. As soon as you have earned the now-virtual but once fashionably pink FAA form informing you of your abject failure, the examiner may take control and RTB.
2 As opposed to ‘overwhelmingly corrupt Eastern Europe’, where failing to pay a bribe ensures test failure just out of sheer survival imperative.
3 Or, to be more specific, she reinvented the trapezoidal method, which according to Ossendrijver (2016) has been known since 50 BCE or thereabouts. In ancient Babylon. Wrap your heads around that.
I'm a data scientist and computational epidemiologist focusing on the intersection of public health, data science and artificial intelligence.
