# Bayesian reasoning in clinical diagnostics: a primer.

We know, from the source of eternal wisdom that is Saturday Morning Breakfast Cereal, that insufficient math education is the basis of the entire Western economy.1 This makes Bayesian logic and reasoning about probabilities almost like a dark art, a well-kept secret that only a few seem to know (and it shouldn’t be… but that’s a different story). This weird-wonderful argument, reflecting a much-reiterated meme about vaccines and vaccine efficacy, is a good example:

The argument, here, in case you are not familiar with the latest in anti-vaccination fallacies, is that vaccines don’t work, and they have not reduced the incidence of vaccine-preventable diseases. Rather, if a person is vaccinated for, say, measles, then despite displaying clinical signs of measles, he will be considered to have a different disease, and therefore all disease statistics proving the efficacy of vaccines are wrong. Now, that’s clearly nonsense, but it highlights one interesting point, one that has a massive bearing on computational systems drawing conclusions from evidence: namely, the internal Bayesian logic of the diagnostic process.

Which, incidentally, is the most important thing that they didn’t teach you in school. Bayesian logic, that is. Shockingly, they don’t even teach much of it in medical school unless you do research, and even there it’s seen as a predictive method, not a tool to make sense of the analytical process. Which is a pity. The reason why idiotic arguments like the above by @Cattlechildren proliferate is that physicians have been taught how to diagnose well, but never how to explain and reason about the diagnostic process. This was true for the generations before me, and is more or less true for those still in med school today. What is often covered up with nebulous concepts like ‘clinical experience’ is in fact solid Bayesian reasoning. Knowing the mathematical fundamentals of the thought process you use day to day, and which helps you make the right decisions in the clinic, helps you reason about it, find its weak points, and respond to challenges. For this reason, my highest hope is that as many MDs, epidemiologists, med students, RNs, NPs and other clinical decision-makers as possible will engage with this topic, even if it’s a little long. I promise, it’s worth it.

## Some basic ideas about probability

In probability, an event, usually denoted with a capital and customarily starting at $A$ (I have no idea why, as it makes things only more confusing!), is any outcome or incidence that we’re interested in – as long as they’re binary, that is, they either happen or don’t happen, and discrete, that is, there’s a clear definition for it, so that we can decide if it’s happened or not – no half-way events for now.2 In other words, an event can’t happen and not happen at the same time. Or, to get used to the notation of conditionality, $p(A \mid \neg A) = 0$.3 A thing cannot be both true and false.

Now, we may be interested in how likely it is for an event to happen if another event happens: how likely is $A$ if $B$ holds true? This is denoted as $p(A|B)$, and for now, the most important thing to keep in mind about it is that it is not necessarily the same as $p(B|A)$!4

Bayesian logic deals with the notion of conditional probabilities – in other words, the probability of one event, given another.5 It is one of the most widely misunderstood parts of probability, yet it is crucial to understanding the way we reason about things.

Just to understand how important this is, let us consider a classic example.

## Case study 1: speed cameras

Your local authority is broke. And so, it does what local authorities do when they’re broke: ~~play poker with the borough credit card~~ set up a bunch of speed cameras and fine drivers. Over this particular stretch of road, the speed limit is 60mph.

According to the manufacturer, the speed cameras are very sensitive, but not very specific. In other words, they never falsely indicate that a driver was below the speed limit, but they may falsely indicate that the driver was above it, in about 3% of the cases (the false positive rate).

One morning, you’re greeted by a message in your postbox, notifying you that you’ve driven too fast and fining you a rather respectable amount of cash. What is the probability that you have indeed driven too fast?

You may feel inclined to blurt out 97%. That, in fact, is wrong.

### Explanation

It’s rather counter-intuitive at first to understand why, until we consider the problem in formal terms. We know the probability $p(A \mid \neg B)$, that is, the probability of being snapped ($A$) even though you were not speeding ($\neg B$). But what the question asks is the likelihood that you were, in fact, speeding ($B$) given the fact that you were snapped ($A$). And as we have learned, the conditional probability operator is not commutative, that is, $p(A \mid B)$ is not necessarily the same as $p(B \mid A)$.

Why is that the case? Because base rates matter. In other words, the probabilities of $A$ and $B$, in and of themselves, are material. Consider, for a moment, the unlikely scenario of living in that mythical wonderland of law-abiding citizens where nobody speeds. Then, it does not matter how many drivers are snapped – all of them are false positives, and thus $p(B|A)$, the probability of speeding ($B$) given that one got snapped by a speed camera ($A$), is actually zero.

In other words, if we want to reverse the conditional operator, we need to make allowances for the ‘base frequency’, the ordinary frequency with which each event occurs on its own. To overcome base frequency neglect,6 we have a mathematical tool, courtesy of the good Revd. Thomas Bayes, who sayeth that, verily,

$p(B \mid A) = \frac{p(A \mid B) \, p(B)}{p(A)}$

Or, in words: if you want to reverse the probabilities, you will have to take the base rates of each event into account. If what we know is the likelihood of being snapped if you were not speeding, and what we’re interested in is the likelihood that someone getting snapped is indeed speeding, we’ll need to know a few more things.

### Case study 1: Speed cameras – continued

• We know that the speed cameras have a Type II (false negative) error rate of zero – in other words, if you are speeding ($B$), you are guaranteed to get snapped ($A$) – thus, $p(A \mid B)$ is 1.
• We also know from the Highway Authority, who were using a different and more accurate measurement system, that approximately one in 1,000 drivers is speeding ($p(B) = 0.001$).
• Finally, we know that of 1,000 drivers, 31 will be snapped – the one speeder, plus roughly 30 false positives from the 3% false positive rate – yielding $p(A) = 0.031$.

Putting that into our equation,

$p(B \mid A) = \frac{p(A \mid B) \, p(B)}{p(A)} = \frac{1 \cdot 0.001}{0.031} \approx 0.032$

In other words, the likelihood that we indeed did exceed the speed limit is just barely north of 3%. That’s a far cry from the ‘intuitive’ answer of 97% (quite accidentally, it’s almost the inverse).

## Diagnostics, probabilities and Bayesian logic

The procedure of medical diagnostics is ultimately a relatively simple algorithm:

1. create a list of possibilities, however remote (the process of differential diagnostics),
2. order them in order of likelihood,
3. update priors as you run tests.7

From a statistical perspective, this is implemented as follows.

1. We begin by running a number of tests, specifically $m$ of them. It is assumed that the tests are independent of each other, i.e. the value of one does not affect the value of another. Let $R_j$ denote the result of test $j \leq m$.
1. For each test, we need to iterate over all our differentials $D_{i \ldots n}$, and determine the probability of each in light of the new evidence, i.e. $p(D_i \mid R_j)$.
2. So, let’s take the results of test $j$ that yielded the results $R_j$, and the putative diagnosis $D_i$. What we’re interested in is $p(D_i \mid R_j)$, that is, the probability of the putative diagnosis given the new evidence. Or, to use Bayesian lingo, we are updating our prior: we had a previous probability assigned to $D_i$, which may have been a uniform probability or some other probability, and we are now updating it – seeing how likely it is given the new evidence, getting what is referred to as a posterior.8
3. To calculate the posterior $p(D_i \mid R_j)$, we need to know three things – the sensitivity and specificity of the test $j$ (I’ll call these $S^+_j$ and $S^-_j$, respectively), the overall incidence of $D_i$,9 and the overall incidence of the particular result $R_j$.
4. Plugging these variables into our beloved Bayesian formula, we get $p(D_i \mid R_j) = \frac{p(R_j \mid D_i) p(D_i)}{p(R_j)}$.
5. We know that $p(R_j \mid D_i)$, that is, the probability that someone will test a particular way if they do have the condition $D_i$, is connected to sensitivity and specificity: if $R_j$ is supposed to be positive if the patient has $D_i$, then $p(R_j \mid D_i) = S^+_j$ (sensitivity), whereas if the test is supposed to be negative if the patient has $D_i$, then $p(R_j \mid D_i) = S^-_j$ (specificity).
6. We also know, or are supposed to know, the overall incidence of $D_i$ and the probability of a particular outcome, $R_j$. With that, we can update our prior for $D_i \mid R_j$.
2. We iterate over each of the tests, updating the priors every time new evidence comes in.
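
As a concrete illustration of that loop, here is a minimal R sketch of sequential prior updating. All numbers are invented for illustration only, and the denominator $p(R_j)$ is derived here from sensitivity, specificity and the current prior rather than supplied directly – one common way of obtaining it.

```r
# Minimal sketch of sequential Bayesian updating over independent test results.
# All numbers are invented for illustration only.

bayes_update <- function(prior, sens, spec, positive = TRUE) {
  # Probability of observing this result under disease and under no disease
  p_r_given_d    <- if (positive) sens else 1 - sens
  p_r_given_notd <- if (positive) 1 - spec else spec
  # Overall probability of the result - the denominator p(R_j)
  p_r <- p_r_given_d * prior + p_r_given_notd * (1 - prior)
  # Bayes' theorem: p(D | R) = p(R | D) p(D) / p(R)
  p_r_given_d * prior / p_r
}

prior <- 0.10                                               # pre-test probability of D_i
prior <- bayes_update(prior, sens = 0.95, spec = 0.90)      # first positive test
prior <- bayes_update(prior, sens = 0.80, spec = 0.97)      # second positive test
prior                                                       # posterior after both tests, ~0.97
```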

This may sound daunting and highly mathematical, but in fact most physicians have this down to an innate skill, so much so that when I explained this to a group of FY2 doctors, they couldn’t believe it – until they thought about how they thought. And that’s a key issue here: thinking about the way we arrive at results is important, because it is the bedrock of making those results intelligible to others.

## Case study 2: ATA testing for coeliac disease

For a worked example of this in the diagnosis of coeliac disease, check Notebook 1: ATA case study. It puts things in the context of sensitivity and specificity in medical testing, and is in many ways quite similar to the above example, except here, we’re working with a real-world test with real-world uncertainties.

There are several ways of testing for coeliac disease, an autoimmune disorder in which the body mounts an immune response to gluten proteins (gliadins and glutenins) in wheats, wheat hybrids, barley, oats and rye. One diagnostic approach looks at genetic markers in the HLA-DQ (Human Leukocyte Antigen type DQ) region, part of the MHC (Major Histocompatibility Complex) Class II receptor system. Genetic testing for a particular haplotype of the HLA-DQ2 gene, called DQ2.5, can lead to a diagnosis in most patients. Unfortunately, it’s slow and expensive. Another test, an endoscopic biopsy of the small intestine, looks at the intestinal villi, short protrusions (about 1mm long) into the intestine, for tell-tale damage – but this test is unpleasant, possibly painful and costly.

So, a more frequent way is by looking for evidence of an autoantibody called anti-tissue transglutaminase antibody (ATA) – unrelated to this gene, sadly. ATA testing is cheap and cheerful, and relatively good, with a sensitivity ($S^+_{ATA}$) of 85% and specificity ($S^-_{ATA}$) of 97%.10 We also know the rough probability of a sample being from someone who actually has coeliac disease – for a referral lab, it’s about 1%.

Let’s consider the following case study. A patient gets tested for coeliac disease using the ATA test described above. Depending on whether the test is positive or negative, what are the chances she has coeliac disease?

If you’ve read the notebook, you know by now that the probability of having coeliac disease if testing positive is around 22%, or a little better than one-fifth. And from the visualisation to the left, you could see that small incremental improvements in specificity would yield a lot more increase in accuracy (marginal accuracy gain) than increases in sensitivity.
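
As a quick sanity check of that figure, here is the same calculation in R, using the sensitivity, specificity and 1% pre-test probability quoted above (the numbers come from the text; the function name is mine):

```r
# Posterior probability of coeliac disease after a positive ATA test.
ata_posterior <- function(prior, sens, spec) {
  p_pos <- sens * prior + (1 - spec) * (1 - prior)  # p(positive test)
  sens * prior / p_pos                              # Bayes' theorem
}

ata_posterior(prior = 0.01, sens = 0.85, spec = 0.97)
# ~0.22, i.e. roughly a 22% chance of coeliac disease given a positive test
```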

While quite simple, this is a good case study because it emphasises a few essential things about Bayesian reasoning:

• Always know your baselines. In this case, we took a baseline of 1%, even though the average prevalence of coeliac disease in the general population is closer to 0.25%, about a quarter of that figure. Why? Because we don’t spot-test people for coeliac disease. People who do get tested get tested because they exhibit symptoms that may or may not be coeliac disease, and by definition they have a higher prevalence11 of coeliac disease. The exact factor is, of course, entirely imaginary – you would, normally, need to know or have a way to figure out the true baseline values.
• Use independent baselines. It is absolutely crucial to make sure that you do not get the baselines from your own measurement process. In this case, for instance, the incidence of coeliac disease should not be calculated by reference to your own lab’s number of positive tests divided by total tests. This merely allows for further proliferation of false positives and negatives, however minuscule their effect. A good way is to do follow-up studies, checking how many of the patients tested positive or negative for ATA were further tested using other methodologies, many of which may be more reliable, and calculate the proportion of actual cases coming through your door by reference to that.

## Case study 3: Vaccines in differential diagnosis

This case is slightly different, as we are going to compare two different scenarios. Both concern $D_{VPD}$, a somewhat contrived vaccine-preventable illness. $D_{VPD}$ produces a very particular symptom or symptom set, $S$, and produces this symptom or symptom set in every case, without fail.12 The question is – how does the vaccination status affect the differential diagnosis of two identical patients,13 presenting with the same symptoms $S$, one of whom is unvaccinated?

It has been a regrettably enduring trope of the anti-vaccination movement that because doctors believe vaccines work, they will not diagnose a patient with a vaccine-preventable disease (VPD), simply striking it off the differential diagnosis or substituting a different diagnosis for it.14 The reality is explored in this notebook, which compares two scenarios involving the same condition and two persons whose sole difference is vaccination status. That difference alone changes the relative likelihood of the vaccinated and the unvaccinated person having the disease by a factor of about 7,800. The result is that an outcome 7,800 times less likely slides down the differential. As NZ paediatrician Dr Greenhouse (@greenhousemd) noted in the tweet, “it’s good medical care”. In the words of British economist John Maynard Keynes,15 “when the facts change, I change my mind”. And so do diagnosticians.

Quite simply put: it’s not an exclusion, nor fudging of data, nor in any sensible way proof that “no vaccine in history has ever worked”. It’s simply a reflection of the reality that if a condition is almost 8,000 times less likely in a population, then, yes, other, more frequent conditions push ahead.

## Lessons learned

Bayesian analysis of the diagnostic procedure not only allows increased clarity about what one is doing as a clinician. It also brings the full panoply of tools available to mathematical and logical reasoning to bear on claims, objections and contentions – and, as in the case of the alleged non-diagnosis of vaccine-preventable diseases, allows us to discard them.

The most powerful tool available to anyone who uses a process of structured reasoning – be it clinical reasoning in diagnostics, algorithmic analysis, detective work or intelligence analysis – is the ability to reason formally about one’s own toolkit of structured processes. It is my hope that if you’ve never thought about your clinical diagnostic process in these terms, you will now be able to see a new facet of it.

References

1. The basis of non-Western economies tends to be worse. That’s about as much as Western economies have going for them. See: Venezuela and the DPRK.
2. There’s a whole branch of probability that deals with continuous probabilities, but discrete probabilities are crazy enough for the time being.
3. Read: the probability of A given not-A is zero. A being any arbitrary event: the stock market crashing, the temperature tomorrow exceeding 30ºC, &c.
4. In other words, it may be the same, but that’s pure accident. Mathematically, they’re almost always different.
5. It’s tempting to assume that this implies causation, or that the second event must temporally succeed the first, but none of those are implied, and in fact only serve to confuse things more.
6. You will also hear this referred to as ‘base rate neglect’ or ‘base rate fallacy’. As an epidemiologist, ‘rate’ has a specific meaning for us – it generally means events over a span of time. It’s not a rate unless it’s over time. I know, we’re pedantic like that.
7. This presupposes that these tests are independent of each other, like observations of a random variable. They generally aren’t – for instance, we run the acute phase protein CRP, W/ESR (another acute phase marker) and a WBC count, but these are typically not independent from each other. In such cases, it’s legitimate to use $B = B_1 \cap B_2 \cap \ldots \cap B_n$ or, as my preferred notation goes, $B = \bigcap^n_{k=1} B_k$. I know ‘updating’ is the core mantra of Bayesianism, but knowing what to update and knowing where to simply calculate the conjoint probability is what experts in Bayesian reasoning rake in the big bucks for.
8. Note that a posterior from this step can, upon more new evidence, become the prior in the next round – the prior for $j$ may be the inferred probability $p(D_i)$, but the prior for $j + 1$ is $p(D_i \mid R_j)$, and so on. More about multiple observations later.
9. It’s important to note that this is not necessarily the population incidence. For instance, the overall incidence and thus the relevant $D$ for EBOV ($D_{EBOV}$) is going to be different for a haemorrhagic fever referral lab in Kinshasa and a county hospital microbiology lab in Michigan.
10. Lock, R.J. et al. (1999). IgA anti-tissue transglutaminase as a diagnostic marker of gluten sensitive enteropathy. J Clin Pathol 52(4):274-7.
11. More epidemiopedantry: ‘incidence’ refers to new cases over time, ‘prevalence’ refers to cases at a moment in time.
12. This is, of course, unrealistic. I will do a walkthrough of an example of multiple symptoms that each have an association with the illness in a later post.
13. It’s assumed gender is irrelevant to this disease.
14. Presumably hoping that refusing to diagnose a patient with diphtheria and instead diagnosing them with a throat staph infection will somehow get the patient okay enough that nobody will notice the insanely prominent pseudomembrane…
15. Or not…

# Automagic epi curve plotting: part I

As of 24 May 2018, the underlying data schema of the Github repo from which the epi curve plotter draws its data has changed. Therefore, a lot of the code had to be adjusted. The current code can be found here on Github. This also plots a classical epi curve.

One of the best resources during the 2013-16 West African Ebola outbreak was Caitlin Rivers’ Github repo, which was probably one of the best ways to stay up to date on the numbers. For the current outbreak, she has also set up a Github repo, with really frequent updates straight from the WHO’s DON data and the information from the DRC Ministry of Public Health (MdlS) mailing list.1 Using R, I have set up a simple script that I only have to run every time I want a pre-defined visualisation of the current situation. I am usually doing this on a remote RStudio server, which makes it quite easy for me to quickly generate the plots on the fly.

### Obtaining the most recent data

Using the following little script, I grab the latest from the ebola-drc Github repo:

```r
# Fetch most recent DRC data.
library(magrittr)
library(curl)
library(readr)
library(dplyr)

current_drc_data <- Sys.time() %>%
  format("%d%H%M%S%b%Y") %>%
  paste("raw_data/drc/", "drc-", ., ".csv", sep = "") %T>%
  curl_fetch_disk("https://raw.githubusercontent.com/cmrivers/ebola_drc/master/drc/data.csv", .) %>%
  read_csv()
```

This uses curl (the R implementation) to fetch the most recent data and save it as a timestamped2 file in the data folder I set up just for that purpose.3 Simply sourcing this script (source("fetch_drc_data.R")) should then load the current DRC dataset into the environment.4

### Data munging

We need to do a little data munging. First, we melt down the data frame using reshape2’s melt function. Melting takes ‘wide’ data and converts it into ‘long’ data – for example, in our case, the original data had a row for each daily report for each health zone, and a column for the various combinations of confirmed/probable/suspected over cases/deaths. Melting the data frame down creates a variable type column (say, confirmed_deaths) and a value column (giving the value, e.g. 3). Using lubridate,5 the dates are parsed, and the values are stored in a numeric format.

```r
library(magrittr)
library(reshape2)
library(lubridate)

current_drc_data %<>% melt(value.name = "value",
                           measure.vars = c("confirmed_cases", "confirmed_deaths",
                                            "probable_cases", "probable_deaths",
                                            "suspect_cases", "suspect_deaths", "ruled_out"))
current_drc_data$event_date <- lubridate::ymd(current_drc_data$event_date)
current_drc_data$report_date <- lubridate::ymd(current_drc_data$report_date)
current_drc_data$value <- as.numeric(current_drc_data$value)
```

Next, we drop ruled_out cases, as they play no significant role for the current visualisation.
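
The original snippet for this step isn’t reproduced here; a minimal dplyr-based sketch of what it does might look like this (the column name variable is the default produced by melt above):

```r
library(dplyr)

# Drop the ruled_out category, which we don't plot, and keep everything else.
current_drc_data <- current_drc_data %>%
  filter(variable != "ruled_out")
```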

## Issues

There are some glitches that we’re trying to iron out:

• Cartography and GIS tools glitch a little due to issues with PROJ.4.
• GPU/CUDA support is not implemented, as this is not customarily used or provided on Linodes.
• certbot and Let's Encrypt are not really supported yet, as our boxes are never directly public-facing, but you should most definitely find a way to put your server behind some form of SSL/TLS.
• Currently, only Ubuntu 16.04 LTS is supported, as it’s the most widely used version. CRAN does not yet support more recent versions, but these will be added once CRAN support arrives.

New to Linodes? Linode is a great service offering a simple, uncomplicated way to set up a Linux server, just how you want it! 👍🏻 In fact, I’ve been using Linode servers for over four years, and have had only good experiences with them. I keep recommending them to friends, and they’re all happy! So if this post was helpful, and you’re thinking of signing up for a Linode, would you consider doing so using my affiliate link? I run this blog as a free service and pay for hosting out of pocket, and I don’t do advertisements (and never will!) or paid reviews of things (like what, paid Tyvek reviews? “Best Ebola PPE”? Nope.) – so really, it would make me so happy if you could help out this way.

As always, do let me know how things work out for you! Feel free to leave comments and ideas below. If you’re getting value out of using this script, just let me know.

# Herd immunity: how it really works

There are few concepts as trivial yet as widely misunderstood as herd immunity. In a sense, it’s not all that surprising, because frankly, there’s something almost magical about it – herd immunity means that in a population, some people who are not or cannot be immunized continue to reap the benefit of immunization. On its own, this may even be counter-intuitive. And so, unsurprisingly, like many evidently true concepts, herd immunity has its malcontents – going so far as to condemn the very idea as a ‘CDC lie’ – never mind that the concept was first used in 1923, well before the CDC was established.1

Now, let’s ignore for a moment what Dr Humphries, a nephrologist-turned-homeopath with a penchant for being economical with the truth when not outright lying, has to say – not because she’s a quack but because she has not the most basic idea of epidemiology. Instead, let’s look at this alleged ‘myth’ to begin with.

### Herd immunity: the cold, hard maths

Our current understanding of herd immunity is actually a result of increasing understanding of population dynamics in epidemiology, towards the second half of the 20th century. There are, on the whole, two ways to explain it. Both are actually the same thing, and one can be derived from the other.

#### The simple explanation: effective $R_0$ depletion

The simple explanation rests on a simplification that makes it possible to describe herd immunity in terms that are intelligible at the level of high school maths. In epidemiology, $R_0$ (pron. ‘arr-nought‘, like a pirate), describes the basic reproduction rate of an infectious disease.2 To put it at its most simplistic: $R_0$ is the number of cases produced by each case. The illustration on the side shows the index case (IDX) and the first two generations of an infection with $R_0 = 3$.

Now, $R_0$ is a theoretical variable. It is usually observationally estimated, and does not take measures intended to reduce it into account. And that’s where it gets interesting.

Consider the following scenario, where a third of the population is vaccinated, denoted by dark black circles around the nodes representing them. One would expect that of the 13 persons, a third, i.e. about 4, would remain disease-free. But in fact, over half of the people will remain disease-free, including three who are not vaccinated. This is because the person in the previous generation did not pass on the pathogen to them. In other words, preventing spread, e.g. by vaccination or by quarantine, can affect and reduce $R_0$. Thus in this case, the effective $R_0$ was closer to 1.66 than 3 – almost halving the $R_0$ by vaccinating only a third of the population.

We also know that for infections where the patient either dies or recovers, the infection has a simple ecology: every case must be ‘replaced’. In other words, if the effective $R_0$ falls below 1, the infection will eventually peter out. This happens quite often when everyone in a population is dead or immune after an infection has burned through it (more about that later).

Thus, the infection will be sustainable if and only if

$R_{0} \geq 1$

Under the assumption of a 100% efficient vaccine, the threshold value $\bar{p_V}$ after which the infection will no longer be able to sustain itself is calculated as

$\bar{p_V} = 1 - \frac{1}{R_0}$

Adjusting for vaccine efficacy, $E_V$, which is usually less than 100%, we get

$\bar{p_V} = \frac{1-\frac{1}{R_0}}{E_V} = \frac{R_0 - 1}{R_0 E_V}$

For a worked example, let’s consider measles. Measles has an $R_0$ around 15 (although a much higher value has been observed in the past, up to 30, in some communities), and the measles vaccine is about 96% effective. What percentage of the population needs to be vaccinated? Let’s consider $\bar{p_V}$, the minimum or threshold value above which herd immunity is effective:

$\bar{p_V} = \frac{R_0 - 1}{R_0 E_V} = \frac{15-1}{15 \cdot 0.96} = \frac{14}{14.4} \approx 97.22\%$
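
As a quick check, and an easy way to play with other parameters, the same threshold can be computed in a couple of lines of R. The measles numbers are the ones used above; the second call uses an arbitrary lower-$R_0$ pathogen for comparison.

```r
# Herd immunity threshold, adjusted for vaccine efficacy: (R0 - 1) / (R0 * E_V)
herd_threshold <- function(r0, efficacy) {
  (r0 - 1) / (r0 * efficacy)
}

herd_threshold(r0 = 15, efficacy = 0.96)   # measles: ~0.972, i.e. ~97.2%
herd_threshold(r0 = 6,  efficacy = 0.95)   # hypothetical lower-R0 pathogen: ~0.877
```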

#### The more complex explanation: $\mathcal{SIR}$ models

Note: this is somewhat complex maths and is generally not recommended unless you’re a masochist and/or comfortable with calculus and differential equations. It does give you a more nuanced picture of matters, but is not necessary to understand the whole of the argumentation. So feel free to skip it.

The slightly more complex explanation relies on a three-compartment model, in which the population is allotted into one of three compartments: $\mathcal{S}$usceptible, $\mathcal{I}$nfectious and $\mathcal{R}$ecovered. This model makes certain assumptions, such as that persons are infectious from the moment they’re exposed and that once they recover, they’re immune. There are various twists on the idea of a multicompartment model that takes into account the fact that this is not true for every disease, but the overall idea is the same.3 In general, multicompartment models begin with everybody susceptible, and a seed population of infectious subjects. Vaccination in such models is usually accounted for by treating them as ‘recovered’, and thus immune, from $t = 0$ onwards.

Given an invariant population (i.e. it is assumed that no births, deaths or migration occurs), the population can be described as consisting of the sum of the mutually exclusive compartments: $P = \mathcal{S}(t) + \mathcal{I}(t) + \mathcal{R}(t)$. For the same reason, the total change is invariant over time, i.e.

$\frac{d \mathcal{S}}{d t} + \frac{d \mathcal{I}}{d t} + \frac{d \mathcal{R}}{d t} = 0$

Under this assumption of a closed system, we can relate the volumes of each of the compartment to the transition probabilities $\beta$ (from $\mathcal{S}$ to $\mathcal{I}$) and $\gamma$ (from $\mathcal{I}$ to $\mathcal{R}$), so that:

$\frac{d \mathcal{S}}{d t} = - \frac{\beta \mathcal{I} \mathcal{S}}{P}$

$\frac{d \mathcal{I}}{d t} = \frac{\beta \mathcal{I} \mathcal{S}}{P} - \gamma \mathcal{I}$

$\frac{d \mathcal{R}}{d t} = \gamma \mathcal{I}$

Incidentally, in case you were wondering how this connects to the previous explanation: $R_0 = \frac{\beta}{\gamma}$.

Now, let us consider the end of the infection. If $\mathcal{S}$ is reduced sufficiently, the disease will cease to be viable. This does not need every individual to be recovered or immune, however, as is evident from dividing the first by the third differential equation and integrating and substituting $R_0$, which yields

$\displaystyle \mathcal{S}(t) = \mathcal{S}(0) e^{\frac{-R_0 (\mathcal{R}(t)-\mathcal{R}(0))}{P}}$

Substituting this in, the limit of $\mathcal{R}$, as $t$ approaches infinity, is

$\displaystyle \lim_{t\to\infty}\mathcal{R}(t) = P - \lim_{t\to\infty}\mathcal{S}(t) = P - \mathcal{S}(0) e^{\frac{-R_0 (\mathcal{R}(t)-\mathcal{R}(0))}{P}}$

From the latter, it is evident that

$\displaystyle \lim_{t\to\infty}\mathcal{S}(t) \neq 0 \mid \mathcal{S}(0) \neq 0$

In other words, once the infection has burned out, there will still be some individuals who are not immune, not immunised and not vaccinated. These are the individuals protected by herd immunity. This is a pretty elegant explanation for why herd immunity happens and how it works. There are three points to take away from this.

First, herd immunity is not unique to vaccination. The above finding in relation to the nonzero limit of $\lim_{t\to\infty}\mathcal{S}(t)$ holds as long as $\mathcal{S}(0) \neq 0$, but regardless of what $\mathcal{R}(0)$ is. In other words, herd immunity is not something artificial.

Second, for any $i \in \mathcal{S}$ (that is, any susceptible person) at time $t$, the probability of which compartment he will be in at $t+1$ depends on whom he encounters. That, statistically, depends on the relative sizes of the compartments. In this model, the assumption is that the sample of individuals $i$ will encounter will reflect the relative proportions of the individual compartments’ sizes. Thus if $i$ meets $n$ people at time $t$, each compartment will be proportionally represented, i.e. for any compartment $\mathcal{C}$, the proportion will be $\frac{\mathcal{C}(t)}{P-1}$ for all $\mathcal{C} \neq \mathcal{S}$, for which the proportion will be $\frac{\mathcal{S}(t) - 1}{P - 1}$, since one cannot meet oneself. Given that the transition probability $\beta_{i}(t)$ is assumed to equal the probability of meeting at least one element of $\mathcal{I}$, the following can be said. $i$’s risk of infection depends on the relationship of $n$ and $\mathcal{I}(t)$, so that $i$ is likely to get infected if

$\displaystyle n \frac{\mathcal{I}(t)}{P-1} \geq 1$

This elucidates two risk factors clearly, and the way to reduce them: reduce interactions (quarantine/self-quarantine), thereby reducing $n$, and reduce the proportion of infectious cases ($\frac{\mathcal{I}(t)}{P-1}$). The latter is where herd immunity from immunisation comes in. Recall that for a constant $n$, $i$‘s risk of infection at $t$ rises as $\mathcal{I}(t)$ rises.4 Recall also that while susceptible cases can turn into infectious cases, recovered (or vaccinated) cases cannot. And so, as $\mathcal{R}(0)$ converges to $P-1$,5 $i$‘s risk of infection at any time $t$, denoted by $\beta_{i}(t)$, falls. In other words,

$\displaystyle \lim_{\mathcal{R}(0) \to P-1} \beta_{i}(t) = 0$

Or to put it simply: the more people are vaccinated at the start, the lower the probability, all things being equal, of meeting someone who can pass on the infection.6
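
To make the compartment dynamics concrete, here is a minimal sketch that integrates the three equations above with a simple fixed-step loop in base R. The parameters ($\beta$, $\gamma$, the initially vaccinated fraction) are arbitrary illustrative values, not estimates for any real disease.

```r
# Minimal fixed-step SIR simulation with a vaccinated (i.e. initially 'recovered') fraction.
sir_sim <- function(P = 1e6, beta = 0.3, gamma = 0.1, vaccinated = 0.5,
                    I0 = 10, days = 365, dt = 0.1) {
  R <- vaccinated * P          # vaccinated individuals start in the R compartment
  I <- I0
  S <- P - I - R
  for (step in seq_len(ceiling(days / dt))) {
    new_inf <- beta * I * S / P * dt
    new_rec <- gamma * I * dt
    S <- S - new_inf
    I <- I + new_inf - new_rec
    R <- R + new_rec
  }
  c(S = S, I = I, R = R)       # final compartment sizes
}

sir_sim(vaccinated = 0)        # no vaccination: most of the population ends up infected
sir_sim(vaccinated = 0.7)      # above the 1 - 1/R0 = 0.667 threshold: S stays largely intact
```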

A final point to note is that this is primarily a model of statistical dynamics, and deals with average probabilities. It does not – it cannot – take account of the fact that some susceptible people are just darn unlucky, and bump into a flock of unvaccinated, shiny-eyed snowflakes. Equally, in some places, susceptible people and infected people congregate, creating a viral breeding ground, also known as a Waldorf school. There are agent-based models, which are basically attempts at brute force hacking reality, that can take account of such disparities. The takeaway is that herd immunity does not mean no susceptible individual will get infected. What it does mean is that their probability of getting infected is going to be significantly lower, for two reasons. First, given a constant number of encounters ($n$), the likelihood of one of them being with an infectious individual is going to be much lower. More importantly, however, because of herd immunity, the disease is going to be able to persist in the population for a far shorter time – eventually it will burn through the small number of ‘accessible’ susceptible persons. Since the cumulative risk $\beta_{i}^T$ for $i \in \mathcal{S}$ for an infection that dies out after time $T$ is defined as

$\beta_i^T = \displaystyle \int\limits_0^T \beta_{i}(t) \, \mathrm{d}t$

– the sooner the infection dies out, the smaller the likelihood that $i$ will be infected. With that mathematical basis, let’s tackle a few of the myths about herd immunity.

### Myth #1: herd immunity only works with naturally acquired immunity

This argument goes roughly along the following lines: herd immunity does exist, but it only exists if and where the immunity is acquired the ‘natural’ way, i.e. by surviving the disease. Case in point:

# MedDRA + VAERS: A marriage made in hell?

This post is a Golden DDoS Award winner

So far, this blog was DDoS’d only three times within 24 hours of its publication. That deserves a prize.

Quick: what do a broken femur, Henoch-Schönlein purpura, fainting, an expired vaccine and a healthy childbirth have in common? If your answer was “they’re all valid MedDRA codes”, you’re doing pretty well. If you, from that, deduced that they all can be logged on VAERS as adverse effects of vaccination, you’re exceeding expectations. And if you also realise that the idea that Jane got an expired HPV vaccine, and as a consequence broke her femur, developed Henoch-Schönlein purpura, and suddenly gave birth to a healthy baby boy is completely idiotic and yet can be logged on VAERS, you’re getting where I’m going.

MedDRA is a medical nomenclature specifically developed for the purposes of pharmacovigilance. The idea is, actually, not dreadful – there are some things in a usual medical nomenclature like ICD-10 that are not appropriate for a nomenclature used for pharmacovigilance reporting (V97.33: sucked into jet engine comes to my mind), and then there are things that are specific to pharmacovigilance, such as “oh shoot, that was not supposed to go up his bum!” (MedDRA 10013659: vaccine administered at inappropriate site), “we overdosed little Johnny on the flu vaccine!” (MedDRA 10000381: drug overdose, accidental) and other joys that generally do only happen in the context of pharmacovigilance. So far, so good.

At the same time, MedDRA is non-hierarchical, at least on the coding level. Thus, while the ICD code V97.33 tells you that you’re dealing with an external cause of mortality and morbidity (V and Y codes), specifically air and space transport (V95-97), more specifically ‘other’ specific air transport accidents, specifically getting sucked into a jet engine (V97.33), there’s no way to extract from MedDRA 10000381 what the hell we’re dealing with. Not only do we not know if it’s a test result, a procedure, a test or a disease, we are hopelessly lost as to figuring out what larger categories it belongs to. To make matters worse, MedDRA is proprietary – which is, in and of itself, extremely offensive to the idea of open research on VAERS and other public databases: a public database should not rely on proprietary encoding! – and it lacks the inherent logic of ICD-10. Consider the encoding of the clinical diagnosis of unilateral headache in both:

We know that an ICD code beginning with F will be something psychiatric and G will be neurological, and from that alone we can get some easy analytical approaches (a popular one is looking at billed codes and drilling down by hierarchical level of ICD-10 codes, something in which the ICD-10 is vastly superior to its predecessor). MedDRA, alas, gives us no such help.

## Garbage in, garbage out

OK, so we’ve got a nomenclature where the codes for needlestick injury, death, pneumonia, congenital myopathy and a CBC look all the same. That’s already bad enough. It gets worse when you can enter any and all of these into the one single field. Meet VAERS.

The idea of VAERS is to allow physicians, non-physicians and ‘members of the public’ to report incidents. These are then coded by the CDC and depending on seriousness, they may or may not be investigated (all reports that are regarded as ‘serious’ are investigated, according to the CDC). The problem is that this approach is susceptible to three particular vulnerabilities:

• The single field problem: VAERS has a single field for ‘symptoms’. Everything’s a symptom. This includes pre-existing conditions, new onset conditions, vaccination errors, lab tests (not merely results, just the tests themselves!), interventions (without specifying if they’re before or after the vaccine), and so on. There is also no way to filter out factors that definitely have nothing to do with the vaccine, such as a pre-existing birth defect. The History/Allergies field is not coded.
• The coding problem: what gets coded and what does not is sometimes imperfect. This being a human process, it’s impossible to expect perfection, but the ramifications of this for certain methods of analysis are immense. For instance, if there are 100 cases of uncontrollable vomiting, that may be a signal. But if half of those are coded as ‘gastrointestinal disorder’ (also an existing code), you have two values of 50, neither of which may end up being a signal.
• The issue of multiple coding: because MedDRA is non-hierarchical, it is not possible to normalise at a higher level (say, with ICD-10 codes, at chapter or block level), and it is not clear if two codes are hierarchically related. In ICD-10, if a record contains I07 (rheumatic tricuspid valve disease) and I07.2 (tricuspid stenosis with tricuspid insufficiency), one can decide to retain the more specific or the less specific entry, depending on intended purpose of the analysis.

In the following, I will demonstrate each of these based on randomly selected reports from VAERS.

### The Single Field Problem (SFP)

The core of the SFP is that there is only one codeable field, ‘symptoms’.

VAERS ID 375693-1 involves a report, in which the patient claims she developed, between the first and second round of Gardasil,

severe stomach pain, cramping, and burning that lasted weeks. Muscle aches and overall feeling of not being well. In August 2009 patient had flu like symptoms, anxiety, depression, fatigue, ulcers, acne, overall feeling of illness or impending death.

Below is the patient’s symptom transposition into MedDRA entities (under Symptoms):

The above example shows the mixture of symptoms, diagnostic procedures and diagnostic entities that are coded in the ‘Symptoms’ field. The principal problem with this is that when considering mass correlations (all drugs vs all symptoms, for instance), this system would treat a blood test just as much as a contributor to a safety signal as anxiety or myalgia, which might be true issues, or depression, which is a true diagnosis. Unfiltered, this makes VAERS effectively useless for market basket (co-occurrence frequency) analyses.

Consider for instance, that $PRR$ is calculated as

$PRR_{V,R} = \frac{\Sigma (R \mid V) / \Sigma (V)}{\Sigma (R \mid \neg V) / \Sigma (\neg V)} = \frac{\Sigma (R \mid V)}{\Sigma (V)} \cdot \frac{\Sigma (\neg V)}{\Sigma (R \mid \neg V)}$

where $V$ denotes the vaccine of interest, $R$ denotes the reaction of interest, and the $\Sigma$ operator denotes the sum of rows or columns that fulfill the requisite criteria (a more detailed, matrix-based version of this equation is presented here). But if $\{R\}$, the set of all $R$, contains not merely diagnoses but also various ‘non-diagnoses’, the PRR calculation will be distorted. For constant $V$ and an unduly large $R$, the values computationally obtained from the VAERS data that ought to be $\Sigma(R \mid V)$ and $\Sigma(R \mid \neg V)$ will both be inaccurately inflated. This will yield inaccurate final results.

Just how bad IS this problem? About 30% bad, if not more. A manual tagging of the top 1,000 symptoms (by $N$, i.e. by the number of occurrences) was used as an estimate for how many of the diagnostic entities do not disclose an actual problem with the vaccine.

According to the survey of the top 1,000 codes, only a little more than 70% of the codes themselves disclose a relevant issue with the vaccine. In other words, almost a third of disclosed symptoms must be pruned, and these cannot be categorically pruned because unlike ICD-10, MedDRA does not disclose hierarchies based on which such pruning would be possible. As far as the use of MedDRA goes, this alone should be a complete disaster.

Again, for effect: a third of the codes do not disclose an actual side effect of the medication. These are not separate or identifiable in any way other than manually classifying them and seeing whether they disclose an actual side effect or just an ancillary issue. Pharmacovigilance relies on accurate source data, and VAERS is not set up, with its current use of MedDRA, to deliver that.

### The coding problem

Once a VAERS report is received, it is MedDRA coded at the CDC. Now, no manual coding is perfect, but that’s not the point here. The problem is that a MedDRA code does not, in and of itself, indicate the level of detail it holds. For instance, 10025169 and 10021881 look exactly alike, where in fact the first is a lowest-level entity (an LLT – Lower-Level Term – in MedDRA lingo) representing Lyme disease, while the latter is the top-level class (SOC – System Organ Class) corresponding to infectious diseases. What this means is that once we see a MedDRA coded entity as its code, we don’t know what level of specificity we are dealing with.

The problem gets worse with named entities. You see, MedDRA has a ‘leaf’ structure: every branch must terminate in one or more (usually one) LLT. Often enough, LLTs have the same name as their parent PT, so you get PT Lyme disease and LLT Lyme disease. Not that it terrifically matters for most applications, but when you see only the verbose output, as is the case in VAERS, you don’t know if this is a PT, an LLT, or, God forbid, a higher level concept with a similar name.

Finally, to put the cherry on top of the cake, where a PT is also the LLT, they have the same code. So for Lyme disease, the PT and LLT both have the code 10025169. I’m sure this seemed like a good idea at the time.

### The issue of multiple coding

As this has been touched upon previously, because MedDRA lacks an inherent hierarchy, a code cannot be converted into its next upper level without using a lookup table, whereas with, say, ICD-10, one can simply normalise to the chapter and block (the ‘part left of the dot’). More problematically, however, the same code may be a PT or an LLT, as is the case for Lyme disease (10025169).
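
For contrast, here is a trivial sketch of what that normalisation looks like with ICD-10-style codes – something that is simply not possible with raw MedDRA codes without a lookup table. The example codes are arbitrary.

```r
# ICD-10-style codes can be normalised to the block level by dropping
# everything after the dot - no lookup table needed.
icd10_block <- function(codes) {
  sub("\\..*$", "", codes)
}

icd10_block(c("I07.2", "V97.33", "G43.909"))
# "I07" "V97" "G43"
```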

Let’s look at this formally. Let the operator $\in^*$ denote membership under the transitive closure of the set membership relation, so that

1. if $x \in A$, then $x \in^* A$,
2. if $x \in A$ and $A \subseteq B$, then $x \in^* B$.

and so on, recursively, ad infinitum. Let furthermore $\in^*_{m}$ denote the depth of recursion, so that

1. for $x \in A$:  $x \in^*_{0} A$,
2. for $x \in A \mid A \subseteq B$:  $x \in^*_{1} B$,

and, once again, so on, recursively, ad infinitum.

Then let a coding scheme $\{S_{1...n}\}$ exhibit the Definite Degree of Transitiveness (DDoT) property iff (if and only if) for any $S_m \mid m \leq n$, there exists exactly one $p$ for which it is true that $S_m \in^*_{p} S$.

Or, in other words, two codes $S_q, S_r \mid q, r \leq n$, may not be representable identically if $p_q \neq p_r$. Less formally: two codes on different levels may not be identical. This is clearly violated in MedDRA, as the example below shows.

### Bonus: the ethical problem

To me as a public health researcher, there is a huge ethical problem with the use of MedDRA in VAERS. I believe very strongly in open data and in the openness of biomedical information. I’m not alone: for better or worse, the wealth – terabytes upon terabytes – of biomedical data, genetics, X-ray crystallography, models and sequences proves that if I’m a dreamer, I’m not the only one.

Which is why it’s little short of an insult to the public that a pharmacovigilance system is using a proprietary encoding model.

Downloads from VAERS, of course, provide the verbose names of the conditions or symptoms, but not what hierarchical level they are, nor what structure they are on. For that, unless you are a regulatory authority or a ‘non-profit’ or ‘non-commercial’ (which would already exclude a blogger who unlike me has ads on their blog to pay for hosting, or indeed most individual researchers, who by their nature could not provide the documentation to prove they aren’t making any money), you have to shell out some serious money.

Worse, the ‘non-profit’ definition does not include a non-profit research institution or an individual non-profit researcher, or any of the research bodies that are not medical libraries or affiliated with educational institutions but are funded by third party non-profit funding:

There is something rotten with the use of MedDRA, and it’s not just how unsuitable it is for the purpose, it is also the sheer obscenity of a public database of grave public interest being tied to a (vastly unsuitable and flawed, as I hope it has been demonstrated above) nomenclature.

## Is VAERS lost?

### Resolving the MedDRA issue

Unlike quite a few people in the field, I don’t think VAERS is hopelessly lost. There’s, in fact, great potential in it. But the way it integrates with MedDRA has to be changed. This is both a moral point – a point of commitment to opening up government information – and one of facilitating research.

There are two alternatives at this point for the CDC.

1. MedDRA has to open up at least the 17% of codes, complete with hierarchy, that are used within VAERS. These should be accessible, complete with the hierarchy, within VAERS, including the CDC WONDER interface.
2. The CDC has to switch to a more suitable system. ICD-10 alone is not necessarily the best solution, and there are few alternatives, which puts MedDRA into a monopoly position that it seems to mercilessly exploit at the time. This can – and should – change.

### Moving past the Single Field Problem

MedDRA apart, it is crucial for VAERS to resolve the Single Field Problem. It is clear from the issues presented in the first paragraph – a broken femur, Henoch-Schönlein purpura, fainting, an expired vaccine and a healthy childbirth – that there is a range of issues that need to be logged. A good structure would be

1. pre-existing conditions and risk factors,
2. symptoms that arose within 6 hours of administration,
3. symptoms that arose within 48 hours of administration,
4. symptoms that arose later than 48 hours of administration,
5. non-symptoms,
6. clinical tests without results,
7. clinical tests segmented by positive and negative results, and
8. ancillary circumstances, esp. circumstances pertaining to vaccination errors such as wrong vaccine administered, expired vaccine, etc.

The use of this segmentation would be able to differentiate not only time of occurrence, but also allow for adequate filtering to identify the correct denominators for the $PRR$.

### A future with (for?) MedDRA

As said, I am not necessarily hostile to MedDRA, even if the closet libertarian in me bristles at the fact that MedDRA is mercilessly exploiting what is an effective monopoly position. But MedDRA can be better, and needs to be better – if not for its own economic interests, then for the interests of those it serves. There are three particular suggestions MedDRA needs to seriously consider.

1. MedDRA’s entity structure is valuable – arguably, it’s the value in the entire project. If coding can be structured to reflect its internal hierarchy, MedDRA becomes parseable without a LUT,1 and kinship structures become traversable without that extra lookup step.
2. MedDRA needs to open up, especially to researchers not falling within its narrowly defined confines of access. Especially given the inherent public nature of its use – PhV and regulation are quintessentially public functions, and this needs an open system.
3. MedDRA’s entity structure’s biggest strength is that it comprises a range of different things, from administrative errors through physical injuries to test results and the simple fact of tests.

## Conclusion

VAERS is a valuable system with a range of flaws. All of them are avoidable and correctable – but would require the requisite level of will and commitment – both on CDC’s side and that of MedDRA. For any progress in this field, it is imperative that the CDC understand that a public resource maintained in the public interest cannot be driven by a proprietary nomenclature, least of all one that is priced out of the range of the average interested individual: and if they cannot be served, does the entire system even fulfill its governmental function of being of the people and for the people? It is ultimately CDC’s asset, and it has a unique chance to leverage its position to ensure that at least as far as the 17% of MedDRA codes go that are used in VAERS, these are released openly.

In the end, however sophisticated our dissimilarity metrics, when 30% of all entities are non-symptoms and we need to manually prune the key terms to avoid denominator bloat due to non-symptom entities, such as diagnostic tests without results or clearly unconnected causes of morbidity and mortality like motor vehicle accidents, dissimilarity based approaches will suffer from serious flaws. In the absence of detailed administration and symptom tracking at an individual or institutional level, dissimilarity metrics are the cheapest and most feasible ways of creating value out of post marketing passive reports. If VAERS is to be a useful research tool, as I firmly believe it was intended to be, it must evolve to that capability for all.

References

1. Look-up table

# SafeGram: visualising drug safety

Update: an RMarkdown notebook explaining the whole process is available here.

Visualising vaccine safety is hard. Doing so from passive (or, as we say it in Britain, ‘spontaneous’!) pharmacovigilance (PhV) sources is even harder. Unlike in active or trial pharmacovigilance, where you are essentially dividing the number of incidents by the person-time or the number of patients in the cohort overall, in passive PhV, only incidents are reported. This makes it quite difficult to figure out their prevalence overall, but fortunately, we have some metrics we can use to better understand the issues with a particular medication or vaccine. The proportional reporting ratio ($PRR$) is a metric that can operate entirely on spontaneous reporting, and reflect how frequent a particular symptom is for a particular treatment versus all other treatments.

#### Defining $PRR$

For convenience’s sake, I will use the subscript $*$ operator to mean a row or column sum of a matrix, so that

$N_{i,*} = \displaystyle \sum_{j=1}^{n} N_{i,j}$

and

$N_{*,j} = \displaystyle \sum_{i=1}^{m} N_{i,j}$

and furthermore, I will use the exclusion operator $* \neg$ to mean all entities except the right hand value. So e.g.

$N_{i, * \neg k} = \displaystyle \sum_{j=1, j \neq k}^{n} N_{i,j}$

Conventionally, the PRR is often defined with reference to a 2×2 contingency table that cross-tabulates treatments ($m$ axis) with adverse effects ($n$ axis):

|  | Reaction of interest ($j$) | All other reactions ($\neg j$) | TOTAL |
|---|---|---|---|
| Treatment of interest ($i$) | $a = D_{i,j}$ | $b = D_{i, * \neg j}$ | $a + b = D_{i, *} = \displaystyle \sum_{l = 1}^{n} D_{i, l}$ |
| All other treatments ($\neg i$) | $c = D_{* \neg i, j}$ | $d = D_{* \neg i, * \neg j}$ | $c + d = D_{* \neg i, *} = \displaystyle \sum_{k=1, k \neq i}^{m} \sum_{l = 1}^{n} D_{k, l}$ |

With reference to the contingency table, the $PRR$ is usually defined as

$\frac{a / (a+b)}{c / (c+d)} = \frac{a}{a + b} \cdot \frac{c + d}{c}$

However, let’s formally define it over any matrix $D$.

Definition 1. $PRR$. Let $D$ be an $m \times n$ matrix that represents the frequency with which each of the $n$ adverse effects occurs for each of the $m$ treatments, so that $D_{i,j}$ ($i \leq m$, $j \leq n$) represents the number of times the adverse effect $j$ has occurred with the treatment $i$.

For convenience’s sake, let $D_{*,j}$ denote $\sum_{i=1}^{m} D_{i,j}$, let $D_{i,*}$ denote $\sum_{j=1}^{n} D_{i,j}$, and let $D_{*,*}$ denote $\sum_{i=1}^{m} \sum_{j=1}^{n} D_{i,j}$. Furthermore, let $D_{* \neg i, j}$ denote $\sum_{k \neq i}^{m} D_{k,j}$ and $D_{i, * \neg j}$ denote $\sum_{k \neq j}^{n} D_{i, k}$.

Then, $PRR$ can be calculated for each combination $D_{i,j}$ by the following formula:

$PRR_{i,j} = \frac{D_{i,j} / D_{i,*}}{D_{* \neg i, j} / D_{* \neg i, *}} = \frac{D_{i,j}}{D_{i,*}} \cdot \frac{D_{*\neg i, *}}{D_{*\neg i, j}}$

Expanding this, we get

$PRR_{i,j} = \frac{D_{i,j}}{\displaystyle\sum_{q=1}^n D_{i,q}} \cdot \frac{\displaystyle\sum_{r=1, r\ne i}^{m} \displaystyle\sum_{s=1}^{n} D_{r,s}}{\displaystyle\sum_{t=1, t\ne i}^{m} D_{t,j}}$

Which looks and sounds awfully convoluted until we start to think of it as a relatively simple query operation: calculate the sum of each row; divide the count of the ADR of interest with the treatment of interest by all reports for the treatment of interest; divide the count of the ADR of interest with all other treatments ($j \mid \neg i$, or $c$) by all reports for all treatments other than the treatment of interest; and finally take the quotient of the two. Easy peasy!
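
As a sketch of that query operation in R – the matrix D and its dimnames here are made-up toy counts, not real VAERS data:

```r
# PRR for treatment i (row) and reaction j (column) of a count matrix D:
# (D[i, j] / sum(D[i, ])) / (sum(D[-i, j]) / sum(D[-i, ]))
prr <- function(D, i, j) {
  (D[i, j] / sum(D[i, ])) / (sum(D[-i, j]) / sum(D[-i, ]))
}

# Toy example: 3 treatments x 4 reactions
D <- matrix(c(20,  5,  1,  4,
               2, 30,  2,  6,
               3,  4, 25,  8),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("treatment_", 1:3),
                            paste0("reaction_", 1:4)))

prr(D, i = 1, j = 1)   # ~10.7: reaction_1 is reported disproportionately often for treatment_1
```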

### Beyond $PRR$

However, the PRR only tells part of the story. It does show whether a particular symptom is disproportionately often reported – but does it show whether that particular symptom is frequent at all? Evans (1998) suggested using a combination of an $N$-minimum, a $PRR$ value and a chi-square value to identify a signal.1 In order to represent the overall safety profile of a drug, it’s important to show not only the $PRR$ but also the overall incidence of each risk. The design of the SafeGram is to show exactly that, for every side effect known to have occurred. To give a better estimate, instead of plotting individual points (there are several hundred, or even thousands, of different side effects), the kernel density is plotted.
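
A minimal ggplot2 sketch of that idea – plotting the two-dimensional kernel density of reporting frequency against $PRR$ for one product – might look like this. The data frame safety and its columns (n_reports, prr) are hypothetical placeholders for whatever per-side-effect summary you have computed, not the original SafeGram code.

```r
library(ggplot2)

# Hypothetical per-side-effect summary for one vaccine:
# one row per side effect, with its report count and PRR.
set.seed(1)
safety <- data.frame(
  n_reports = rlnorm(500, meanlog = 3, sdlog = 1.5),
  prr       = rlnorm(500, meanlog = 0, sdlog = 0.8)
)

ggplot(safety, aes(x = prr, y = n_reports)) +
  geom_density_2d_filled(alpha = 0.8) +              # 2D kernel density instead of points
  geom_vline(xintercept = 3, linetype = "dashed") +  # PRR cut-off used in the text
  scale_x_log10() +
  scale_y_log10() +
  labs(x = "Proportional reporting ratio (PRR)",
       y = "Number of reports")
```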

The reason why SafeGrams are so intuitive is because they convey two important facts at once. First, the PRR cut-off (set to 3.00 in this case) conclusively excludes statistically insignificant increases of risk.2 Of course, anything above that is not necessarily dangerous or proof of a safety signal. Rather, it allows the clinician to reason about the side effect profile of the particular medication.

• The meningococcal vaccine (left upper corner) had several side effects that occurred frequently (hence the tall, ‘flame-like’ appearance). However, these were largely side effects that were shared among other vaccines (hence the low PRR). This is the epitome of a safe vaccine, with few surprises likely.
• The injectable polio vaccine (IPV) has a similar profile, although the widely disseminated ‘margin’ (blue) indicates that it has a wider range of side effects compared to the meningococcal vaccine, even though virtually all of these were side effects shared among other vaccines to the same extent.
• The oral polio vaccine (OPV, left bottom corner) shows a flattened pattern typical for vaccines that have a number of ‘peculiar’ side effects. While the disproportionately frequently reported instances are relatively infrequent, the ‘tail-like’ appearance of the OPV SafeGram is a cause for concern. The difference between meningococcal and IPV on one hand and OPV on the other is explained largely by the fact that OPV was a ‘live’ vaccine, and in small susceptible groups (hence the low numbers), they could provoke adverse effects.
• The smallpox vaccine, another live vaccine, was known to have a range of adverse effects, with a significant part of the population (about 20%) having at least one contraindication. The large area covered indicates that there is a rather astonishing diversity of side effects, and many of these – about half of the orange kernel – lies above the significance boundary of 3.00. The large area covered by the kernel density estimate and the reach into the right upper corner indicates a very probable safety signal worth examining.

### Interpretation

A SafeGram for each vaccine shows the two-dimensional density distribution of two things: the frequency and the proportional reporting ratio of each adverse effect for that vaccine (or drug, or device, or whatever it is applied to). When considering the safety of a particular product, the most important question is whether a particular adverse effect is serious – a product with a low chance of an irreversible severe side effect is riskier than one with a high probability of a relatively harmless side effect, such as localized soreness after injection. But the relative severity of a side effect is hard to quantify, and a better proxy is to assume that, in general, most severe side effects will be unique to a particular vaccine. So, for instance, while injection site reactions and mild pyrexia following inoculation are common to all vaccines – and hence their relative reporting rates are low, reflecting roughly the number of inoculations administered – serious adverse effects tend to be more particular to the vaccine (e.g. the association of influenza vaccines with Guillain-Barré syndrome in certain years means that GBS has an elevated PRR for the flu vaccines, despite the low number of occurrences). Once vaccines with a very low number of administered doses are discarded, the SafeGram remains robust to differences in the number of doses administered.

Fig. 1 above shows a number of typical patterns. In general, anything to the left of the vertical significance line can be safely ignored: these are effects shared with most other vaccines, and they exhibit no specific risk signal for the particular vaccine. On the other hand, occurrences to the right of the vertical significance line may – but do not necessarily – indicate a safety signal. Of particular concern are right upper quadrant signals: these are frequent and at the same time peculiar to a particular vaccine, suggesting that they are not part of the typical post-inoculation syndrome (fever, fatigue, malaise) arising from immune activation but rather a specific issue created by the antigen or the adjuvant. In rare cases, there is a lower right corner ‘stripe’, such as for the OPV, where a wide range of unique but relatively infrequent effects is produced. These, too, might indicate the need for closer scrutiny. It is crucial to note that merely having a density of signals in the statistically significant range does not automatically mean that there is a PhV concern, but rather that such a concern cannot be excluded.

Setting the PRR significance limit is somewhat arbitrary, but Evans et al. (2001) found a PRR of 2, more than 3 cases over a two-year period and a chi-square statistic of 4 or above to be suggestive of a safety signal. Following this lead, the original SafeGram code looks at a PRR of 3.0 and above, and disregards cases with an overall frequency below $3Y$, where $Y$ denotes the number of years considered.
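As a rough illustration – again a sketch, not the original SafeGram code – these Evans-style criteria can be checked for a single contingency-table cell with SciPy’s `chi2_contingency`; the exact thresholds and the use of Yates’ correction below are assumptions:

```python
# Hedged sketch: flag one (treatment, ADR) pair as a potential signal using
# Evans-style criteria: more than a minimum number of cases, a PRR threshold,
# and a chi-square threshold. The thresholds here are illustrative assumptions.
import numpy as np
from scipy.stats import chi2_contingency

def is_potential_signal(a, b, c, d, min_cases=3, min_prr=2.0, min_chi2=4.0):
    """a, b, c, d: the four cells of the 2x2 contingency table defined above."""
    if a <= min_cases:                       # Evans: more than 3 cases
        return False
    prr = (a / (a + b)) / (c / (c + d))      # PRR straight from the contingency table
    chi2, _, _, _ = chi2_contingency(np.array([[a, b], [c, d]]))  # Yates-corrected by default
    return prr >= min_prr and chi2 >= min_chi2

# Toy example: 12 reports of the ADR out of 500 reports for the drug of interest,
# versus 30 reports of the same ADR out of 10,000 reports for all other drugs –
# a PRR of 8, which this rule flags as a potential signal.
print(is_potential_signal(12, 488, 30, 9970))
```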

### Limitations

The SafeGram inherently tries to make the best of imperfect data: passive reporting data is subject to a number of imperfections, and some caveats need to be kept in mind.

• The algorithm assigns equal weight to every ‘symptom’ reported. VAERS uses an unfiltered version of MedDRA, a coding system for regulatory activities, and this includes a shocking array of codes that do not suggest any pathology. For instance, the VAERS implementation of MedDRA contains 530 codes for normal, non-pathological states (e.g. “abdomen scan normal”), and almost 18,000 (!) events involve at least one of these ‘everything is fine!’ markers. Such codes may be clinically useful, since they can assist in differential diagnosis and in excluding other causes of symptoms, but because they are not treated separately from genuinely pathological symptoms, they corrupt the data to a minor but not insignificant extent. The only solution is manual filtering, and with tens of thousands of MedDRA codes, one would not necessarily be inclined to do so. The consequence is that some symptoms aren’t symptoms at all – they’re the exact opposite. This is not a problem for the $PRR$ because it compares a symptom among those taking a particular medication against the same symptom among those who are not.
• A lot of VAERS reports are, of course, of low quality, and there is no way for the SafeGram to differentiate between high- and low-quality reports. This is a persistent problem with all passive reporting systems.
• The SafeGram gives an overall picture of a particular drug’s or vaccine’s safety. It does not differentiate between the relative severity of a particular symptom.
• As usual, correlation does not equal causation. As such, none of this proves the actual risk or danger of a vaccine; it merely shows correlations – in other words, potential safety signals that are worth examining.

SafeGrams are a great way to show the safety of vaccines, and to identify which vaccines have frequently occurring and significantly distinct (high-$PRR$) AEFIs that may be potential signals. It is important to note that for most common vaccines, including controversial ones like HPV, the centre of the kernel density estimate sits below the $PRR$ signal limit. The SafeGram is a useful and visually appealing proof of the safety of vaccines, one that can extract actionable intelligence from VAERS passive reporting evidence that is often disregarded as useless.

References

1. Evans, S. J. W. et al. (1998). Proportional reporting ratios: the uses of epidemiological methods for signal generation. Pharmacoepidemiol Drug Saf, 7(Suppl 2), 102.
2. According to Evans et al., the correct figure for $PRR$ exclusion is 2.00, but they also use an $N$-restriction and a minimum chi-square of 4.0.

# Out for tapas with Katie and Dad!

Pata Negra is an excellent little tapas place near the riverside, halfway between the hill we live on and Újlak to the north. The food is sometimes a little hit and miss – the red wine chorizo is great, but some staples are not that brilliant and others are outright odd. Still, overall a great place with decent atmosphere and a Sangría that should come with its own weekend safety brief  ⭐️⭐️⭐️⭐️

Taken on Feb 25, 2018 @ 20:16 near Pata Negra Budapest, this photo was originally posted on my Instagram. You can see the original on Instagram by clicking here.

# Krak’n Town is crackin’!

Krak’n Town is crackin’! We visited Budapest’s one and only steampunk themed restaurant for a belated Valentine’s Day dinner, and had an absolute blast. Great place, good atmosphere and quirky yet yummy dishes! Highly recommended. ⭐️⭐️⭐️⭐️⭐️

Taken on Feb 15, 2018 @ 17:45 near Krak'n Town, this photo was originally posted on my Instagram. You can see the original on Instagram by clicking here.