Misrepresenting Vaccine Statistics: Probably Bullshit (p=0.01)

How can five different statements about the efficacy of the same vaccine all be technically correct, while at least four of them are misleading, probably intentionally?


A recent article in The Guardian, “UK defends Oxford vaccine as Germany advises against use on over-65s”, provides a perfect example.

Luckily the article contains one statement that is both accurate and not misleading, so I’ll start with that (emphasis mine):

It is really important to read what has been said. German scientists conclude that the Oxford/AstraZeneca vaccine is safe and it is effective for under-65s. Their assessment is that effectiveness is not yet demonstrated for over-65s. They have not said the vaccine is ineffective for over-65s. Scientists often disagree about how much evidence is needed for any new advance and there is always more data to be secured.

Professor Jim Naismith, Professor of structural biology at the University of Oxford

Paraphrasing, this is what the German regulators have said: “There is good evidence that the AZ vaccine works safely for people up to 65. There isn’t enough evidence yet to show whether or not it works, safely, for people over 65, so until we have more evidence we won’t use it on them.”

By “enough evidence” they mean “statistically significant evidence”, evidence in which they can have confidence, and that’s the critical element.

Statistical significance: or, how confident are we this isn’t a fluke?

If you’re a bit hazy on statistical significance, please keep reading this quick explainer because this concept lies at the heart of the problem with all of the apparently contradictory statements in The Guardian article. If you understand statistical significance, confidence intervals, p values and so on, then skip to the next section.

Stick with me, there won’t be any mathematics…

Let’s start with a hypothetical problem: you’ve developed a Covid-19 vaccine, and you want to find out whether it works. You select 200 random people, and you give 100 of them the vaccine—the treatment group—and 100 of them—the controls—you give a harmless saline solution. Then you wait a couple of months.

Two months later, two of the control group have got married, while only one of the treatment group has…

“Hang on!” you cry. “What has getting married got to do with a Covid-19 vaccine?”

Maybe nothing, maybe something. What if the vaccine significantly suppresses sex-drive, leading, amongst other things, to fewer marriages? That’s a potentially serious side-effect. How can we tell whether the difference in number of marriages is just due to chance, or whether it’s actually a statistically significant difference? To put it another way, how confident are we that we’re seeing a real effect of the vaccine, and not just random variation that’s totally unrelated? A difference of one in a couple of hundred is quite possibly due to chance.
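For anyone who does want to see the arithmetic behind “quite possibly due to chance”, here is a minimal sketch of one standard way to check it: Fisher’s exact test, applied to the made-up marriage numbers above (1 of 100 versus 2 of 100). The function name and its arguments are my own for illustration, and this is just one reasonable choice of test, not the only one.

```python
from math import comb

def fisher_exact_two_sided(events_a, size_a, events_b, size_b):
    """Two-sided Fisher exact test p-value for two groups (exact integer arithmetic)."""
    total_events = events_a + events_b

    def ways(k):
        # Number of ways group A could end up with k of the events
        return comb(size_a, k) * comb(size_b, total_events - k)

    k_min = max(0, total_events - size_b)
    k_max = min(size_a, total_events)
    observed = ways(events_a)
    all_ways = [ways(k) for k in range(k_min, k_max + 1)]
    # Add up every split that is no more likely than the one actually observed
    extreme = sum(w for w in all_ways if w <= observed)
    return extreme / sum(all_ways)

# Hypothetical marriage example: 1 of 100 vaccinated, 2 of 100 controls
p_value = fisher_exact_two_sided(1, 100, 2, 100)
print(p_value)  # 1.0
```

A p-value of 1.0 means that every possible split of three marriages between the two groups is at least as lopsided as the one we saw, so this result is no evidence of anything.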

Now, say we did the same trial twenty times, and nineteen out of twenty times, twice as many people in the control group got married as in the treatment group. At that point we’d have pretty strong evidence that the vaccine really is affecting something other than Covid-19 resistance.

But how do we make that decision if we only have the results of a single trial, with such a tiny difference in numbers? The answer is statistics. Using the size of the sample and the size of the effect, we can calculate how likely it is that a difference like the one we’ve seen (in this case, in marriages) would turn up by chance alone, and so how confident we can be that our vaccine is responsible. The larger the number of people, and the larger the difference between the groups, the greater our confidence.
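To make the point about sample size concrete, here is a rough sketch that keeps the same hypothetical rates (1% in one group, 2% in the other) and only changes how many people are in each group. It uses a pooled two-proportion z-test, which relies on a normal approximation and is crude when the counts are tiny, so treat the smallest case as indicative only. The function name and the trial sizes are assumptions for illustration.

```python
from math import erfc, sqrt

def two_proportion_p_value(events_a, size_a, events_b, size_b):
    """Two-sided p-value from a pooled two-proportion z-test (normal approximation)."""
    rate_a = events_a / size_a
    rate_b = events_b / size_b
    pooled = (events_a + events_b) / (size_a + size_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / size_a + 1 / size_b))
    z = abs(rate_a - rate_b) / std_err
    return erfc(z / sqrt(2))  # two-sided tail area of the standard normal

# Same 1% versus 2% rates at three hypothetical trial sizes
for per_group in (100, 1_000, 10_000):
    p = two_proportion_p_value(per_group // 100, per_group, per_group // 50, per_group)
    print(per_group, p)
```

With 100 people per group the p-value is above 0.5, which is indistinguishable from chance. With 10,000 per group the very same rates give a p-value far below 0.001.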

In the real world, and particularly with urgent vaccine trials, nobody can afford to do the same trial twenty times. So when you have only a single trial, a small number of people in a particular sub-group (say, over-65s), and possibly only a small difference between the treatment and control groups, it’s impossible to say with any degree of confidence that the difference is a real effect and not just chance. You could possibly say that the results are “suggestive” or “encouraging”, but in reality you can’t be certain that in a repeat trial or a larger trial they won’t turn out to be just chance.

That’s precisely the problem with the AstraZeneca trial. The vaccine was tested on a large group of people aged 18 to 55 but only a small number over 55, and an even smaller number over 65. There are results, but the numbers are so small it’s not possible to say with confidence that they are significant and not just due to chance, and that’s true both for efficacy and for possible harmful side-effects.
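Here is a sketch of what “the numbers are so small” does to an efficacy estimate. The counts below are invented for illustration and are NOT the AstraZeneca trial data. Efficacy is taken as one minus the relative risk, with an approximate 95% confidence interval from the usual log relative-risk method. The function name is mine, and it assumes at least one case in each group.

```python
from math import exp, log, sqrt

def efficacy_with_interval(cases_vax, n_vax, cases_ctrl, n_ctrl, z=1.96):
    """Vaccine efficacy (1 - relative risk) with an approximate 95% interval."""
    rel_risk = (cases_vax / n_vax) / (cases_ctrl / n_ctrl)
    # Standard error of log(relative risk); needs nonzero case counts
    se_log_rr = sqrt(1 / cases_vax - 1 / n_vax + 1 / cases_ctrl - 1 / n_ctrl)
    rr_low = exp(log(rel_risk) - z * se_log_rr)
    rr_high = exp(log(rel_risk) + z * se_log_rr)
    # Efficacy runs opposite to relative risk, so the bounds swap
    return 1 - rel_risk, 1 - rr_high, 1 - rr_low

# Invented counts, not trial data: both groups show the same 50% point estimate
small = efficacy_with_interval(2, 300, 4, 300)       # small sub-group
large = efficacy_with_interval(20, 3000, 40, 3000)   # ten times larger

for label, (eff, low, high) in (("small", small), ("large", large)):
    print(f"{label}: efficacy {eff:.0%}, interval {low:.0%} to {high:.0%}")
```

In the small group the interval runs from well below zero (the vaccine making things worse) to high efficacy, so the 50% figure tells you almost nothing. In the larger group the interval stays above zero, which is the kind of result you can have some confidence in.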

Lying by omission and misdirection

So now we understand the reason for the Germans’ decision: the data they were provided didn’t have enough results for people over 65 for them to draw any confident conclusions. They’ve drawn no conclusions either way, because there isn’t enough data.

So now, let’s parse what everyone has said about their decision.

Let’s start with AstraZeneca themselves:

Reports that the AstraZeneca/Oxford vaccine efficacy is low in adults over 65 years are not an accurate reflection of the totality of the data.

AstraZeneca, 2021-01-28

Notice the important difference between what the German authorities actually said, and what AstraZeneca is pretending they said. The Germans never said “efficacy is low”. Neither did they say “efficacy is high”. They said “efficacy is not demonstrated in a statistically significant way in which we can have confidence”. So AstraZeneca’s statement actually has no relevance to what the Germans said, although it sounds like a science-based refutation.

Notice also that AstraZeneca’s statement is scrupulously true. Nobody can truthfully say that efficacy is low. What they’re conveniently not saying is that, equally, nobody can truthfully say that the efficacy is high; in fact nobody can say—with sufficient confidence—what the efficacy is. So they’re actually making a very misleading statement which, superficially, sounds as though not only are the Germans wrong, but that the vaccine is efficacious in over 65s. Misleading by omission.

In non-statistical terms, bullshit.

Now the UK regulators:

Current evidence does not suggest any lack of protection against Covid-19 in people aged 65 or over. The data we have shows that the vaccine produces a strong immune response in the over-65s.

Dr June Raine, the chief executive of the UK Medicines and Healthcare products Regulatory Agency (MHRA)

Again, argument by misdirection and omission. Dr Raine could equally truthfully have said “current evidence does not reliably suggest a protection against Covid-19 in people aged 65 or over,” because the evidence isn’t strong enough to make either statement, which is why she has used that key word “suggest”. However she chose the former and omitted the latter. She goes on to say “the data we have shows…”, which, again, is true but only as far as it goes. What she conveniently didn’t go on to say is “however, we don’t have enough data to say whether we can have much confidence in this result”. Misleading by omission.

Now the UK Prime Minister:

This is a local German decision and the EMA will, as I understand it, be approving it for general use. I think that’s very sensible of the EMA, because that is the vaccine our own MHRA has said produces an immune response in all age groups, as a good vaccine, so I’m confident about it.

UK Prime Minister, Boris Johnson

Boris Johnson is a politician, so we must set the bar low in our expectations of transparency or even truthfulness. First of all he predicts the outcome of an independent foreign expert group of which he’s not a member, saying “as I understand it”, which is political for “I actually have no idea, but this is what I hope for”. Then, like all the others, he’s avoided an outright lie by following with a misleading true statement. He’s said the vaccine “produces an immune response in all age groups,” but leaves out the critical piece of information, which is “however we don’t have enough data to say with any confidence how reliable or strong the immune response is in over 65s.” So, misleading by omission.

Finally an “independent” expert who happens to work in the UK:

Every country is in a different situation. Advice on the use of a vaccine will depend on its availability and the availability of other vaccines. For the Oxford/AstraZeneca vaccine, in Germany and the rest of the EU there is a shortage, as is well known. It must be emphasised that this is not a regulatory decision but draft advice on usage. It is in a context where supplies of the Pfizer/BioNTech vaccine, for which data in older people shows similar efficacy as in younger people, are relatively plentiful.

Professor Stephen Evans, London School of Hygiene & Tropical Medicine

Subtle misdirection here. Evans is suggesting that the Germans’ decision is due to the combination of a shortage of AZ and the availability of Pfizer, which may be true; however, it’s not what the German regulator said. He does point out that there is data for efficacy in over-65s for the Pfizer vaccine, but omits the equally important observation that there isn’t data for AZ, which is precisely the reason the Germans did give. So he’s invented a reason for the Germans that they didn’t give and has omitted the one they did, and in doing so is subtly suggesting that if they didn’t have Pfizer they would have approved AZ. Of course, there’s no evidence for that at all, and again, while being true as far as it goes, it’s misleading. In fact in this case it’s misleading by commission and omission.

So, in summary, five descriptions of the same vaccine and the same situation, only one of which is both true and complete, and four others which are deliberately misleading by omission or commission or both. Perhaps unfortunately for The Guardian, they’ve repeated the misleading statement by the PM and MHRA under the headline, without the Germans’ rejoinder “because there isn’t enough data”.

As the saying goes, there are lies, damned lies, and statistics…


Two of the agencies quoted have hinted at the possibility that the Germans’ decision might be based either on insufficient data or incorrect statistical reasoning. AstraZeneca themselves refer to “the totality of the data”, and the MHRA refers to “the data we have”.

In AstraZeneca’s case, this is really a bit of an insult. You’d have to assume that any drug company seeking approval would take great pains to provide the regulatory bodies with all available evidence, particularly if a possible problem was insufficient evidence. So we have to assume that the Germans indeed have “the totality of the data”, even if that’s more than is currently in the public domain. So what AstraZeneca is really saying is “you’re not doing your calculations correctly,” or at least “our conclusions differ from yours,” but in mentioning the data they’re implying that it’s more likely to be calculations.

In the case of the MHRA, saying “the data we have” could be either a get-out clause in case they’ve made a decision based on earlier data, or again a subtle insult suggesting that, if the Germans are operating on the same data, they’ve reached the wrong conclusion. In both cases there’s really no need to mention the data unless there’s a genuine possibility that someone hasn’t got all of it, in which case you should be urgently addressing that, not issuing press releases.

Mentioning the available data does highlight the problem faced by independent observers, however, which is that all of us can only operate on data that has been published in the public domain. It goes without saying that on a subject as critical to global health as this there shouldn’t be anything that’s being used for public policy making that isn’t also available to the public, and hiding behind unpublished information presents as much risk as not enough data in the first place. 

Obligatory disclaimer: I am not an epidemiologist, virologist, vaccine expert or expert in public health. I do have a tertiary degree containing mathematics, statistics and logic, and all the facts in here are drawn or derived from quoted published references, however all of this is purely my personal opinion. If you have any health concerns, please consult a qualified health professional. May contain nuts, keep away from small children and naked flames.
