It can be frustrating to watch the non-scientists who run our government deferring to scientists who seem to have less than a full grasp of statistics. Promising approaches to mitigating the coronavirus are passed over, often, it seems, because of a lack of statistical training: most often a misunderstanding of the word “significance”, and of what it properly means when used as the criterion for deciding whether or not to adopt a particular suggested approach.
In statistics, a result is usually called “significant” when there is less than a 1 in 20 chance (or, on the stricter standard, 1 in 100) of seeing a result that strong if chance alone were at work. Since hundreds of studies of different treatments and approaches are carried out in the medical world, many of them will show a “result” that is really just the equivalent of a lucky run of coin flips. Flip a fair coin enough times and heads will at some point come up ten times in a row, but that does not mean the coin is biased towards heads. Statistics tells you, in this particular case through the mathematics of multiple testing, how often you are likely to get such a run, and how strong a result has to be before you can be reasonably sure it did not arise by chance. Scientists are rightly chary of proceeding on the basis that a single test gave a valid result, given the danger that the result was just one lucky fluke in a swathe of different tests. (And they correctly raise the bar further by insisting that trials be randomised, double-blind, and so on.)
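The multiple-testing danger can be made concrete with a little arithmetic. A minimal sketch in Python; the study counts here are illustrative, not drawn from any real body of trials:

```python
# If a treatment has no real effect, a single study still has a 5%
# chance of crossing the p < 0.05 "significance" line by fluke.
# Run many independent studies of useless treatments, and the chance
# that at least one of them looks "significant" grows rapidly.

def prob_at_least_one_false_positive(n_studies, alpha=0.05):
    """Chance that at least one of n independent null studies
    produces a 'significant' result purely by chance."""
    return 1 - (1 - alpha) ** n_studies

for n in (1, 10, 100):
    print(f"{n:3d} studies -> {prob_at_least_one_false_positive(n):.1%}")
# With 100 studies of ineffective treatments, the chance of at least
# one fluke "positive" exceeds 99% -- hence the wariness of lone results.
```

This is exactly why a single eye-catching result, viewed against a background of many trials, commands so little trust on its own.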
But by doing all this, the scientists can miss a valid result. Say you are doing a novel test, the first of its type (i.e. no one else has yet done a similar one), and you get an interesting result, but the statistics tell you there is a 1 in 10 chance that a result this strong would have appeared even if the treatment did nothing. That fails to meet the gold standard of “significance”, so the medics advising the government shove the result to one side, if it has even got as far as catching their attention, which is mostly unlikely. They don’t investigate the matter further. But remember: a result with only a 1 in 10 chance of being a fluke is far from worthless. It is weaker evidence, not no evidence, and it deserves investigation rather than dismissal.
Or say there are then a number of similar (but not identical) tests, each showing a positive result but each just falling short of the gold standard: a 1 in 10 chance of fluke for each, say. There are proper statistical ways of pooling such results, meta-analytic techniques such as Fisher’s method, which can show that the experiments collectively meet the gold standard even when none of them individually does. But the non-statistician medical and epidemiological scientists often lack the wherewithal to apply them, so again the message comes back: unless you bring us a double-blind, randomised, 99% confidence result, we won’t consider this possibly promising approach.
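Fisher’s method is the textbook way to pool independent p-values: under the null hypothesis, minus twice the sum of their logarithms follows a chi-squared distribution with two degrees of freedom per study. A sketch in Python, using the article’s hypothetical of three studies each at p = 0.1 (not real trials); for an even number of degrees of freedom the chi-squared tail has a closed form, so no statistics library is needed:

```python
import math

def fisher_combined_p(p_values):
    """Fisher's method: combine k independent p-values into one.

    Under the null, -2 * sum(ln p_i) follows a chi-squared
    distribution with 2k degrees of freedom. For even df the
    survival function is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    x = -2 * sum(math.log(p) for p in p_values)
    half = x / 2
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))

# Three similar studies, each individually short of the 0.05 bar:
combined = fisher_combined_p([0.1, 0.1, 0.1])
print(f"combined p = {combined:.3f}")  # about 0.032: below the 0.05 bar
```

Under these assumptions, three sub-significant studies jointly clear the conventional 5% threshold. For real analyses, `scipy.stats.combine_pvalues` implements this and related pooling methods.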
Take vitamin D, for example. For months there have been studies indicating that it substantially reduces both the incidence and the effects of coronavirus, but they did not meet the gold standard significance test, so they were largely ignored. The non-statistician scientists said that a gold standard study needed to be done before any action was taken (this, interestingly, despite the fact that millions of doses of various vaccines have been commissioned before we have any indication that they will be effective).
Yet vitamin D is unbelievably cheap, particularly if bought in massive bulk by a government under a negotiated contract. It could have been made freely available, at modest cost, to everyone aged 70 and above, or to those with co-morbidities that make them vulnerable to the virus, especially locked-down care home residents, who often have little access to the sunshine that lets the body make vitamin D.
This points to another flaw in relying solely on the gold standard test: the absence of any cost-benefit analysis of whether an approach should be adopted anyway when, despite falling short of strict significance, there have been strong indications of potential benefit. Those early vitamin D studies suggested that at a certain dosage the impact of coronavirus was substantially mitigated, while the potential downside of individuals taking a daily supplement at that level was pretty much non-existent.
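The cost-benefit logic here is just an expected-value calculation. A back-of-envelope sketch in Python; every figure below is hypothetical, chosen purely to illustrate the shape of the argument, not taken from any study or government estimate:

```python
# Even if we are only, say, 50% sure a cheap intervention works,
# acting now can dominate waiting for gold standard certainty.
# All numbers below are hypothetical, for illustration only.

def expected_net_benefit(p_works, lives_saved_if_works, value_per_life, cost):
    """Expected benefit of rolling out a cheap intervention now,
    given an uncertain belief that it works at all."""
    return p_works * lives_saved_if_works * value_per_life - cost

# Hypothetical figures: 50% chance it works, 1,000 lives saved if so,
# a life valued at 1m (arbitrary currency units), rollout cost 10m.
benefit = expected_net_benefit(0.5, 1_000, 1_000_000, 10_000_000)
print(f"expected net benefit: {benefit:,.0f}")
```

On these made-up numbers the expected benefit dwarfs the cost, which is the article’s point: a low-cost, low-risk intervention can be worth rolling out well before statistical certainty arrives.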
The cost would have been low. The potential benefit was extremely high. Up to 50% of various populations are vitamin D deficient. In the US, studies have shown that up to three quarters of African-Americans are deficient – which alone might explain the racial disparities in Covid occurrence and outcomes. But this possible approach, of giving out free vitamin D to vulnerable populations, has been universally ignored — mostly, it seems to me, because of the totemic status attached to Significance, and the failure to look at potential cost-benefit.
And now, months later, a gold standard test of vitamin D on coronavirus has finally arrived. It shows vitamin D has extraordinary benefits, reducing incidence, seriousness and mortality by 50% or more. Those who swiftly adopted a daily vitamin D regimen several months ago may quietly pat themselves on the back; but much more seriously, will there now be a swift concerted rollout of prophylactic vitamin D to vulnerable populations? Is Sage urgently looking at this now?
Undoubtedly, there will eventually need to be a review of whether giving out prophylactic vitamin D was ever discussed within Sage in past months; if not, why not; and if it was, why the decision was not made to go ahead with widespread free distribution. It would seem from this new gold standard study that, had that been done several months ago, not only would many lives have been saved, but many of the other distressing consequences of Covid could have been avoided.
Of course, I am not suggesting the significance test should be ignored. Scientists are right to demand that studies reach significance and can be replicated before their results are relied on; there is a plague of pseudoscience out there, and such caution should stop us doing risky and costly things before we are sure. But it should not stop us doing low-cost things that might work and that carry little downside.
The usual application of the much-abused precautionary principle is “don’t wait for overwhelming evidence before banning something that might be dangerous”, but the very same principle should tell us “don’t wait for overwhelming evidence before trying something affordable that might save lives”. And, as an eminent statistician friend of mine has said to me, on reading what I wrote above: “There are medics and there are statisticians. I think medics were typically brought up on the old approach and journals full of p-valued papers, as you say, missing out on potentially important results.”
How much does Sage suffer from groupthink? Where is the equivalent of Churchill’s Solly Zuckerman, to act as a goad, and push the monolithic health establishment into innovative thinking? Why is someone like Matt Ridley (like Zuckerman a zoologist by training), or a similarly iconoclastic but scientifically trained individual, not included in the discussions? If there ever is a truly all-embracing enquiry into the establishment’s response to the Covid crisis, all these questions will loom large, and the answers will be uncomfortable.
Meantime, it would probably be a good idea to put a few more rough-and-tumble statisticians in the mix, to steer our Sage scientists gently towards a sounder understanding of what the statistics are actually telling them, and of what the indicated real-world steps might be when significance has not yet been achieved but enormous benefit risks being lost in the meantime.