Sampling Bias – a tale of three Covid news stories

If you spend all your time with elephants, you might think that all animals are huge. In any experiment, survey or study, the results we see depend critically on the choice of people or things we consider or measure.

Three recent Covid-19 news stories show the serious (and in one case less serious) impact of sampling bias, potentially creating misleading or invalid results.


  • Story 1 – 99.9% of deaths are unvaccinated – An ONS report in mid-September was widely misinterpreted and led to the mistaken impression that virtually all UK deaths were amongst those who were unvaccinated.  This is not true: whilst vaccination has massively reduced deaths and serious illness, Covid-19 is still a serious illness even for those who are fully jabbed.
  • Story 2 – Lateral flow tests work – They do! False positives are known to be rare (if it says you’ve got it you probably have), but data appears to suggest that false negatives (you get a negative result, but actually have Covid) are much higher.  Researchers at UCL argue that this is due to a form of sampling bias and attempt to work out the true figure … although in the process they slightly overshoot the mark!
  • Story 3 – Leos get their jabs – Analysis of vaccination data in Utah found that those with a Leo star sign were far more likely to be vaccinated than Libras or Scorpios.  While I’d like to believe that Leos are innately more generous of spirit, does your star sign really influence your likelihood of getting a jab?

In the last story we also get a bit of confirmation bias and the file-drawer effect to add to the sampling bias theme!

Let’s look at each story in more detail.

Story 1 – 99.9% of deaths are unvaccinated

I became aware of the first story when a politician on the radio said that 99.9% of deaths in the UK were of unvaccinated people.  This was said, I think, partly to encourage vaccination and partly to justify not requiring tougher prevention measures.

The figure surprised me for two reasons:

  1. I was sure I’d seen figures suggesting that there were still a substantial number of ‘breakthrough infections’ and deaths, even though the vaccinations were on average reducing severity.
  2. As a rule of thumb, whenever you hear anything like “99% of people …” or “99.9% of times …”, then 99% of the time (sic) the person just means “a lot”.

Checking online newspapers when I got home, I found the story that had broken that morning (13th Sept 2021), based on a report by the Office for National Statistics, “Deaths involving COVID-19 by vaccination status, England: deaths occurring between 2 January and 2 July 2021”.  The first summary finding reads:

In England, between 2 January and 2 July 2021, there were 51,281 deaths involving coronavirus (COVID-19); 640 occurred in people who were fully vaccinated, which includes people who had been infected before they were vaccinated.

Now 640 fully vaccinated deaths out of 51,281 is a small proportion, leading to newspaper headlines and reports such as “Fully vaccinated people account for 1.2% of England’s Covid-19 deaths” (Guardian) or “Around 99pc of victims had not had two doses” (Telegraph).

In fact, in this case the 99% figure does reflect the approximate value from the data; the politician had simply added an extra point nine for good measure!

So, ignoring a little hyperbole, at first glance it does appear that nearly all deaths are of unvaccinated people, which then suggests that Covid is pretty much a done deal and those who are fully vaccinated need not worry anymore.  What could be wrong with that?

The clue is in the title of the report: “between 2 January and 2 July 2021”.  The start of this period includes the second wave of Covid in the UK.  Critically, while the first few people who received the Pfizer vaccine around Christmas-time were given a second dose 14 days later, vaccination policy quickly changed to leave several months between first and second vaccine doses. The vast majority of deaths due to Covid during this period happened before mid-February, at which point fewer than half a million people had received second doses.

That is, there were very few deaths amongst the fully vaccinated, in large part because there were very few people doubly vaccinated.  Imagine the equivalent report for January to July 2020: of 50 thousand deaths, none at all would have been amongst the fully vaccinated.

This is a classic example of sampling bias: the sample during the times of peak infection was heavily biased towards the unvaccinated, making it appear that the ongoing risk for the vaccinated was near zero.

The ONS report does make the full data available.  By the end of the period the number who were fully vaccinated had grown to over 20 million. The second wave had long passed and both the Euros and England’s ‘Freedom Day’ had not yet triggered rises in cases. Looking below, we can see the last five weeks of the data, zooming into the relevant parts of the ONS spreadsheet.

Notice that the numbers of deaths amongst the fully vaccinated (27, 29, 29, 48, 63) are between one-and-a-half and twice as high as those amongst the unvaccinated (18, 20, 13, 26, 35).  Note that this is not because the vaccine is not working; by this point the vaccinated population is around twice as large as the unvaccinated (20 million to 10 million). Also, as vaccines were rolled out first to the most vulnerable, these are not comparing similar populations (more sampling bias!).
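To make the comparison concrete, here is a small Python sketch using only the weekly counts quoted above. The population sizes are the rough 20 million and 10 million figures from the text, held constant across the five weeks purely for illustration; the crude per-100,000 rates it prints are not age-adjusted and are not the ONS figures.

```python
# Weekly deaths quoted in the text for the last five weeks of the ONS data.
vaccinated_deaths = [27, 29, 29, 48, 63]
unvaccinated_deaths = [18, 20, 13, 26, 35]

# Rough population sizes from the text (assumed constant over the five weeks).
VACCINATED_POP = 20_000_000
UNVACCINATED_POP = 10_000_000

# Ratio of raw counts: vaccinated deaths per unvaccinated death, week by week.
count_ratios = [v / u for v, u in zip(vaccinated_deaths, unvaccinated_deaths)]

# Crude (not age-adjusted) deaths per 100,000 people in each group.
crude_vaccinated = [v / VACCINATED_POP * 100_000 for v in vaccinated_deaths]
crude_unvaccinated = [u / UNVACCINATED_POP * 100_000 for u in unvaccinated_deaths]

for ratio, cv, cu in zip(count_ratios, crude_vaccinated, crude_unvaccinated):
    print(f"count ratio {ratio:.2f}   crude rate per 100k: "
          f"vaccinated {cv:.3f}, unvaccinated {cu:.3f}")
```

On these rough numbers the crude rates in the two groups come out broadly similar, which is why simply dividing by population size is not enough: the vaccinated group contains far more of the most vulnerable people.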

The ONS do their best to correct for the latter sampling bias and the column (slightly confusingly) labelled “Rate per 100,000 population” uses the different demographics to estimate the death rate if everyone were in that vaccination bracket. That is, in the week ending 2nd July (last line of the table) if everyone were unvaccinated one would expect 1.6 deaths per 100,000 whereas if everyone were vaccinated, we would expect 0.2 deaths per 100,000.

It is this (buried and complex) figure which is actually the real headline – vaccination is making a roughly ten-fold improvement (1.6 against 0.2 is a factor of eight on these rounded figures).  (This is consonant with more recent data suggesting a ten-fold improvement for most groups and a lower, but still substantial four-fold improvement for the over-80s.)  However, most media picked up the easier to express – but totally misleading – total numbers of deaths figures, leading to the misapprehension amongst some that it is “all over”.

To be fair the ONS report includes the caveat:

Vaccinations were being offered according to priority groups set out by the JCVI, therefore the characteristics of the vaccinated and unvaccinated populations are changing over time, which limits the usefulness of comparing counts between the groups.

However, it is somewhat buried and the executive summary does not emphasise the predictably misleading nature of the headline figures.


  • for Covid – Vaccination does make things a lot better, but the rate of death and serious illness is still significant
  • for statistics – Even if you understand or have corrected for sampling bias or other statistical anomalies, think about how your results may be (mis)interpreted by others

Story 2 – Lateral flow tests work

Lateral flow tests are the quick-and-dirty weapon in the anti-Covid armoury.  They can be applied instantly, even at home; in comparison the ‘gold standard’ PCR test can take several days to return.

The ‘accuracy’ of lateral flow tests can be assessed by comparing with PCR tests.  I’ve put ‘accuracy’ in scare quotes as there are multiple formal measures.

A test can fail in two ways:

  • False Positive – the test says you have Covid, but you haven’t.  – These are believed to be quite rare, partly because the tests are tuned not to give false alarms too often, especially when prevalence is low.
  • False Negative – the test says you don’t have Covid, but you really do. – There is a trade-off in all tests: by calibrating the test not to give too many false alarms, this means that inevitably there will be times when you actually have the disease, but test negative on a lateral flow test.  Data comparing lateral flow with PCR suggests that if you have Covid-19, there is still about a 50:50 chance that the test will be negative.

Note that the main purpose of the lateral flow test is to reduce the transmission of the virus in the population.  If it catches only a fraction of cases this is enough to cut the R number. However, if there were too many false positive results this could lead to large numbers of people needlessly self-isolating and potentially putting additional load on the health service as they verify the Covid status of people who are clear.

So the apparent high chance of false negatives doesn’t actually matter so much except insofar as it may give people a false sense of security.  However, researchers at University College London took another look at the data and argue that the lateral flow tests might actually be better than first thought.

In a paper describing their analysis, they note that a person goes through several stages during the illness; critically, you may test positive on a PCR if:

  1. You actively have the illness and are potentially infectious (called D2 in the paper).
  2. You have recently had the illness and still have a remnant of the virus in your system, but are no longer infectious (called D3 in the paper).

The virus remnants detected during the latter of these (D3) would not trigger a lateral flow test, and so people tested with both during this period would appear to be false negatives, but in fact the lateral flow test would accurately predict that they are not infectious. While the PCR test is treated as ‘gold standard’, the crucial issue is whether someone has Covid and is infectious – effectively PCR tests give false positives for a period after the disease has run its course.

The impact of this is that the accuracy of lateral flow tests (in terms of the number of false negatives), may be better than previously estimated, because this second period effectively pollutes the results. There was a systematic sampling bias in the original estimates.

The UCL researchers attempt to correct the bias by using the relative proportion of positive PCR tests in the two stages D2/(D2+D3); they call this ratio π (not sure why).  They use a figure of 0.5 for this (50:50 D2:D3) and use it to estimate that the true positive rate (sensitivity) for lateral flow tests is about 80%, rather than 40%, and correspondingly the false negative rate only about 20%, rather than 60%.  If this is right, then this is very good news: if you are infectious with Covid-19, then there is an 80% chance that lateral flow will detect it.
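The arithmetic behind this correction can be sketched in a few lines of Python. This is a simplified reading of the argument rather than the paper's own formula: it assumes a lateral flow test never comes up positive in stage D3, so the sensitivity measured against PCR is just π times the sensitivity amongst infectious (D2) cases. The function name is invented for this illustration.

```python
def corrected_sensitivity(observed_sensitivity: float, pi: float) -> float:
    """Estimate sensitivity amongst infectious (D2) cases.

    Assumption (mine, for illustration): lateral flow is never positive in D3,
    so  observed = pi * true + (1 - pi) * 0,  giving  true = observed / pi.
    """
    if not 0 < pi <= 1:
        raise ValueError("pi must be in (0, 1]")
    return observed_sensitivity / pi


observed = 0.40   # sensitivity measured against PCR, as quoted in the text
pi = 0.5          # D2 / (D2 + D3), the value used by the UCL researchers

true_sensitivity = corrected_sensitivity(observed, pi)
false_negative_rate = 1 - true_sensitivity
print(f"sensitivity {true_sensitivity:.0%}, false negatives {false_negative_rate:.0%}")
```

The whole correction hinges on π: halve the assumed share of D2 amongst PCR positives and the estimated sensitivity doubles.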

The reporting of the paper is actually pretty good (why am I so surprised?), although the BBC report (and I’m sure others) does seem to confuse the different forms of test accuracy.

However, there is a slight caveat here, as this all depends on the D2:D3 ratio.

The UCL researchers’ use of 0.5 for π is based on published estimates of the period of detectable virus (D2+D3) and infectiousness (D2).  They also correctly note that the effective ratio will depend on whether the disease is growing or decaying in the population (another form of sampling bias similar to the issues in measuring the serial interval for the virus discussed in my ICTAC keynote).  Given that the Liverpool study on which they based their own estimates had been during a time of decay, they note that the results may be even better than they suggest.

However, there is yet another sampling bias at work!  The low sensitivity figures for lateral flow are always on asymptomatic individuals.  The test is known to be more accurate when the patient is already showing symptoms.  This means that lateral flow tests would only ever be applied in stage D3 if the individual had never been symptomatic during the entire infectious period of the virus (D2).  Early on it was believed that a large proportion of people may have been entirely asymptomatic; this was perhaps wishful thinking as it would have made early herd immunity more likely.  However, a systematic review suggested that only between a quarter and a third of cases are never symptomatic, so that the impact of negative lateral flow tests during stage D3 will be a lot smaller than the paper suggests.

In summary there are three kinds of sampling effects at work:

  1. inclusion in prior studies of tests during stage D3 when we would not expect nor need lateral flow tests to give positive results
  2. relative changes in the effective number of people in stages D2 and D3 depending on whether the virus is growing or decaying in the population
  3. asymptomatic testing regimes that make it less likely that stage D3 tests are performed

Earlier work ignored (1) and so may under-estimate lateral flow sensitivity. The UCL work corrects for (1), suggesting a far higher accuracy for lateral flow, and discusses (2), which means it might be even better.  However, it misses (3), so overstates the improvement substantially!
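To get a feel for how much effect (3) might have, here is a rough Python sketch. It is my own back-of-envelope model, not anything from the UCL paper: it assumes that every D2 case can turn up in asymptomatic testing, but that only the never-symptomatic fraction of D3 cases does. Since some D2 cases would also drop out once symptoms appear, this is closer to an upper bound on the effect than an estimate. All function and parameter names are hypothetical.

```python
def effective_pi(pi: float, d3_tested_fraction: float,
                 d2_tested_fraction: float = 1.0) -> float:
    """Share of D2 amongst PCR positives who actually get a lateral flow test."""
    d2 = pi * d2_tested_fraction
    d3 = (1 - pi) * d3_tested_fraction
    return d2 / (d2 + d3)


def corrected_sensitivity(observed_sensitivity: float, pi: float) -> float:
    """Sensitivity amongst D2 cases, assuming lateral flow is never positive in D3."""
    return observed_sensitivity / pi


observed = 0.40   # sensitivity against PCR quoted in the text
pi_paper = 0.5    # the ratio used by the UCL researchers

# Between a quarter and a third of cases are never symptomatic (from the text).
for never_symptomatic in (0.25, 1 / 3):
    pi_eff = effective_pi(pi_paper, never_symptomatic)
    sens = corrected_sensitivity(observed, pi_eff)
    print(f"never symptomatic {never_symptomatic:.0%}: "
          f"effective pi {pi_eff:.2f}, sensitivity {sens:.0%}")
```

Under these crude assumptions the corrected sensitivity lands much closer to the original 40% than to 80%; the truth presumably lies somewhere between the two, depending on how many D2 cases are really tested.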


  • for Covid – Lateral flow tests may be more accurate than first believed, but a negative test result does not mean ‘safe’, just less likely to be infected.
  • for statistics – (i) Be aware of time-based sampling issues when populations or other aspects are changing.  (ii) Even when you spot one potential source of sampling bias, do dig deeper; there may be more.

Story 3 – Leos get their jabs

Health department officials in Salt Lake County, Utah decided to look at their data on vaccination take-up.  An unexpected result was that there appeared to be a substantial difference between citizens with different birth signs. Leos topped the league table with a 70% vaccination rate whilst Scorpios trailed with less than half vaccinated.

Although I’d hate to argue with the obvious implication that Leos are naturally more caring and considerate, maybe the data is not quite so cut and dried.

The first thing I wonder when I see data like this is whether it is simply a random fluke.  By definition the largest element in any data set tends to be a bit extreme, and this is a county, so maybe the numbers involved are quite small.  However, Salt Lake County is the largest county in Utah with around 1.2 million residents according to the US Census; so, even ignoring children or others not eligible, still around 900,000 people.

Looking at the full list of percentages, it looks like the average take-up is between 55% and 60%, with around 75,000 people per star sign (900,000/12).  Using my quick and dirty rule for this kind of data: look at the number of people in the smaller side (30,000 = 40% of 75,000); take its square root (about 170); and as it is near the middle multiply by 1.5 (~250).  This is the sort of variation one might expect to see in the data.  However 250 out of 75,000 people is only about 0.3%, so these variations of +/-10% look far more than a random fluke.
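The quick and dirty rule above can be checked against the textbook binomial standard deviation with a short Python sketch. The inputs are the round numbers from the text (75,000 people per star sign, about 40% on the smaller side); the function names are invented here, and treating each person as an independent coin flip is an assumption.

```python
import math


def rule_of_thumb_variation(smaller_side: float) -> float:
    """Square root of the smaller side, times 1.5 as it is near the middle."""
    return math.sqrt(smaller_side) * 1.5


def binomial_sd(n: int, p: float) -> float:
    """Standard deviation of a binomial count with n trials and probability p."""
    return math.sqrt(n * p * (1 - p))


people_per_sign = 75_000
smaller_share = 0.40                                # the unvaccinated side
smaller_side = people_per_sign * smaller_share      # about 30,000

rule = rule_of_thumb_variation(smaller_side)
sd = binomial_sd(people_per_sign, smaller_share)

print(f"rule of thumb: about {rule:.0f} people ({rule / people_per_sign:.2%})")
print(f"binomial: sd {sd:.0f}, two sd {2 * sd:.0f}")
```

The rule of thumb comes out close to two binomial standard deviations, so it is a reasonable stand-in for the range of plain chance variation, and a swing of ten percentage points is far outside it.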

The Guardian article about this digs a little deeper into the data.

The Utah officials knew the birth dates of those who had been vaccinated, but not the overall date-of-birth data for the county as a whole.  If this were not uniform by star sign, then it could introduce a sampling bias.  To counteract this, they used national US population data to estimate the numbers in each star sign in the county and then divided their own vaccination figure by these estimated figures.

That is, they combined two sets of data:

  • their own data on birth dates and vaccination
  • data provided (according to the Guardian article) by University of Texas-Austin on overall US population birth dates

The Guardian suggests that in attempting to counteract sampling bias in the former, the use of the latter may have introduced a new bias. The Guardian uses two pieces of evidence for this.

  1. First, an article in the journal Public Health Reports that showed that seasonal variation in births varied markedly between states, so that comparing individual states or counties with national data could be flawed.
  2. Second, a blog post by Swint Friday of the College of Business Texas A&M University-Corpus Christi, which includes a table (see below) of overall US star sign prevalence that (in the Guardian’s words) “is a near-exact inverse of the vaccination one”, thus potentially creating the apparent vaccination effect.

Variations in birth rates through the year are often assumed to be in part due to seasonal bedtime activity: hunkering down as the winter draws in vs. short sweaty summer nights; while the Guardian cites a third source, The Daily Viz, to suggest that “Americans like to procreate around the holiday period”. More seriously, the Public Health Reports article also links this to seasonal impact on pre- and post-natal mortality, especially in boys.

Having sorted the data in their own minds, the Guardian reporting shifts to the human interest angle, interviewing the Salt Lake health officials and their reasons for tweeting this in the first place.

But … yes, there is always a but … the Guardian fails to check the various sources in a little more detail.

The Swint Friday blog has figures for Leo at 0.063% of the US population whilst Scorpio tops it at 0.094%, with the rest in between.  Together the figures add up to around 1% … what happened to the other 99% of the population … do they not have a star sign?  Clearly something is wrong.  I’m guessing the figures are proportions not percentages, but it does leave me slightly worried about the reliability of the source.

Furthermore, the Public Health Reports article (below) shows July-Aug (Leo period) slightly higher rather than lower in terms of birth date frequency, as does more recent US data on births.

from PASAMANICK B, DINITZ S, KNOBLOCH H. Geographic and seasonal variations in births. Public Health Rep. 1959 Apr;74(4):285-8. PMID: 13645872; PMCID: PMC1929236

Also, the gap between the largest and smallest figures in the Swint Friday table is about a half of the smaller figure (~1.5:1), whereas in the figure above it is about an eighth and in the recent data less than a tenth.

The observant reader might also notice the date on the graph above, 1955, and that it only refers to white males and females.  It comes from an article published in 1959, focused on infant mortality, and exemplifies the widespread structural racism in the availability of historic health data.  This is itself another form of sampling bias and the reasons for the selection are not described in the paper; perhaps it was just commonly accepted at the time.

Returning to the date, as well as describing state-to-state variation, the paper also surmises that some of this difference may be due to socio-economic factors and that:

The increased access of many persons in our society to the means of reducing the stress associated with semitropical summer climates might make a very real difference in infant and maternal mortality and morbidity.

Indeed, roll on fifty years, and looking at the graph at the Daily Viz, based on more recent US government birth data, the variation is indeed far smaller now than it was in 1955.

from How Common Is Your Birthday? Pt. 2., the Daily Viz, Matt Stiles, May 18, 2012

As noted, the data in Swint Friday’s blog is not consistent with either of these sources, and is clearly intended simply as a light-hearted set of tables of quick facts about the Zodiac. The original data for this comes from Statistics Brain, but this requires a paid account to access, and given the apparent quality of the resulting data, I don’t really want to pay to check! So, the ultimate origins of this table remain a mystery, but it appears to be simply wrong.

Given it is “a near-exact inverse” of the Utah star sign data, I’m inclined to believe that this is the source that Utah health officials used, that is, data from Texas A&M University, not the University of Texas-Austin.  So in the end I agree with the Guardian’s overall assessment, even if their reasoning is somewhat flawed.

How is it that the Guardian did not notice these quite marked discrepancies in the data?  I think the answer is confirmation bias: they found evidence that agreed with their belief (that Zodiac signs can’t affect vaccination status) and therefore did not look any further.

Finally, we only heard about this because it was odd enough for Utah officials to tweet about it.  How many other things did the Utah officials consider that did not end up interesting?  How many of the other 3000 counties in the USA looked at their star sign data and found nothing?  This is a version of the file-drawer effect for scientific papers, where only the results that ‘work’ get published.  With so many counties and so many possible things to look at, even a 10,000 to 1 event would happen sometimes, but if only the 10,000 to 1 event gets reported, it would seem significant and yet be pure chance.
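A few lines of Python show how quickly a rare event becomes likely once enough people are looking. This is a sketch under a strong assumption, namely that every county and every comparison is an independent trial; the "ten things per county" figure is purely hypothetical.

```python
def prob_at_least_one(p_single: float, n_trials: int) -> float:
    """Chance that an event of probability p_single happens at least once
    in n_trials independent attempts."""
    return 1 - (1 - p_single) ** n_trials


p_rare = 1 / 10_000        # the "10,000 to 1" event from the text
counties = 3000            # approximate number of US counties, as in the text

one_look_each = prob_at_least_one(p_rare, counties)
ten_looks_each = prob_at_least_one(p_rare, counties * 10)  # hypothetical

print(f"one comparison per county:  {one_look_each:.0%}")
print(f"ten comparisons per county: {ten_looks_each:.0%}")
```

With a single comparison per county there is already roughly a one-in-four chance that somebody, somewhere, sees the 10,000 to 1 result; with ten comparisons each it is close to certain.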


  • for Covid – Get vaccinated whatever your star sign.
  • for statistics – (i) Take especial care when combining data from different sources to correct sampling bias; you might just create a new bias. (ii) Cross check sources for consistency, and if they are not consistent, ask why not. (iii) Beware confirmation bias: when the data agrees with what you believe, still check it!  (iv) Remember that historical data and its availability may reflect other forms of human bias. (v) The file-drawer effect – are you only seeing the selected apparently unusual data?


How much does herd immunity help?

I was asked in a recent email about the potential contribution of (partial) herd immunity to controlling Covid-19.  This seemed a question that many may be asking, so here is the original question and my reply (expanded slightly).

We know that the virus burns itself out if R remains < 1.

There are 2 processes that reduce R, both operating simultaneously:

1) Containment which limits the spread of the virus.

2) Inoculation due to infection which builds herd immunity.

Why do we never hear of the second process, even though we know that both processes act together? What would your estimate be of the relative contribution of each process to reduction of R at the current state of the pandemic in Wales?

One of the UK government’s early options was (2) developing herd immunity [1].  That is, you let the disease play out until enough people have had it.
For Covid the natural (raw) R number is about 3 without additional voluntary or mandated measures (depends on lots of factors).   However, over time as people build immunity, some of those 3 people who would have been infected already have been.  Once about 2/3 of the community are immune the effective R number drops below 1.  That corresponds to a herd immunity level (in the UK) of about 60-70% of the population having been infected.  Of course, we do not yet know how long this immunity will last, but let’s be optimistic and assume it does.
The reason this policy was (happily) dropped in the UK was the realisation that this would need about 40 million people to catch the virus, with about 4% of these needing intensive care.  That is many, many times the normal ICU capacity, leading to (on the optimistic side) around half a million deaths, but if the health service broke under the strain many times that number!
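The numbers in the last two paragraphs follow from the simplest textbook model, in which the effective R is the raw R multiplied by the fraction of people still susceptible. The Python sketch below uses that model; the UK population of 67 million is my own round figure, not one from the text, and the function names are invented for illustration.

```python
def effective_r(raw_r: float, immune_fraction: float) -> float:
    """Effective R when a fraction of contacts are already immune
    (simple homogeneous-mixing assumption)."""
    return raw_r * (1 - immune_fraction)


def herd_immunity_threshold(raw_r: float) -> float:
    """Immune fraction at which effective R drops to exactly 1."""
    return 1 - 1 / raw_r


RAW_R = 3.0                   # raw R for Covid, as quoted in the text
UK_POPULATION = 67_000_000    # assumption: rough UK population
ICU_FRACTION = 0.04           # about 4% needing intensive care (from the text)

threshold = herd_immunity_threshold(RAW_R)
infections_needed = 0.60 * UK_POPULATION   # lower end of the 60-70% range
icu_cases = infections_needed * ICU_FRACTION

print(f"threshold: {threshold:.0%} immune")
print(f"infections needed: about {infections_needed / 1e6:.0f} million")
print(f"intensive care cases: about {icu_cases / 1e6:.1f} million")
```

Even at the lower end of the range this gives well over a million intensive care cases, which is the scale behind the "many, many times the normal ICU capacity" remark.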
In Spain (with one of the larger per capita outbreaks) they ran an extensive antibody testing study (that is randomly testing a large number of people whether or not they had had any clear symptoms), and found only about 5% of people showed signs of having had the virus overall, with Madrid closer to 10%.  In the UK estimates are of a similar average level (but without as good data), rising to maybe as high as 17% in London.
Nationally these figures (~5%) do make it slightly easier to control, but this is far below the reduction needed for relatively unrestricted living (as possible in New Zealand, which chose a near eradication strategy).  In London the higher level may help a little more (if it proves to offer long-term protection).  It does offer just a little ‘headroom’ for flexibility, but it is still well away from the levels needed for normal day-to-day life without still being very careful (masks, social distancing, limited social gatherings).  In Wales the average level is not far from the UK average, albeit higher in the hardest hit areas, so again well away from anything that would make a substantial difference.
So, as you see it is not that (2) is ignored, but, until we have an artificial vaccine to boost immunity levels, relying on herd immunity is a very high risk or high cost strategy.  Even as part of a mixed strategy, it is a fairly small effect as yet.
In the UK and Wales, to obtain even partial herd immunity we would need an outbreak ten times as large as we saw in the Spring, not a scenario I would like to contemplate 🙁
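To see why these antibody levels make so little difference, the same simple model (effective R equals raw R times the susceptible fraction) can be applied to the percentages quoted above. This is a sketch under that homogeneous-mixing assumption, with a raw R of 3 and a 60% target taken from the earlier discussion; the names are hypothetical.

```python
def effective_r(raw_r: float, immune_fraction: float) -> float:
    """Effective R when a fraction of the population is already immune."""
    return raw_r * (1 - immune_fraction)


RAW_R = 3.0
TARGET_IMMUNE = 0.60   # lower end of the herd immunity range discussed above

# Antibody levels quoted in the text.
levels = {"national average": 0.05, "Madrid": 0.10, "London (upper estimate)": 0.17}

for place, immune in levels.items():
    r_eff = effective_r(RAW_R, immune)
    scale_up = TARGET_IMMUNE / immune
    print(f"{place}: effective R {r_eff:.2f}, "
          f"needs about {scale_up:.0f}x as many infections to reach 60%")
```

At a 5% level the effective R barely moves from 3, and reaching even the lower end of the herd immunity range would take an outbreak of the order of ten times the size of the first one.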
This said there are two caveats that could make things (a little) easier going forward:
1)  The figures above are largely averages, so there could be sub-communities that do get to a higher level.  By definition, the communities that have been hardest hit are those with factors (crowded accommodation, high-risk jobs, etc.) that amplify spread, so it could be that these sub-groups, whilst not getting to full herd-immunity levels, do see spread rates closer to the population average in future, hence contributing to a lower average spread rate across society as a whole.  We would still be a long way from herd immunity, but slower spread makes test, track and trace easier, reduces local demand on health service, etc.
2)  The (relatively) low rates of spread in Africa have led to speculation (still very tentative) that there may be some levels of natural immunity from those exposed to high levels of similar viruses in the past.  However, this is still very speculative and does not seem to accord with experience from other areas of the world (e.g. Brazilian favelas), so it looks as though this is at most part of a more complex picture.
I wouldn’t hold my breath for (1) or (2), but it may be that as things develop we do see different strategies in different parts of the world depending on local conditions of housing, climate, social relationships, etc.


Having written the above, I’ve just heard about the following, which came out at the end of last week in the BMJ and suggests that there could be a significant number of mild cases that are not detected by standard blood tests as having been infected.
Stephen Burgess, Mark J Ponsford, Dipender Gill, “Are we underestimating seroprevalence of SARS-CoV-2?”
  1. I should say the UK government now say that herd immunity was never part of their planning, but for a while they kept using the term! Here’s a BBC article about the way herd immunity influenced early UK decisions, a Guardian report that summarises some of the government documents that reveal this strategy, and a Politico article that reports on the Chief Scientific Adviser Patrick Vallance’s statement that he never really meant this was part of government planning.  His actual words on 12th March were “Our aim is not to stop everyone getting it, you can’t do that. And it’s not desirable, because you want to get some immunity in the population. We need to have immunity to protect ourselves from this in the future.”  Feel free to decide for yourself what ‘desirable’ might have meant.

lies, damned lies and obesity

Facts are facts, but the facts you choose to tell change the story, and, in the case of perceptions of the ‘ideal body’, can fuel physical and mental health problems, with consequent costs to society and damage to individual lives.

Today’s i newspaper includes an article entitled “Overweight and obese men ‘have higher risk of premature death’”.  An online version of the same article, “Obese men three times more likely to die early”, appeared yesterday on the iNews website.  A similar article, “Obesity is three times as deadly for men than women”, reporting the same Lancet article appeared in yesterday’s Telegraph.

The text describes how moderately obese men die up to three years earlier than those of ‘normal’ weight [1]; clearly a serious issue in the UK given growing levels of child obesity and the fact that the UK has the highest levels of obesity in Europe.  The i quotes professors from Oxford and the British Heart Foundation, and the Telegraph report says that the Lancet article’s authors suggest their results refute other recent research which found that being slightly heavier than ‘normal’ could be protective and extend lifespan.

The things in the reports are all true. However, to quote the Witness Oath of British courts, it is not sufficient to tell “the truth”, but also “the whole truth”.

The Telegraph article also helpfully includes a summary of the actual data on which the reports are based.


As the articles say, this does indeed show substantial risk for both men and women who are mildly obese (BMI>30) and extreme risk for those more severely obese (BMI>35). However, look to the left of the table and the column for those underweight (BMI<18.5).  The risks of being underweight exceed those of being mildly overweight, by a small amount for men and a substantial amount for women.

While obesity is a major issue, so is the obsession with dieting and the ‘ideal figure’, often driven by dangerously skinny fashion models.  The resulting problems of unrealistic and unhealthy body image, especially for the young, have knock-on impacts on self-confidence and mental health. This may then lead to weight problems, paradoxically including obesity.

The original Lancet academic article is low key and balanced, but, if reported accurately, the comments of at least one of the (large number of) article co-authors less so.  However, the eventual news reports, from ‘serious’ papers at both ends of the political spectrum, while making good headlines, are not just misleading but potentially damaging to people’s lives.


  1. I’ve put ‘normal’ in scare quotes, as this is the term used in many medical charts and language, but means something closer to ‘medically recommended’, and is far from ‘normal’ in society today.

Statistics and individuals

Ramesh Ramloll recently posted on Facebook about two apparently contradictory news reports on vitamin D, one entitled “Recommendation for vitamin D intake was miscalculated, is far too low, experts say” and the other  “High levels of vitamin D is suspected of increasing mortality rates“.

While specifically about diet and vitamin D intake, there seem to be a number of lessons from this: about communication of science (Ramesh’s original reason for posting this), widespread statistical ignorance amongst scientists (amongst others), and the fact that individuals are not averages.

Ramesh remarked:

Science reporting is broken, or science itself is broken … the masses are like deer in headlights when contradictory recommendations through titles like these appear in the mass media, one week or so apart.

I know that rickets is currently on the increase in the UK, due partly to poverty and poor diets leading to low dietary vitamin D intake, and due partly to fear of harmful UV and skin cancer leading to under-exposure of the skin to sunlight, our natural means of vitamin D production.  So these issues are very important, and as Ramesh points out, clarity in reporting is crucial.

Looking at the two articles, the ‘too low’ article came from North America; the ‘too much’ article, although reported in AAAS ‘EurekAlert!’ news, originated at the University of Copenhagen, so I thought that maybe the difference is that health conscious Danes are simply overdosing.

However, even as a scientist, making sense of the reports is complicated by the fact that they talk in different units.  The ‘too low’ one is about dietary intake of vitamin D measured in ‘IU/day’, and the Danish ‘too much’ report discusses blood levels in ‘nanomol per litre’.  Wow that makes things easy!

Furthermore, the Danish study (based on 247,574 Danes, real public health ‘big data’) showed that the difference between ‘too much’ and ‘too little’ was a factor of two, 50 vs 100 nanomol/litre.  It suggests, Goldilocks fashion, that 70 nanomol/litre is ‘just right’.  Note, however, that the ‘EurekAlert!’ news article does NOT quantify the relative risks of over and under dosing, which does make a big difference to the way they should be read as practical advice, and does not give a link to the source article to find out (this is the AAAS!).

Digging a little deeper into the “too low” news report, it is based on an academic article in the journal ‘Nutrients’, “A Statistical Error in the Estimation of the Recommended Dietary Allowance for Vitamin D”, which is re-assessing the amount of dietary vitamin D to achieve the same 50 nanomol/litre level used as the ‘low’ level by the Danish researchers.  The Nutrients article is based not on a new study, but a re-examination of the original meta-study that gave rise to the (US and Canadian) Institute of Medicine’s current recommendations.   The new article points out that the original analysis confused study averages and individual levels, a pretty basic statistical mistake.

Graphs from “A Statistical Error in the Estimation of the Recommended Dietary Allowance for Vitamin D”. LHS is study averages, RHS taking into account variation within studies.
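The difference between study averages and individual levels is easy to see in a toy simulation. The Python sketch below uses entirely made-up numbers (200 studies of 50 people, a mean blood level of 70 and the two standard deviations) and is not a reconstruction of the Nutrients analysis; it simply compares the level that 97.5% of study averages exceed with the level that 97.5% of individuals exceed.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# All values below are hypothetical, chosen only for illustration.
N_STUDIES = 200
N_PER_STUDY = 50
MEAN_LEVEL = 70.0      # overall mean blood level (nanomol/litre)
BETWEEN_SD = 5.0       # how much true study means differ from each other
WITHIN_SD = 20.0       # how much individuals differ within a study

study_means = []
individuals = []
for _ in range(N_STUDIES):
    true_mean = random.gauss(MEAN_LEVEL, BETWEEN_SD)
    sample = [random.gauss(true_mean, WITHIN_SD) for _ in range(N_PER_STUDY)]
    study_means.append(statistics.mean(sample))
    individuals.extend(sample)


def lower_2_5_percentile(values):
    """Value that roughly 97.5% of the data lies above."""
    ordered = sorted(values)
    return ordered[int(0.025 * len(ordered))]


low_means = lower_2_5_percentile(study_means)
low_individuals = lower_2_5_percentile(individuals)
share_below = sum(x < low_means for x in individuals) / len(individuals)

print(f"97.5% of study averages are above {low_means:.0f}")
print(f"97.5% of individuals are above    {low_individuals:.0f}")
print(f"share of individuals below the study-average limit: {share_below:.0%}")
```

A limit that nearly all study averages clear still leaves a large share of individuals below it, which is the confusion the Nutrients article describes.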

A few things I took from this:

1)  The level of statistical ignorance amongst those making major decisions (in this case at the Institute of Medicine) is frightening. This is part of a wider issue of innumeracy, which I’ve seen in business/economic news reporting on the BBC, reporting of opinion polls in the Times, academic publishing and reviewing in HCI, and the list goes on.  This is an issue that has worried me for some time (see “Cult of Ignorance”, “Basic Numeracy”).

2) Just how spread out the data is for the studies. I guess this is because individual differences and random environmental factors are so great.  This really brings home the importance of replication, which is so hard to get funded or published in many areas of academia, not least in HCI where individual differences and variations within studies are also very high.  But it also emphasises the importance of making sure data is published in such a way that meta-analysis to compare and combine individual studies is possible.

3) Individual differences are large.  Based on the revised suggested limits for dietary vitamin D, designed to bring at least 39/40 people over the recommended blood lower limit of 50 nanomol/litre, half of people would end up with blood levels higher than four or five times that lower limit, that is, more than twice as high as the level the other study says is deleteriously high.  This really brings home that diet and metabolism vary such a lot between people and we need to start to understand individual variations for health advice, not simply averages.  This is difficult, as illustrated by the spread of studies in the ‘too low’ article, but may become possible as more mass data, as used by the Danish study, becomes available.

In short:

individuals matter in statistics


statistics matter for individuals