Universities and Covid – how bad was it and what next?

A record number of students have been heading to universities over the last few weeks.  They will still face Covid restrictions; happily, however, the situation will be nothing like last year.

Last year I had my own concerns early on, and in retrospect it is easier to assess just how bad things were.  Combining SAGE’s Sept 2020 estimates of the impact with actual Covid mortality would suggest that during 2020-2021 there was an additional death for every 50-100 university students educated. There are arguments to reduce this figure somewhat; however, it is still clear that society at large paid heavily to enable education to continue.

Happily, this year vaccination has vastly reduced mortality, albeit set against very high case numbers. Although things will be more ‘normal’ this year, as a sector, we are still clearly deeply indebted to the rest of society and need to do all we can to minimise further impact.

The data – how bad was it?

Early in the summer of 2020 I estimated that the potential impact of autumn University return would be to at least double the number of Covid cases unless major action was taken to mitigate the risks.  Based on figures for the first wave and projections for 2020-2021 winter, I put the figure at around 50,000 deaths.

At the time this was derided as heavily pessimistic, but of course within months SAGE modelling estimates came out with far higher figures.  SAGE’s “Summary of the effectiveness and harms of different non-pharmaceutical interventions, 21 September 2020” estimated that without substantial mitigation, university return in 2020 would lead to an increase in R of between 0.2 and 0.5, which corresponds to not just double, but between eight and sixty times as many cases over the first term.

This was all based on modelling, but the impact was evident in actual case data as universities returned. This was particularly clear in Scotland, where universities returned in mid-September: there was an almost instant doubling of infections in the university age group, which then fed into other cohorts over the succeeding weeks.

As well as more local measures, Universities Scotland issued guidance for the weekend of 25–27 Sept 2020 asking students to avoid socialising outside their households and to stay away from bars and other such venues.

In the rest of the UK the data was a little less clear as university return dates are more staggered, but there was a clear step change at the beginning of October 2020.

In Newcastle the local newspaper analysed national data and found that areas with high student density had Covid rates five times higher than areas with few students.  More anecdotally, we will all remember the images of students’ messages on their windows as halls went into effective lock-in, and the (rapidly removed) fencing around Manchester halls of residence.

This initial surge was due to the combination of simply lots of people coming together and establishing new contact networks, a known Covid risk, and the more obvious effect of start-of-term parties and ‘freshers week’ high spirits.

It is far harder to assess more long-term impacts during the year, as this simply added to the general societal growth.  Modelling can be used to attempt to disentangle these effects, but it is difficult to definitively separate effects of coupled dynamic systems  except during periods of sudden change.  There were noticeable end-of-year spikes in student areas of Leeds reported in June, but that, like the year start, was more about end of term parties, not the general effect of increased contact networks.

Mitigations – it could have been worse

SAGE’s figures, like my own, were for university return without mitigations, and they suggested potential actions to reduce the impact, some of which were heeded.

Every university made very strong efforts to reduce spread within teaching environments, whilst still offering levels of in-person activities, but it was, and still is, the social side of student life that was expected to be most problematic.

Anticipating the mixing during Freshers Week, my own university, and I know many others, created outdoor bars and activities in order to create spaces that were safer and less likely to lead to cross-infection.  This was effective in that the majority of traced ‘superspreader’-style outbreaks seemed to be related to off-campus parties or events.

Students also took matters into their own hands.  For every highly publicised case of wild parties and ignoring of Covid rules, I heard other, less highly publicised, accounts of students effectively permanently isolating themselves in their rooms.  I also know of universities where courses that started off in hybrid mode with a mix of in-person and remote activities ended up abandoning the in-person elements as students effectively voted with their feet. I think this was principally the case for universities with a large number of local students, but some students also simply returned home and completed their studies remotely.

But students are young, so not at risk

One of the difficulties when thinking about both universities and schools is that Covid is not particularly dangerous for those in their teens and twenties.  This is not to say there is no risk for pupils and students, especially for anyone with other health problems.  There is of course more risk for academics and teachers, and even more for other staff such as cleaners, security and catering, who typically have older demographics than teachers and academics; but still, the risk for working-age adults was always smaller.

The biggest problem was, and still is, the spread into the community as a whole.  The Scottish data for last autumn showed this indeed did happen within weeks.  This is partly due to out-of-house contacts such as buses and shops, and partly due to home visits (for away-from-home students) and local students living at home.

These contacts then seed others, and these indirect contacts – contacts of contacts, etc. – far exceed the number of initial cases; furthermore, they end up spread over all demographics of society, including the most vulnerable.  When the disease is near static (R ~ 0.9–1.1) this leads to around 10 additional cases for each initial case over a 2-3 month window, higher during times of higher growth.  While universities actively published the number of actual student and staff cases, these were the relatively safe tip of a far more deadly iceberg.
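As a back-of-envelope check, this arithmetic can be sketched in a few lines of Python. The roughly 6-day generation interval, and hence the 13 generations in a 2-3 month window, are my illustrative assumptions, not figures from the text:

```python
# One initial case seeds R cases, those seed R^2 more, and so on.
# Total knock-on cases over n generations is R + R^2 + ... + R^n.
# With a generation interval of roughly 6 days, a 2-3 month window
# covers about 13 generations.
def knock_on_cases(R, generations=13):
    return sum(R ** k for k in range(1, generations + 1))

for R in (0.9, 1.0, 1.1):
    print(f"R = {R}: ~{knock_on_cases(R):.0f} additional cases per initial case")
```

For near-static R this gives roughly 7–13 knock-on cases per initial case, in line with the ‘around 10’ figure, and noticeably more once R creeps above 1.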

Last year, before the vaccine and new variants, these knock-on infections meant that each preventable infection would have a one in ten chance of causing an eventual death (see “More than R – how we underestimate the impact of Covid-19 infection” for the details of this figure).  At our current mid-vaccine stage, but with delta, the figure is about one in fifty – still far higher than any of the common risks we impose upon one another such as car driving, second-hand smoking or general pollution.

What about variants?

While the data suggests that at least half of the cases during the autumn of 2020 were due to university return, the original Covid variant was overtaken first by the alpha variant and then by the delta variant.  There is thus an argument that only the deaths due to the original variant be counted, that is perhaps 10,000 deaths rather than 40–50,000.

For the delta variant this is undoubtedly the case; it quickly overtook the original variant, and so the number of cases before delta emerged is largely irrelevant to those that came after.  However, delta only emerged in the UK as the second wave decayed and after the majority of deaths, so it makes little difference to the overall tally.

Alpha is more complex.  Nearly all second wave deaths were due to alpha, and these constitute the larger part of winter 2020–2021 Covid deaths.

It is almost certain that alpha developed in the UK.  It could be that it developed in a person who would have been infected anyway irrespective of the universities.  If so then only around a half of pre-Christmas deaths should be attributed to the universities. However, if it developed as a mutation in someone who would not have been otherwise infected, not only all of the alpha variant UK deaths, but also all alpha variant deaths worldwide would land at our doorstep.

There is no way of knowing, but the odds as to which of these is the case run exactly with the proportion of cases due to the universities, so the best estimate is still to count that proportion of UK deaths and in principle a proportion of worldwide alpha-variant deaths also, but I don’t have the heart to calculate that figure, only knowing it is a lot, lot higher.

Why not blame schools?

Arguably, it is unfair to pin the increase entirely on the universities.

According to the SAGE estimates in Sept 2020, the two largest potential drivers of Covid were schools and universities.  Each was expected to lead to an increase in R of 0.2 to 0.5. That is, if universities had returned but schools not reopened, the universities would still have doubled the number of cases, but this would have been doubling a smaller number.  Given that schools and universities have similar figures, maybe it would be fairer to divide the combined impact between them, leading to maybe 3/8 of the cases being assigned to each rather than half the cases to the universities.

This is a tenable argument, and indeed it is always hard to apportion blame or cost when faced with multiple causes that lead to non-linear effects.

Personally, I discount this.  First, because it doesn’t make much difference: 3/8 of a big number is still very large.  Second, there were far stronger arguments for reopening schools: (i) being more local to start with, it was easier to mitigate their impact; (ii) because school children are younger, it is harder for them to cope with remote learning; and (iii) reopening schools freed up parents from childcare, allowing other sectors of the economy to recover.  However, if you disagree, knock a quarter off all of the figures for the impact of universities.

Maybe not so bad – lockdowns and government policy

Finally, while the bald figure of one death for every 50 to 100 students educated is frighteningly large, there is, I think, a good argument to reduce this substantially, albeit opening up the issue of wider non-mortality costs for society.

Last autumn Covid cases were increasing rapidly and the UK government was set against any further control measures.  Eventually it was forced to instigate a November lockdown across England after the earlier Wales ‘firebreak’.  The trigger for this was not the cases per se, but the danger of overwhelming the NHS ability to cope.

Those on the front-line of the NHS would debate how close we got to breakdown, and indeed whether in many ways we went beyond it.  However, crucially the driver of policy has been not Covid cases as such, nor even Covid deaths, but the number of hospital and especially intensive care admissions.

If Covid cases had been only half as high, there might not have been a pre-alpha lockdown at all before Christmas, or if there had been it would have been later as would the January lockdown.

By this argument, which I believe is a sound one, the impact of last year’s universities reopening was to accelerate growth, leading to earlier and longer lockdowns.  The increase in university-attributable deaths would by this argument still not be negligible, but lower, maybe less than 10,000 (about one for every 250 students educated).  However, this is then offset against the additional strain put on the rest of society, not least on the jobs of the other 50% of 18-21 year olds who don’t go to university.

In summary

First of all, it should be noted that there will be a further hit as universities return now, and a recent Times Higher survey reported that more than half of lecturers had serious concerns about the new term. However, the corresponding figures for this year will be an order of magnitude lower.  This does not mean we should not take every precaution possible: Covid deaths are still at levels that would be inconceivable if we hadn’t seen them so much higher previously.  At the time of writing, there are as many deaths due to Covid in two weeks as a whole year’s worth of road deaths.

As is probably evident, certainly from previous writing about the issue, I believe the decision to reopen the HE sector in Autumn 2020 was fundamentally wrong.  As I have previously argued, the universities’ hands were largely tied, as, to a lesser extent, were those of the devolved governments, by decisions taken at Westminster.  I assume that these decisions were partly party political (not wanting to alienate half of first-time voters) and partly financial (reducing the need to prop up an HE sector groaning under the increased costs of dealing with remote teaching).

The result of this was a worst of all possible worlds: bad for students, who often ended up paying for semi-useless accommodation and being taught remotely during lockdowns anyway; bad for lecturers, trying to cope with mixed models of teaching and the uncertainty of constantly switching models; and bad for society, deepening both the health and economic crisis.

Possibly saying that the universities’ hands were tied by government and that in turn as an employee of the university I was just continuing to do my job is a version of the concentration-camp guard excuse.  Personally I feel the weight of this: I knew what was unfolding, I had written about it, but could I have done more to raise the issue?

Looking forward we can still make a difference.

I’m part of the Not-Equal research network, focused on issues of social justice in the digital economy.  We are coming to the end of our funded period and had originally hoped to have an in-person end-of-project event, bringing together the many academics and third-sector stakeholders who have been part of the network to share experiences and maybe create new partnerships looking forward.  During the summer, after consulting with our advisory board, we unanimously decided to have a purely virtual event instead.  Meeting together would clearly have had great advantages, but it felt as though holding such an event, however worthy, would be irresponsible.

Each such decision only makes a small difference, but it is the tens of thousands of such small acts that make a big difference.  This has been one of the hard to comprehend lessons of Covid, but one that will continue to be important as we shift our focus back towards other massive issues of poverty, social injustice, climate change and the myriad diseases other than Covid that plague so many in the world.

Busy September – talks, tutorials and an ultra-marathon

September has been a full month!

During the last two weeks things have started to kick back into action, with the normal rounds of meetings and induction week for new students.  For the latter I’d pre-recorded a video welcome, so my involvement during the week was negligible.  However, in addition I delivered a “Statistics for HCI” day course organised by the BCS Interaction Group with PhD students from across the globe and also a talk “Designing User Interactions with AI: Servant, Master or Symbiosis” at the AI Summit London.  I was also very pleased to be part of the “60 faces of IFIP” campaign by the International Federation for Information Processing.

It was the first two weeks that stood out though, as I was back on Tiree for two whole weeks.  Not 100% holiday as during the stay I gave two virtual keynotes: “Qualitative–Quantitative Reasoning: thinking informally about formal things” at the International Colloquium on Theoretical Aspects of Computing (ICTAC) in Kazakhstan and “Acting out of the Box” at the University of Wales Trinity St David (UWTSD) Postgraduate Summer School.  I also gave a couple of lectures on “Modelling interactions: digital and physical” at the ICTAC School which ran just before the conference and presented a paper on “Interface Engineering for UX Professionals” in the Workshop on HCI Engineering Education (HCI-E2) at INTERACT 2021 in Bari.  Amazing how easy it is to tour the world from a little glamping pod on a remote Scottish Island.

Of course the high point was not the talks and meetings, but the annual Tiree Ultra-marathon.  I’d missed last year, so especially wonderful to be back: thirty five miles of coastline, fourteen beaches, not to mention so many friendly faces, old friends and new.  Odd of course with Covid zero-contact and social distancing – the usual excited press of bodies at the pre-race briefing in An Talla, the Tiree community hall, replaced with a video webinar and all a little more widely spaced for the start on the beach too.

The course was slightly different too, anti-clockwise and starting half way along Gott Bay, the longest beach.  Gott Bay is usually towards the end of the race, about 28 miles in, so the long run, often into the wind, is one of the challenges of the race.  I recall in 2017 running the beach with a 40-mile-an-hour headwind and stinging rain – I knew I’d be faster walking, but was determined to run every yard of beach.  Another runner came up behind me and walked in my shelter.  However, this year had its own sting in the tail with Ben Hynish, the highest point, at 26 miles in.

The first person was across the line in about four-and-a-quarter hours, the fastest time yet.  I was about five hours later!

This was my fifth time doing the ultra, but the hardest yet, maybe in part due to lockdown couch-potato-ness!  My normal training pattern is that about a month before the ultra I think, “yikes, I’ve not run for a year” and then rapidly build up the miles – not the recommended training regime!  This year I knew I wasn’t as fit as usual, so I did start in May … but then got a knee injury, then had to self-isolate … and then it was into the second-half of July; so about a month again.

Next year it will be different, I will keep running through the winter … hmm … well, time will tell!

The different September things all sound very disparate – and they are, but there are some threads and connections.

The first thread is largely motivational.

The UWTSD keynote was about the way we are not defined by the “kind of people” we think of ourselves as being, but by the things we do.  The talk used my walk around Wales in 2013 as the central example, but the ultra would have been just as pertinent.  Someone with my waistline is not who one would naturally think of as an ultramarathon runner – not that kind of person – but I did it.

However, I was not alone.  The ‘winners’ of the ultra are typically the rangy build one would expect of a long-distance runner, but beyond the front runners, there is something about the long distance that attracts a vast range of people of all ages, and all body shapes imaginable.  For many there are physical or mental health stories: relationship breakdowns, illnesses, that led them to running and through it they have found ways to believe in themselves again.  Post Covid this was even more marked: Will, who organises the ultra, said that many people burst into tears as they crossed the finish line, something he’d never seen before.

The other thread is about the mental tools we need to be a 21st century citizen.

The ICTAC keynote was about “Qualitative–Quantitative Reasoning”, which is my term for the largely informal understanding of numbers that is so important for both day-to-day and professional life, but is not part of formal education.  The big issues of our lives from Covid to Brexit to climate change need us to make sense of large-scale numerical or data-rich phenomena.  These often seem too complex to make sense of, yet are ones where we need to make appropriate choices in both our individual lives and political voices.  It is essential that we find ways to aid understanding in the public, press and politicians – including both educational resources and support tools.

The statistics course and my “Statistics for HCI” book are about precisely this issue – offering ways to make sense of often complex results of statistical analysis and obtain some of the ‘gut’ understanding that professional statisticians develop over many years.

My 60 faces of IFIP statement also follows this broad thread:

“Digital technology is now essential to being a citizen. The future of information processing is the future of everyone; so it needs to be understood and shaped by all. Often ICT simply reinforces existing patterns, but technology is only useful if we can use it to radically reimagine a better world.”


More information on different events

Tiree Ultra

Tiree Ultramarathon web page and Facebook Group

Paper: Interface Engineering for UX Professionals

HCI-E2: Workshop on HCI Engineering Education – for developers, designers and more, INTERACT 2021, Bari, Italy – August 31st, 2021. See more – paper and links

Summer School Lectures: Modelling interactions: digital and physical

Lecture at ICTAC School 2021: 18th International Colloquium on Theoretical Aspects of Computing, Nazarbayev University, Nur-Sultan, Kazakhstan, 1st September 2021. See more – abstract and links

Talk: Designing User Interactions with AI: Servant, Master or Symbiosis

The AI Summit London, 22nd Sept. 2021. See more – abstract and links

Day Course: Statistics for HCI

BCS Interaction Group One Day Course for PhD Students, 21st Sept. 2021.
See my Statistics for HCI Micro-site.

Keynote: Acting out of the Box

Rhaglen Ysgol Haf 2021 PCYDDS / UWTSD Postgraduate Summer School 2021, 10th Sept. 2021. See more – abstract and links

Keynote: Qualitative–Quantitative Reasoning: thinking informally about formal things

18th International Colloquium on Theoretical Aspects of Computing, Nazarbayev University, Nur-Sultan, Kazakhstan, 10th Sept. 2021. See more – full paper and links

Induction week greeting

 

The big stories buried beneath the headlines

In news stories this morning about pet abduction and sustainable fashion, the most critical parts are buried deep in the article: a chance remark that gives away the bigger story.

During the lockdown there has been a steep rise in the cost of dogs and other pets, and this has led to an increase in pet abductions. The most high-profile example was when Lady Gaga’s dog walker was shot during the theft of her bulldogs in Los Angeles, but the BBC reports that there were over 2,000 pet thefts in the UK last year alone.

Stock image of a person stealing a dog

Pet abduction to be made new criminal offence in thefts crackdown – BBC News

In principle pet theft is a crime covered by the UK Theft Act, but the use of this evidently does not reflect the emotional harms of pet abductions, hence the need for the new law. Reading further the article says:

Although offences under the Theft Act 1968 carry a maximum term of seven years, ministers say there is little evidence of that being used, because the severity of the sentence is partly determined by the monetary value of the item taken.

It was this that caught my eye.  The most severe penalties under the Theft Act are for the most valuable items.  If the second-hand car of a pensioner near the poverty line is stolen, it will attract a less severe sentence than the trophy Porsche from the millionaire’s collection.  This sounds like a law made in the 17th century, but is in fact from 1968 and applies today.

The lesson is clear, if you are poor then even the criminal law does nothing for you.

The second story is about Molly-Mae, ex-Love Island contestant and social media influencer, who has just been recruited as creative director of Pretty Little Things with a particular focus on sustainable fashion.

 

Molly Mae

Molly-Mae: “I’m not just an influencer anymore”

Reading further there is a section entitled “Wearing the same dress twice”, that has the following quote from Molly-Mae:

“I even captioned one of my Instagram pictures the other day saying ‘PSA it’s ok to wear the same dress twice’ – it’s a bad habit us girls have got into, like if you put it on Instagram it means you can’t wear it again.”

Although I did know some of the figures for this before, it still shocked me to hear that “wearing the same dress twice” is regarded as a significant message.

Sadly, this does reflect the previous figures I’ve seen suggesting that the median number of times a garment is worn is indeed one, with something like 20% of clothes never worn at all once bought.  This all has to be added to around 1/3 of fashion clothing that is shredded or otherwise disposed of without ever being sold, due to end of season, returns, or other reasons.

The fashion industry is estimated to contribute 10% of all global carbon emissions, not to mention plastic micro-fibres, chemical, water and other environmental impacts, as well as being built upon near slave-labour conditions across the world.

Given this, even wearing clothes twice could be a major benefit.

However, just imagine how the statement sounds to someone who lived through the second world war, or even anyone over 50.  This is reflected in figures for environmental action by age group: awareness is greatest in the younger age groups, but in nearly all areas life-style action is greatest in the older ones.  Perhaps influencers such as Molly-Mae can help turn this round.

So as you read the news, do look beyond the headlines, the most hard-hitting parts are often buried deep.

dog digging

Image: jimbomack66, CC BY 2.0, via Wikimedia Commons

Darwinian markets and sub-optimal AI

Do free markets generate the best AI?  Not always, and this not only hits the bottom line, but comes with costs for personal privacy and the environment.  The combination of network effects and winner-takes-all advertising expenditure means that the resulting algorithms may be worst for everyone.

A few weeks ago I was talking with Maria Ferrario (Queens University Belfast) and Emily Winter (Lancaster University) regarding privacy and personal data.  Social media sites and other platforms are using ever more sophisticated algorithms to micro-target advertising.  However, Maria had recently read a paper suggesting that this had gone beyond the point of diminishing returns: far simpler – and less personally intrusive – algorithms achieve nearly as good performance as the most complex AI.  As well as having an impact on privacy, this will also be contributing to the ever-growing carbon impact of digital technology.

At first this seemed counter-intuitive.  While privacy and the environment may not be central concerns, surely companies will not invest more resources in algorithms than is necessary to maximise profit?

However, I then remembered the peacock tail.


Jatin Sindhu, CC BY-SA 4.0, via Wikimedia Commons
The peacock tail is a stunning wonder of nature.  However, in strict survival terms, it appears to be both flagrantly wasteful and positively dangerous – like eye-catching supermarket packaging for the passing predator.

The simple story of Darwinian selection suggests that this should never happen.  The peacocks that have smaller and less noticeable tails should have a better chance of survival, passing their genes to the next generation, leading over time to more manageable and less bright plumage.  In computational terms, evolution acts as a slow, but effective optimisation algorithm, moving a population ever closer to a perfect fit with its environment.

However, this simple story has twists, notably runaway sexual selection.  The story goes like this.  Many male birds develop brighter plumage during mating season so that they are more noticeable to females.  This carries a risk of being caught by a predator, but there is a balance between the risks of being eaten and the benefits of copulation.  Stronger, faster males are better able to fight off or escape a predator, and hence can afford to have slightly more gaudy plumage.  Thus, for the canny female, brighter plumage is a proxy indicator of a more fit potential mate.  As this becomes more firmly embedded into the female selection process, there is an arms race between males – those with less bright plumage will lose out to those with brighter plumage and hence die out.  The baseline constantly increases.

Similar things can happen in free markets, which are often likened to Darwinian competition.

Large platforms such as Facebook or Google make the majority of their income through advertising.  Companies with large advertising accounts are constantly seeking the best return on their marketing budgets and will place ads on the platform that offers the greatest impact (often measured by click-through) for the least expenditure.  Rather like mating, this is a winner-takes-all choice.  If Facebook’s advertising is 1% more effective than Google’s, a canny advertiser will place all their adverts with Facebook, and vice versa.  Just like the peacock, there is an existential need to outdo each other and thus almost no limit on the resources that should be squandered to gain that elusive edge.

In practice there are modifying factors; the differing demographics of platforms mean that one or other may be better for particular companies and also, perhaps most critically, the platform can adjust its pricing to reflect the effectiveness so that click-through-per-dollar is similar.

The latter is the way the hidden hand of the free market is supposed to operate to deliver ‘optimal’ productivity.  If spending 10% more on a process can improve productivity by 11% you will make the investment.  However, the theory of free markets (to the extent that it ever works) relies on an ‘ideal’ situation with perfect knowledge, free competition and low barriers to entry.  Many countries operate collusion and monopoly laws in pursuit of this ‘ideal’ market.

Digital technology does not work like this. 

For many application areas, network effects mean that emergent monopolies are almost inevitable.  This was first noticed for software such as Microsoft Office – if all my collaborators use Office then it is easier to share documents with them if I use Office also.  However, it becomes even more extreme with social networks – although there are different niches, it is hard to have multiple Facebooks, or at least to create a new one – the value of the platform is because all one’s friends use it.

For the large platforms this means that a substantial part of their expenditure is based on maintaining and growing this service (social network, search engine, etc.).  While the income is obtained from advertising, only a small proportion of the costs are developing and running the algorithms that micro-target adverts.

Let’s assume that the ratio of platform to advertising algorithm costs is 10:1 (I suspect it is a lot greater).  Now imagine platform P develops an algorithm that uses 50% more computational power, but improves advertising targeting effectiveness by 10%; at first this sounds a poor balance, but remember that 10:1 ratio.

The platform can charge 10% more whilst being competitive.   However, the 50% increase in advertising algorithm costs is just 5% of the overall company running costs, as 90% are effectively fixed costs of maintaining the platform.  A 5% increase in costs has led to a 10% increase in corporate income.  Indeed one could afford to double the computational costs for that 10% increase in performance and still maintain profitability.
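This arithmetic can be laid out explicitly in a few lines of Python. The 10:1 cost split and the 50%/10% figures are the illustrative assumptions above, not real company data:

```python
# Assumed cost structure: platform costs dwarf ad-algorithm costs 10:1.
platform_cost = 90.0   # fixed costs of running the platform itself
algo_cost = 10.0       # cost of the ad-targeting algorithms
income = 100.0         # advertising income at current effectiveness

# New algorithm: 50% more computation, 10% better targeting.
new_algo_cost = algo_cost * 1.5
new_income = income * 1.10   # platform can charge 10% more

cost_rise = (platform_cost + new_algo_cost) / (platform_cost + algo_cost) - 1
income_rise = new_income / income - 1
print(f"total costs up {cost_rise:.0%}, income up {income_rise:.0%}")
```

A 5% rise in total costs buys a 10% rise in income; even doubling the algorithm’s computational cost (a 10% rise in total costs) would still break even against that 10% income gain.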

Of course, the competing platforms will also work hard to develop ever more sophisticated (and typically privacy reducing and carbon producing) algorithms, so that those gains will be rapidly eroded, leading to the next step.

In the end there are diminishing returns for effective advertising: there are only so many eye-hours and dollars in users’ pockets. The 10% increase in advertising effectiveness is not a real productivity gain, but is about gaining a temporary increase in users’ attention, given the current state of competing platforms’ advertising effectiveness.

Looking at the system as a whole, more and more energy and expenditure are spent on algorithms that are ever more invasive of personal autonomy, and in the end yield no return for anyone.

And it’s not even a beautiful extravagance.

A brief history of array indices — making programs that fit people

A colleague recently said to me “As computer scientists, our index always starts with a 0”, and my immediate thought was “not when I was a lad”!
As well as revealing my age, this is an interesting reflection on the evolution of programming languages, and in particular the way that programming languages have in some ways regressed in terms of human-centredness, expecting the human to think like a machine rather than the machine doing the work.
But let’s start with array indices.  If you have programmed arrays in Java, Javascript, C++, PHP, or (lists in) Python, they all have array indices starting at 0: a[0], a[1], etc.  Potentially a little confusing for the new programmer, an array of size 5 therefore has last index 4 (five indices: 0,1,2,3,4).  Code is therefore full of ‘length-1’:
double values[] = codeReturningArray();
double first = values[0];
double last = values[values.length-1];
This feels so natural that we hardly notice we are doing it.  However, it wasn’t always like this …
The big three early programming languages were Fortran (for science), Algol (for mathematics and algorithms) and COBOL (for business). In all of these arrays/tables start at 1 by default (reflecting mathematical conventions for matrices and vectors), but both Fortran and Algol could take arbitrary ranges – the compiler did the work of converting these into memory addresses.
Another popular early programming language was BASIC, created as a language for learners in 1964, and the arrays in the original BASIC also started at 1.  However, anyone learning Basic today is likely to be using Microsoft Visual Basic, used both for small business applications and for scripting office documents such as Excel.  Unlike the original BASIC, arrays in Visual Basic are zero-based, ending one less than the array size (like C).  Looking further into the history of this, arrays in the first Microsoft Basic in 1980 (a long time before Windows) allowed 0 as a start index, but Dim A(10) meant there were 11 items in the array, indexed 0–10.  This meant you could ignore the zero index if you wanted and use A(1..10) as in earlier BASIC, Fortran, etc., while the compiler had to do less work.
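
The off-by-one games are easy to see in a modern zero-based language. A small Python sketch contrasting a present-day array of size 10 with an emulation of the 1980 Dim A(10) behaviour:

```python
# Modern zero-based array of size 10: valid indices are 0..9
modern = [0.0] * 10
first, last = modern[0], modern[len(modern) - 1]   # the ubiquitous 'length-1'

# 1980 Microsoft Basic's DIM A(10) declared indices 0..10 -- eleven slots --
# so programs could ignore slot 0 and use A(1)..A(10) as in older BASICs.
basic_style = [0.0] * 11          # emulate DIM A(10)
one_based_part = basic_style[1:]  # the portion 1-based code would actually use

assert len(modern) == 10 and len(basic_style) == 11
assert len(one_based_part) == 10
```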

Excerpt from 1964 BASIC manual (download)
In both Pascal and Ada, arrays are more strongly typed, in that the programmer explicitly specifies the index range, not simply a size.  That is, it is possible to declare zero-based arrays A[0..9], one-based arrays A[1..7], or indeed anything else A[42..47].  However, illustrative examples of both Pascal arrays and Ada arrays typically have index types starting at 1, as this was consistent with earlier languages and also made more sense mathematically.
It should be noted that most of the popular early languages also allowed matrices or multi-dimensional arrays:
Fortran: DIMENSION A(10,5)
Algol:   mode matrix = [1:3,1:3]real; 
Basic:   DIM B(15, 20)
Pascal:  array[1..15,1..10] of integer;
So, given the rich variety of single and multi-dimensional arrays, how is it that arrays now all start at zero?  Is this the result of deep algebraic or theoretical reflection by the computer science community?  In fact the answer is far more prosaic.
Most modern languages are directly or indirectly influenced by C or one of its offshoots (C++, Java, etc.), and these C-family languages all have zero indexed arrays because C does.
I think this comes originally from BCPL (which I used to code my A-level project at school), which led to B and then C.  Arrays in BCPL were pointer based (as in C), making no distinction between array and pointer.  BCPL treated an ‘array’ declaration as being memory allocation and array access (array!index) as pointer arithmetic.  Hence the zero-based array index sort of emerged.
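
The reason zero ‘emerges’ is that under pointer arithmetic an element’s address is the base address plus index times element size, so the first element is the one at offset zero. A Python sketch of the address calculation (the base address and element size are invented for illustration, not taken from BCPL):

```python
# Hypothetical values for illustration only.
base = 0x1000        # address returned by the 'array' allocation
elem_size = 8        # size of one element in bytes

def address(index):
    # BCPL's array!index and C's array[index] both reduce to this sum,
    # so the first element is naturally the one at index 0 (offset zero).
    return base + index * elem_size

assert address(0) == base                  # index 0 is the allocation itself
assert address(3) == base + 3 * elem_size  # three elements past the base
```
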
This was all because the target applications of BCPL were low-level system code.  Indeed, BCPL was intended to be a ‘bootstrap’ language (I think the first language where the compiler was written in itself) enabling a new compiler to be rapidly deployed on a new architecture. BCPL (and later C) was never intended for high-level applications such as scientific or commercial calculations, hence the lack of non-zero based arrays and proper multi-dimensional arrays.
This is evident in other areas beyond arrays. I once gave a C-language course at one of the big financial institutions. I used mortgage calculation as an example.  However, the participants quickly pointed out that it was not a very impressive example, as native integers were just too small for penny-accurate calculations of larger mortgages.  Even now with a 64 bit architecture, you still need to use flexible-precision libraries for major financial calculations, which came ‘for free’ in COBOL where numbers were declared at whatever precision you wanted.
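
In a modern language the nearest equivalent of COBOL’s declared-precision numbers is a decimal or arbitrary-precision type. A minimal sketch using Python’s decimal module (the mortgage figures are invented):

```python
from decimal import Decimal, ROUND_HALF_UP

# Illustrative figures, not a real mortgage quote: the point is to stay in
# Decimal throughout, rather than in native ints or binary floating point.
principal = Decimal("250000.00")     # pounds
annual_rate = Decimal("0.0599")      # 5.99% annual interest (invented)

# One month's interest, rounded to the penny.
monthly_interest = (principal * annual_rate / 12).quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP)

print(monthly_interest)
```
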
Looking back with a HCI hat on, it is a little sad to see the way that programming languages have regressed from being oriented towards human understanding with the machine doing the work to transform that into machine instructions, towards languages far more oriented towards the machine with the human doing the translation 🙁   
Maybe it is time to change the tide.

Visiting the lost land

Last Sunday I finally got to see Troedrhiwfuwch, or perhaps strictly the site of Troedrhiwfuwch.

Troedrhiwfuwch, known as Troedy locally, was once a thriving mining village in the Rhymney valley.  Then in the 1980s the whole village was condemned and abandoned due to fear of landslips and now only two houses and the war memorial still stand to mark this lost village.

However, whilst lost on the ground, the community is still active, though most of its members only recall the village from early childhood, or were born later and know it only from the stories of parents, grandparents, and others from the community.  There is a vibrant Facebook memories group where photos and stories are shared, and several members have been collecting photos and newspaper accounts and trawling through censuses and war records.

I’ve been working with the community for a few months partly funded by a Cherish-DE Knowledge Exchange.  We’ve been looking at how digital technology could help both preserve the legacy and also share it with visitors.

A central part of this has centred around the war memorial, partly because it serves as a tangible marker of the village where many gather each year on Armistice Day, and partly because the village sent so many young men in proportion to its size, more than one per household, many of whom never returned.

Until now all of our meetings have been remote.  We shared photographs, prototypes, stories and jokes, but all through the little windows of Zoom.  However, last Sunday I travelled through the quiet towns of mid Wales, up through the Brecon Beacons, the kerbs lined with the parked cars of families enjoying the Bank Holiday sunshine, then back down into the Rhymney Valley and the tranquility of Troedrhiwfuwch.

Liz, Carys and Vince, whom I’d been working with, were there together with a few others from the community; the local councillor Eluned Stenner was also there, as was Lisa, the Armed Forces Regional Officer.  They had already set up a tea, cake and biscuit station, complete with generator for the kettle – a combination of Valleys hospitality and Vince’s army background as an engineer meant we were well sorted.

I said ‘the tranquility of Troedrhiwfuwch’, but beside the War Memorial itself it is anything but tranquil, as the main valley road runs only feet away.  One of the worries the community has had for many years is that the crowd gathering for the Armistice Day act of remembrance did so at very real risk to their own lives.  As soon as she saw this, Eluned promised to ensure the road was closed for the next Armistice Day.

However, the tea table, and a second table covered in copies of many of the historic documents collected over the years, were in the Memorial Gardens, just a stone’s throw from the road and yet a haven of peace.

The Memorial Gardens are on the site of St Teilo’s Church, which was also torn down with the rest of the village in the 1980s, although the contents of the inside of the church have been preserved in a side chapel in St Tyfaelog’s at Pontlottyn.

The aim is to plant a shrub with a memorial plaque for each of the war dead, several of whom have no other grave, to act both as a place for their families to visit and as a memorial more broadly for the village.

On the table of resources you can see a few mock-ups of plaques with QR codes that link to information about each person.  I’ve helped the community create these, and we plan to make them contextual, so that, for example, a school group visiting can have information tailored to their age and curriculum.

The discussions with Eluned and Lisa suggested various funds that the community could apply for to enable work on the gardens and refurbishing the war memorial.  The community is also named in a substantial research proposal that Swansea University submitted in March that also included St Fagans and partners in Cork – so croesi bysedd for that!

Irrespective of particular grants (although they will help!), we will continue to work together.  As an outsider I find the lost village a fascinating story, and I am constantly amazed at the knowledge, enthusiasm, and dedication of the community team working on the Troedrhiwfuwch archives.  With my technologist/researcher hat on, I’m also thinking about the potential digital tools and methods that could enable other communities to more easily preserve and share their own memories and stories.

In terms of digital technology, the next steps will include more ways to help link the digital archives to the physical location, including geocoding pictures, rather like the Nesta-funded Frasan app I was involved with on Tiree some years back.  As well as the links to the world wars, the village is connected to human stories of industrial change, migration and sport, not to mention the geological features underlying the ‘moving mountain’, which eventually caused its demise.

In addition, there is less visible, but perhaps in the long term more critical, ‘back stage’ work in helping to connect and annotate the various photos and documents in the archive — linking stories to the objects.  Although the domains are rather different, I expect this aspect to intersect with work on democratising digitisation in the AHRC InterMusE project and also connect to other disciplines across Swansea University.

For more about Troedrhiwfuwch:

Online 1882 Gazetteer of Scotland


In the late 2000s, not long after moving to Tiree, I came across John Wilson’s 1882 Gazetteer of Scottish place names at the Internet Archive and thought it would be lovely if it were properly usable as an online resource.

For various reasons I never finished at the time, but over Easter I returned to the project and now have a full online version available, browsable page by page or entry by entry.  There is more work to be done to make it really usable, but it is a beginning.

I’m using this and other digitisation projects as ways to understand the kinds of workflows and tools to help others create their own digital resources based on archive materials.  In the InterMusE project, recently funded by AHRC, we are working with local musical societies to help them digitise their historic concert programs and other documents.

 

The day the Archbishop of Canterbury made me swear

My strongest language usually rises to ‘crumbs’, or once, when I slipped on stone steps and landed on my back, “oh my goodness”.

Only twice in recent times can I recall swearing out loud (or in print), once was early in the year as I was driving into work and heard that the government were putting off taking action on Covid and thus deliberately sacrificing the lives of so many people; the other was yesterday. It has only been rarely, and I know that in the latter case it is through ignorance not ill-will, that I have faced deep evil so directly.

I was in tears and despair at the utter irresponsibility of the letter by the UK faith leaders to the Prime Minister.  On the day that the US is hovering on the edge of giving Trump a second term, it is hard to find anything shocking anymore, but this shook me to the core and made me ashamed to be a Christian.

The letter appears so reasonable, echoing the words of pubs, gyms, and so many others that are struggling with the implications of further lockdown.  Everyone has a good reason why the rules shouldn’t apply to them.  However, faith leaders should be giving moral leadership: a lived example against craven self-interest.

It is worth reading the letter in full.

Covid secure

It starts with a common misconception: “Public Worship is covid-19 secure”.  This notion of being ‘covid safe’ is similar to claims made in universities and every area of commercial and public life.  Of course none of the measures we take are utterly ‘safe’, but each reduces risk to levels that are acceptable within particular levels of disease prevalence.  Also, like many other aspects of life, it will not be the actual structured worship itself where there are risks, but the coming and going, the greeting and the chats that linger a little too long and too close.  The hymn singing in my tweet is of course hyperbole; singing was identified as a critical feature in worship super-spreader events, and so where there is singing in worship it is now confined to a small well-spaced choir with at most hum-along.

However, the same is true of many other areas of life.  In the SAGE summary of possible non-pharmaceutical interventions, only four measures have substantial effect:

measure                              reduction in R value
Stay at home order (“lockdown”)      75% reduction
Work from home wherever possible     0.2–0.4
Mass school closure                  0.2–0.5
Closure of Higher Education          0.2–0.5
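
The reason an apparently small change in R matters is that it compounds generation by generation: after n generations, case numbers scale by Rⁿ. A rough sketch under illustrative assumptions (the generation count and starting cases are made up, not SAGE figures):

```python
# Illustrative only: treat a term as roughly 10 week-long infection generations.
generations = 10
start_cases = 1000   # arbitrary starting point

def cases_after(r, n=generations):
    # each generation multiplies case numbers by R
    return start_cases * r ** n

baseline = cases_after(1.0)
raised = cases_after(1.3)    # R increased by 0.3, mid-range of the table above
print(f"{raised / baseline:.1f}x more cases over the term")
```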

Everything else is of low and uncertain impact; it is only the accumulation of many factors together, each of which has a small, maybe minimal, effect, which together can make a difference.  The lockdowns, both the shorter firebreak in Wales and the longer lockdown in England, will have hard and in some cases terrible impacts on many people.  Each setting – pubs and restaurants, tourism, retail, gyms, non-school sport – has made every effort to be as ‘covid safe’ as possible.

What are the faith leaders suggesting here: that religious worship should be free of restrictions while everyone else suffers together?

This is a multi-faith letter, and I cannot speak for other faiths, but at the heart of the Christian tradition is the God who does not merely sympathise or alleviate suffering, but enters into that suffering; the God who sits alongside us, who feels the pain with us.  This is the essence of the baby born at Christmas and the Christ who not only died for us, but lived alongside us.

The way of Jesus is not to seek ways to bypass or be given a free pass.  For the Church, closing alongside the shop, the pub and children’s sport is not a problem or inconvenience but a sacred duty.

Sustain our service

The letter describes how “Faith communities have been central to the pandemic response”, including foodbanks and volunteering.  This should be the most uplifting section, but it is the most disturbing.

I think it is trying to say that this level of service cannot be maintained by people without the support offered by shared physical worship. In some ways this is similar to national support of the NHS, recognising that those who are doing much to help others need support themselves.

But, if you dig deeper, what is the implication here?  Maybe some weekly pot-banging for the churches, mosques and synagogues?  I’m sure it would help the mental health of nurses who are in PPE all day long to be able to party together without masks.  Many others who volunteer might find their solace in a pint or a cup of coffee with friends.  Are all of these acceptable, or only the act of worship?

What makes the faith community so different?  It would be nice to say the faith itself and God who sustains, but this letter suggests that instead they need more external support than everyone else.

Most worrying is “Without the worshipping community, our social action and support cannot be energised and sustained indefinitely“.  There is a fine line between a warning or prediction and a threat.  I assume that this was never intended to be the latter, but placed close to the beginning of the letter and directly below the heading “Public Worship is Essential to sustain our service” and the description of that service (maybe time to re-read Luke 12), the overtones are concerning.

Social cohesion and connectedness and “the Mental Health of our nation”

The next two sections of the letter focus on mental health and well being (and yes, the capitalisation is that of the letter).

Across the country elderly people in care homes can’t see their families.  Those with sick relatives have to constantly weigh the risks of visiting against the possibility they may never see them again.  In so many jobs, not just the obvious front-line ones, people put themselves in danger.  Beyond this there are those avoiding going to hospital (in Wales we’ve had one of the larger recent hospital outbreaks, 99 deaths in Cwm Taf Morgannwg) and so many facing financial or personal ruin due to lockdowns.

This is all not just because of Covid itself, but because we cannot collectively gather the self-discipline and basic compassion that prioritises the most needy.

The lockdowns and other restrictions are there to help prevent or reduce these impacts on life and well-being. We live within these measures for the sake of others.

The letter from the faith ‘leaders’ (and it sticks in my throat to use the word), undermines this message of self-sacrifice and in so doing is not only shameful, but compromises the moral integrity of the nation.

It is marked that on the same day that this letter was published I retweeted a statement from Mark Drakeford, the First Minister of Wales (who maybe should be known as St Mark):

His straightforward approach and appeal to the best nature of the nation is in such sharp contrast to those who carry the name of “faith leaders”.  Again and again, when asked about some minor edge of legislation or some possible loophole, he takes the questioner back to the basic principle that we do not seek to do the most we can within the rules, but to do the most we can for others.  I am reminded of the Gospels and Jesus’ responses to the legal penny-pinching of the Pharisees, the “faith leaders” of their time.

Signs of hope

This is so important and yet the paragraph goes on to say “faith communities who consistently embody behaviours and attitudes that are covid-19 safe and hopeful provide encouragement to others through modelling these behaviours and attitudes“, while advocating that places of worship should be given carte-blanche to continue while other areas that have also endeavoured to be “covid-19 safe” have to close down.  This is indeed creating a model for others of “attitudes and behaviours” that are self-interested and tone deaf to the sufferings of society around.

The section goes on to say “Public worship is therefore an essential sign that we can find new ways of living with Covid-19 until the vaccine is found, and part of the psychological and social cohesion needed to exit restriction measures“, and indeed this is the case.

Stories I’ve heard from those running online services through lockdown (and not ones with young congregations!) are that they are seeing people on Zoom who had never been in church, and that when they were able to open again there were few in the building who had not been attending online.  So many people find going into a religious building hard.  For some this is about having too many people around, or not being able to keep quiet long enough.  For others it is the fear of doing something wrong; hardest in more unstructured settings such as a free church or Quaker meeting, where the rules are unwritten, compared with those where you just follow the service book.  Being able to join from the security of one’s home has opened the doors to more people than any evangelist!

If we ignore the vapid pleadings of exceptionalism, the headings of this letter could read as a set of challenges for the faith communities in the Time of Covid.  Can we cut behind the curtain of the sanctuary and ask what physical public worship is really about?  This letter forms a strong basis for that.

If instead of saying, “we should be an exception because of this”, why not ask, “how can we do this even with the restrictions upon us?”, or even “how can we use this time to find a fresh understanding of the things that are really core to our faith and our being?”.


Image attribution: Yellow vector created by starline – www.freepik.com

Fact checking Full Fact

It is hard to create accurate stories about numerical data.

Note: Even as I wrote this blog events have overtaken us.  The blog is principally about analysing how fact checking can go wrong; this will continue to be an issue, so it remains relevant.  But it is also about the specific issues with FullFact.org’s discussion of the community deaths that emerged from my own modelling of university returns.  Since Full Fact’s report a new Bristol model has been published which confirms the broad patterns of my work, and university cases are already growing across the UK (e.g. Liverpool, Edinburgh) with lockdowns in an increasing number of student halls (e.g. Dundee).
It is of course nice to be able to say “I was right all along“, but in this case I wish I had been wrong.

A problem I’ve been aware of for some time is how much difficulty many media organisations have in formulating evidence and arguments, especially those involving numerical data.  Sometimes this is due to deliberately ‘spinning’ an issue, that is, the aim is distortion.  However, at other times, in particular on fact checking sites, it is clear that the intention is to offer the best information, but something goes wrong.

This is an important challenge for my own academic community; we clearly need to create better tools to help the media and the general public understand numerical arguments.  This is particularly important for Covid, and I’ve talked and written elsewhere about this challenge.

Normally I’ve written about this at a distance, looking at news items that concern other people, but over the last month I’ve found myself on the wrong side of media misinterpretation, or maybe misinformation.  The thing that is both most fascinating (with an academic hat on) and also most concerning is the failure in the fact-checking media’s ability to create reasoned argument.

This would merely be an interesting academic case study, were it not that the actions of the media put lives at risk.

I’ve tried to write succinctly, but what follows is still quite long.  To summarise I’m a great fan of fact checking sites such as Full Fact, but I wish that fact checking sites would:

  • clearly state what they are intending to check: a fact, data, statement, the implicit implications of the statement, or a particular interpretation of a statement.
  • where possible present concrete evidence or explicit arguments, rather than implicit statements or innuendo; or, if it is appropriate to express belief in one source rather than another do this explicitly with reasons.

However, I also realise that I need better ways to communicate my own work, both its numerical aspects and textually.  I realise that often behind every sentence, rather like an iceberg, there is substantial additional evidence or discussion.

Context

I’d been contacted by Fullfact.org at the end of August in relation to the ‘50,000 deaths due to universities’ estimate that was analysed by WonkHE and then tweeted by UCU.  This was just before the work was briefly discussed on Radio 4’s More or Less … without any prior consultation or right of reply.  So full marks to Full Fact for actually contacting the primary source!

I gave the Full Fact journalist quite extensive answers including additional data.  However, he said that assessing the assumptions was “above his pay grade” and so, when I heard no more, I’d assumed that they had decided to abandon writing about it.

Last week on a whim, just before going on holiday, I thought to check and discovered that Fullfact.org had indeed published the story on 4th September; indeed it still has pride of place on their home page!

Sadly, they had neglected to tell me when it was published.

Front page summary – the claim

First of all let’s look at the pull out quote on the home page (as of 22nd Sept).

At the top the banner says “What was claimed”, appearing to quote from a UCU tweet and says (in quote marks):

The return to universities could cause 50,000 deaths from Covid-19 without “strong controls”

This is a slight (but critical) paraphrase of the actual UCU tweet, which quoted my own paper:

“Without strong controls, the return to universities would cause a minimum of 50,000 deaths.”

The addition of “from Covid-19” is filling in context.  Pedantically (but important for a fact checking site), by normal convention this would be set off in some way to make clear it is an insertion into the original text, for example [from Covid-19].  More critically, the paraphrase inverts the sentence, making the conditional less easy to read, replaces “would cause a minimum” with “could cause”, and sets “strong controls” in scare quotes.

While the inversion does not change the logic, it does change the emphasis.  In my own paper and UCU’s tweet the focus on the need for strong controls comes first, followed by the implications if this is not done; whereas in the rewritten quote the conditional “without strong controls” appears more like an afterthought.

On the full page this paraphrase is still set as the claim, but the text also includes the original quote.  I have no idea why they chose to rephrase what was a simple statement to start with.

Front page summary – the verdict

It appears that the large text labelled ‘OUR VERDICT’ is intended to be a partial refutation of the original quote:

The article’s author told us the predicted death toll “will not actually happen in its entirety” because it would trigger a local or national lockdown once it became clear what was happening.

This is indeed what I said!  But I am still struggling to understand by what stretch of the imagination a national lockdown could be considered anything but “strong controls”.  However, while this is not a rational argument, it is a rhetorical one: emotionally, what appears to be a negative statement, “will not actually happen”, feels as though it weakens the original statement, even though it is perfectly consonant with it.

One of the things psychologists have known for a long time is that as humans we find it hard to reason with conditional rules (if–then) if they are either abstract or disagree with one’s intuition.  This lies at the heart of many classic psychological experiments such as the Wason card test.   Fifty thousand deaths solely due to universities is hard to believe, just like the original Covid projections were back in January and February, and so we find it hard to reason clearly.

In a more day-to-day example this is clear.

Imagine a parent says to their child, “if you’re not careful you’ll break half the plates”.

The child replies, “but I am being careful”.

While this is in a way a fair response to the implied rider “… and you’re not being careful enough”, it is not an argument against the parent’s original statement.

When you turn to the actual Full Fact article this difficulty of reasoning becomes even more clear.  There are various arguments posed, but none that actually challenge the basic facts, more statements that are of an emotional rhetorical nature … just like the child’s response.

In fact if Full Fact’s conclusion had been “yes, this is true, but we believe the current controls are strong enough, so it is irrelevant”, then one might disagree with their opinion, but it would be a coherent argument.  However, this is NOT what the site claims, certainly not in its headline statements.

A lack of alternative facts

To be fair to Full Fact, the most obvious way to check this estimated figure would have been to look at other models of university return and compare it with them.  It is clear such models exist, as SAGE describes discussions involving such models, but neither SAGE’s nor indie-SAGE’s reports on university return include any estimated figure for overall impact.  My guess is that all such models end up with similar levels to those reported here and that the modellers feel they are simply too large to be believable … as indeed I did when I first saw the outcomes of my own modelling.

Between my own first modelling in June and writing the preprint article there was a draft report from a three day virtual study group of mathematicians looking at University return, but other than this I was not aware of work in the public domain at the time. For this very reason, my paper ends with a call “for more detailed modelling“.

Happily, in the last two weeks two pre-print papers have come from the modelling group at Bristol, one a rapid review of university Covid models and one on their own model.  Jim Dickinson has produced another of his clear summaries of them both.  The Bristol model is far more complex than those I used, including multiple types of teaching situation and many different kinds of students based on demographic and real social contact data.  It doesn’t include student–non-student infections, which I found critical in spread between households, but does include stronger effects for in-class contagion.  While these are very different types of modelling, the large-scale results of both suggest rapid spread within the student body.  The Bristol paper ends with a warning about potential spread to the local community, but does not attempt to quantify this, due to the paucity of data on student–non-student interactions.

Crucially, the lack of systematic asymptomatic testing will also make it hard to assess the level of Covid spread within the student population during the coming autumn and also hard to retrospectively assess the extent to which this was a critical factor in the winter Covid spread in the wider population.  We may come to this point in January and still not have real data.

Full page headlines

Following through to the full page on Full Fact, the paraphrased ‘claim’ is repeated with Full Fact’s ‘conclusion’ … which is completely different from the front page ‘OUR VERDICT’.

The ‘conclusion’ is carefully stated – rather like Boris Johnson’s careful use of the term ‘controlled by’ when describing the £350 million figure on the Brexit bus.  It does not say here whether Full Fact believes the (paraphrased) claim; they merely make a statement relating to it.  In fact at the end of the article there is a rather more direct conclusion, berating UCU for tweeting the figure.  That is, Full Fact do have a strong conclusion, and one that is far more directly related to the reason for fact checking this in the first place, but instead of stating it explicitly, the top-of-page headline ‘conclusion’ in some sense sits on the fence.

However, even this ‘sit on the fence’ statement is at very least grossly misleading and in reality manifestly false.

The first sentence:

This comes from a research paper that has not been peer-reviewed

is correct, and one of the first things I pointed out when Full Fact contacted me.  Although the basic mathematics was read by a colleague, the paper itself has not been through formal peer review, and given the pace of change it will need to be reworked as a retrospective study before it is.  This said, in my youth I was a medal winner in the International Mathematical Olympiad and I completed my Cambridge mathematics degree in two years, so I do feel somewhat confident in the mathematics itself!  However, one of the reasons for putting the paper on the preprint site arXiv was to make it available for critique and further examination.

The second statement is not correct.  The ‘conclusion’ states that

It is based on several assumptions, including that every student gets infected, and nothing is done to stop it.

If you read the word “it” to refer to the specific calculation of 50,000 deaths then this is perhaps debatable.  However, the most natural reading is that “it” refers to the paper itself, and this interpretation is reinforced later in the Full Fact text, which says “the article [as in my paper] assumes …”.  This statement is manifestly false.

The paper as a whole models student bubbles of different sizes and assumes precisely the opposite: rapid spread only within bubbles.  That is, it explicitly assumes that something (bubbles) is done to stop it.  The outcome of the models, taking a wide range of scenarios, is that in most circumstances indirect infections (to the general population and back) led to all susceptible students being infected.  One can debate the utility or accuracy of the models, but crucially “every student gets infected” is a conclusion, not an assumption, of the models or the paper as a whole.

To be fair to Full Fact, this confusion between the fundamental assumptions of the paper and the specific values used for this one calculation echoes Kit Yates’ initial statements when he appeared on More or Less.  I’m still not sure whether that was a fundamental misunderstanding or a slip of the tongue during the interview, and my attempts to obtain clarification have failed.  However, I did explicitly point this distinction out to Full Fact.

The argument

The Full Fact text consists of two main parts.  One is labelled “Where did “50,000 deaths” come from?”, which is ostensibly a summary of my paper, but in reality seems to be where the clearest fact-check style statements are made.  The second is labelled “But will this happen?”, which sounds as if it is the critique.  However, it actually consists of three short paragraphs: the first two effectively set me and Kit Yates head-to-head, and the third is the real conclusion, which says that UCU tweeted the quote without context.

Oddly I was never asked whether I believed that the UCU’s use of the statement was consistent with the way in which it was derived in my work.  This does seem a critical question given that Full Fact’s final conclusion is that UCU quoted it out of context.  Indeed, while Full Fact claims that UCU tweeted “the quote without context“, within the length of a tweet the UCU both included the full quote (not paraphrased!) and directly referenced Jim Dickinson’s summary of my paper on WonkHE, which itself links to my paper.  That is, the UCU tweet backed up the statement with links that lead to primary data and sources.

As noted the actual reasoning is odd as the body of the argument, to the extent it exists, appears to be in the section that summarises the paper.

First section – summary of paper

The first section “Where did “50,000 deaths” come from?”, starts off by summarising the assumptions underlying the 50,000 figure being fact checked and is the only section that links to any additional external sources.  Given the slightly askance way it is framed, it is hard to be sure, but it appears that this description is intended to cast doubt on the calculations because of the extent of the assumptions.  This is critical as it is the assumptions which Kit Yates challenged.

In several cases the assumptions stated are not what is said in the paper.  For example, Full Fact says the paper “assumes no effect from other measures already in place, like the Test and Trace system or local lockdowns” whereas the paragraph directly above the crucial calculation explicitly says that (in order to obtain a conservative estimate) the initial calculation will optimistically assume “social distancing plus track and trace can keep the general population R below 1 during this period“.  The 50,000 figure does not include additional more extensive track and trace within the student community, but so far there is no sign of this happening beyond one or two universities adopting their own testing, and this is precisely one of the ‘strong controls’ that the paper explicitly suggests.

Ignoring these clear errors, the summary of assumptions made by the calculation of the 50,000 figure says that I “include the types of hygiene and social distancing measures already being planned, but not stronger controls” and then goes on to list the things not included.  It seems obvious, indeed axiomatic, that a calculation of what will happen “without strong controls” must assume, for the purposes of the calculation, that there are no strong controls.

The summary section also spends time on the general population R value of 0.7 used in the calculation and the implications of this.  The paragraph starts “In addition to this” and quotes that this is my “most optimistic” figure.  This is perfectly accurate … but the wording seems to imply this is perhaps (another!) unreasonable assumption … and indeed it is crazily low.  At the time (soon after lockdown) it was still hoped that non-draconian measures (such as track and trace) could keep R below 1, but of course we have seen rises far beyond this and the best estimates for the coming winter are now more like 1.2 to 1.5.

Note however the statement was “Without strong controls, the return to universities would cause a minimum of 50,000 deaths.”  That is the calculation was deliberately taking some mid-range estimates of things and some best case ones in order to yield a lower bound figure.  If one takes a more reasonable R the final figure would be a lot larger than 50,000.

Let’s think again of the child, but let’s make the child a stroppy teenager:

Parent, “if you’re not careful you’ll break half the plates”

Child replies, throwing the pile of plates to the floor, “no I’ll break them all.”

The teenager might be making a point, but is not invalidating the parent’s statement.

Maybe I am misinterpreting the intent behind this section, but given the lack of any explicit fact-check evidence elsewhere, it seems reasonable to treat this as at least part of the argument for the final verdict.

Final section – critique of claim

As noted, the second section “But will this happen?”, which one would assume is the actual critique and mustering of evidence, consists of three paragraphs: one quoting me, one quoting Kit Yates of Bath, and one which appears to be the real verdict.

The first paragraph is the original statement that appeared as ‘OUR VERDICT’ on the first page, where I say that 50,000 deaths will almost certainly not occur in full because the government will be forced to take some sort of action once general Covid growth and death rates rise.  As noted, if this is not ‘strong controls‘, what is?

The second paragraph reports Kit Yates as saying there are some mistakes in my model, and he is quoted as generously saying that he’s “not completely damning the work”.  While grateful for his restraint, some minimal detail or evidence would be useful to assess his assertion.  On More or Less he questioned some of the values used and I’ve addressed that previously; it is not clear whether this is what is meant by ‘mistakes’ here.  I don’t know if he gave any more information to Full Fact, but if he did I have not seen it and Full Fact have not reported it.

A tale of three verdicts

As noted the ‘verdict’ on the Full Fact home page is different from the ‘conclusion’ at the top of the main fact-check page, and in reality it appears the very final paragraph of the article is the real ‘verdict’.

Given this confusion about what is actually being checked, it is no wonder the argument itself is somewhat confused.

The final paragraph, the Full Fact verdict itself, has three elements:

  • that UCU did not tweet the quote in context – as noted, perhaps a little unfair for a tweeted quote that links to its source
  • that the 50,000 “figure comes from a model that is open to question” – well clearly there is question in Kit Yates’ quote, but this would have more force if it were backed by evidence
  • that it is based on “predictions that will almost certainly not play out in the real world”

The last of these is the main thrust of the ‘verdict’ quote on the Full Fact home page.  Indeed there is always a counterfactual element to any actionable prediction.  Clearly if the action is taken the prediction will change.  This is on the one hand deep philosophy, but also common sense.

The Imperial Covid model that prompted (albeit late) action by government in March gave a projection of between a quarter and a half million deaths within the year if the government continued a policy of herd immunity.  Clearly any reasonable government that believes this prediction will abandon herd immunity as a policy and indeed this appears to have prompted a radical change of heart.  Given this, one could have argued that the Imperial predictions “will almost certainly not play out in the real world“.  This is both entirely true and entirely specious.

The calculations in my paper and the quote tweeted by UCU say:

“Without strong controls, the return to universities would cause a minimum of 50,000 deaths.”

That is a conditional statement.

Going back to the child: the reason the parent says “if you’re not careful you’ll break half the plates” is not as a prediction that half the plates will break, but as an encouragement to the child to be careful so that the plates will not break.  If the child is careful and the plates are not broken, that does not invalidate the parent’s warning.

Last words

Finally I want to reiterate how much I appreciate the role of fact checking sites, including Full Fact and the fact checking parts of other news sites such as BBC’s Reality Check; and I am sure the journalist here wanted to produce a factual article.  However, in order to be effective they need to be reliable.  We are all, and journalists especially, aware that an argument needs to be persuasive (rhetoric), but for fact checking, and indeed academia, arguments also need to be accurate and analytic (reason).

There are specific issues here and I am angered at some of the misleading aspects of this story because of the importance of the issues; there are literally lives at stake.

However, putting this aside, the story raises the challenge for me as to how we can design tools and methods to help both those working on fact checking sites and the academic community, to create and communicate clear and correct argument.

More or Less: will 50,000 people really die if the universities reopen?

Last Wednesday morning I had mail from a colleague to say that my paper on student bubble modelling had just been mentioned on Radio 4’s ‘More or Less’ [BBC1].  This was because UCU (the University and Colleges Union) had tweeted the headline figure of 50,000 deaths from my paper “Impact of a small number of large bubbles on Covid-19 transmission within universities” [Dx1] after it had been reviewed by Jim Dickinson on Wonkhe [DW].  The issue is continuing to run: on Friday a SAGE report [SAGE] was published, also highlighting the need for vigilance around University reopening, and this morning the Today programme interviewed Dame Anne Johnson [BBC2], who warned of “a ‘critical moment’ in the coronavirus pandemic, as students prepare to return to universities”.

I’m very happy that these issues are being discussed widely; that is the most important thing.   Unfortunately I was never contacted by the programme before transmission, so I am writing this to fill in details and correct misunderstandings.

I should first note that the 50,000 figure was a conditional one:

without strong controls, the return to universities would cause a minimum of 50,000 deaths

The SAGE report [SAGE] avoids putting any sort of estimate on the impact.  I can understand why!  Like climate change, one of the clear lessons of the Covid crisis is how difficult it is to frame arguments involving uncertainty and ranges of outcomes in ways that allow meaningful discussion but also avoid ‘Swiss cheese’ counter-arguments that seek the one set of options that all together might give rise to a wildly unlikely outcome.  Elsewhere I’ve written about some of the psychological reasons and human biases that make it hard to think clearly about such issues [Dx2].

The figure of 50,000 deaths at first appears sensationalist, but in fact the reason I used this as a headline figure was precisely because it was on the lower end of many scenarios where attempts to control spread between students fail.  This was explicitly a ‘best case worst case’ estimate: that is worst case for containment within campus and best case for everything else – emphasising the need for action to ensure that the former does not happen.

Do I really believe this figure?  Well in reality, of course, if there are major campus outbreaks local lockdowns or campus quarantine would come into place before the full level of community infection took hold.  If this reaction is fast enough this would limit wider community impact, although we would never know how much as many of the knock-on infections would be untraceable to the original cause. It is conditional – we can do things ahead of time to prevent it, or later to ameliorate the worst impacts.

However, it is a robust figure in terms of order of magnitude.  In a different blog I used minimal figures for small university outbreaks (5% of students) combined with a lower end winter population R, and this still gives rise to tens of thousands of knock-on community infections for every university [Dx3].

More or less?

Returning to “More or Less”, Dr Kit Yates, who was interviewed for the programme, quite rightly examined the assumptions behind the figure, exactly what I would do myself.  However, I imagine he had to do so quite quickly, and so in the interview there was confusion between (i) the particular scenario that gives rise to the 50,000 figure and the general assumptions of the paper as a whole and (ii) the sensitivity of the figure to the particular values of various parameters in the scenario.

The last of these, the sensitivity, is most critical: some parameters make little difference to the eventual result and others make a huge difference.  Dr Yates suggested that some of the values (each of which has low sensitivity) could be on the high side, but also that one (the most sensitive) is low.  If you adjust for all of these factors the community deaths figure ends up near 100,000 (see below).  As I noted, the 50,000 figure was towards the lower end of potential scenarios.

The modelling in my paper deliberately uses a wide range of values for various parameters reflecting uncertainty and the need to avoid reliance on particular assumptions about these.  It also uses three different modelling approaches, one mathematical and two computational in order to increase reliability.  That is, the aim is to minimise the sensitivity to particular assumptions by basing results on overall patterns in a variety of potential scenarios and modelling techniques.

The detailed models need some mathematical knowledge, but the calculations behind the 50,000 figure are straightforward:

Total mortality = number of students infected
                  x  knock-on growth factor due to general population R
                  x  general population mortality

So if you wish it is easy to plug in different estimates for each of these values and see for yourself how this impacts the final figure.  To calculate the ‘knock-on growth factor due to general population R’, see “More than R – how we underestimate the impact of Covid-19 infection” [Dx4], which explains the formula (R/(1-R)) and how it comes about.
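To make it easy to play with, here is the calculation as a few lines of Python.  The numbers are purely illustrative stand-ins, not the paper’s exact inputs: roughly two million students infected, the ‘best case’ general population R of 0.7, and 1% general population mortality.

```python
def knock_on_factor(r):
    """Total knock-on infections per initial infection when R < 1.
    Geometric series: R + R^2 + R^3 + ... = R / (1 - R)  (see [Dx4])."""
    assert 0 <= r < 1, "series only converges for R < 1"
    return r / (1 - r)

def total_mortality(students_infected, general_r, mortality_rate):
    """Knock-on community deaths seeded by infected students."""
    return students_infected * knock_on_factor(general_r) * mortality_rate

# Illustrative values only: ~2 million students infected,
# general population R = 0.7, 1% general population mortality.
print(round(total_mortality(2_000_000, 0.7, 0.01)))  # of the order of 50,000
```

Substituting your own estimates for each value shows immediately how the final figure moves; the general population R dominates, as discussed below.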

The programme discussed several assumptions in the above calculation:

  1. Rate of growth within campus: R=3 and 3.5 days inter-infection period. –  These are not assumptions of the modelling paper as a whole, which only assumes rapid spread within student bubbles and no direct spread between bubbles.  However, these are the values used in the scenario that gives rise to the 50,000 figure, because they seemed the best accepted estimate at the time.  The calculations only depend on these being high enough to cause widespread outbreak across the student population.  Using more conservative figures of (student) R=2 and a 5-6 day inter-infection period, which I believe Dr Yates would be happy with, still means all susceptible students get infected before the end of a term.  The recent SAGE report [SAGE] describes models that have peak infection in November, consonant with these values. (see also addendum 2)
  2. Proportion of students infected. –  Again this is not an assumption but instead a consequence of the overall modelling in the paper.  My own initial expectation was that student outbreaks would limit at 60-70%, the herd immunity level, but it was only as the models ran that it became apparent that cross infections out to the wider population and then back ‘reseeded’ student growth because of clumpy social relationships.  However, this is only apparent at a more detailed reading, so it was not unreasonable for More or Less to think that this figure should be smaller.  Indeed in the later blog about the issue [Dx3] I use a very conservative 5% figure for student infections, but with a realistic winter population R and get a similar overall total.
  3. General population mortality rate of 1%. – In early days data for this ranged between 1% and 5% in different countries depending, it was believed, on the resilience of their health service and other factors. I chose the lowest figure.  However, recently there has been some discussion about whether the mortality figure is falling [MOH,LP,BPG].  Explanations include temporary effects (younger demographics of infections, summer conditions) and some that could be long term (better treatment, better testing, viral mutation).  This is still very speculative, with suggestions this could now be closer to 0.7% or (very, very speculative) even around 0.5%.  Note too that in my calculations this is about the general population, not the student body itself where mortality is assumed to be negligible.
  4. General population R=0.7. – This is a very low figure as if the rest of society is in full lockdown and only the universities open. It is the ‘best case’ part of the ‘best case worst case’ scenario. The Academy of Medical Science report “Coronavirus: preparing for challenges this winter” in July [AMS] suggests winter figures of R=1.2 (low) 1.5 (mid) and 1.8 (high). In the modelling, which was done before this report, I used a range of R values between 0.7 and 3; that is including the current best estimates.  The modelling suggested that the worst effects in terms of excess deaths due to universities occurred for R in the low ‘ones’ that is precisely the expected winter figures.

In summary, let’s look at how the above affects the 50,000 figure:

  • 1.  Rate of growth within campus – The calculation is not sensitive to this and hence not affected at all.
  • 2 and 3.  Proportion of students infected and general population mortality rate – These have a linear effect on the final calculation (some sensitivity).  If we take a reduction of 0.7 for each (using the very speculative rather than the very, very speculative figure for reduced mortality), this halves the estimated impact.
  • 4. General population R. This is an exponential factor and hence the final result is very sensitive to it. The original value was unreasonably low, but reasonable figures tend to lead to frighteningly high impacts.  So let’s still use a very conservative figure of 0.9 (light lockdown), which multiplies the total by just under 4 (9/2.3).

The overall result of this is around 100,000 rather than 50,000 deaths.
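The arithmetic behind these adjustments can be checked directly; the sketch below simply re-multiplies the factors listed above (the two speculative 0.7 reductions and the ratio of knock-on factors for R=0.9 versus R=0.7):

```python
def knock_on(r):
    """Knock-on infections per initial infection: R + R^2 + ... = R / (1 - R)."""
    return r / (1 - r)

base = 50_000           # original 'best case worst case' estimate
student_factor = 0.7    # item 2: fewer students infected (speculative)
mortality_factor = 0.7  # item 3: lower general mortality (speculative)
r_factor = knock_on(0.9) / knock_on(0.7)  # item 4: 9 / 2.33, just under 4

adjusted = base * student_factor * mortality_factor * r_factor
print(round(adjusted))  # ~94,500, i.e. of the order of 100,000
```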

In the end you can play with the figures, and, unless you pull all of the estimates to their lowest credible figure, you will get results that are in the same range or a lot higher.

If you are the sort of person who bets on an accumulator at the Grand National, then maybe you are happy to assume everything will be the best possible outcome.

Personally, I am not a betting man.

 

Addendum 1: Key factors in assessing modelling assumptions and sensitivity

More or Less was absolutely right to question assumptions, but this is just one of a number of issues that are all critical to consider when assessing mathematical or computational modelling:

  • assumptions – values, processes, etc, implicitly or explicitly taken as given
  • sensitivity – how reliant a particular result is on the values used to create it
  • scenarios – particular sets of values that give rise to a result
  • purpose – what you are trying to achieve

I’ve mentioned the first three of these in the discussion above. However, understanding the purpose of a model is also critical particularly when so many factors are uncertain.  Sometimes a prediction has to be very accurate, for example the time when a Mars exploration rocket ‘missed’ because of a very small error in calculations.

For the work described here my own purpose was: (i) to assess how effective student bubbles need to be, a comparative judgement, and (ii) to assess whether it matters or not, that is an order of magnitude judgement.  The 50K figure was part of (ii).  If this figure had been in the 10s or even 100s it could be seen as fairly minor compared with the overall Covid picture, but 10,000, 50,000 or 100,000 are all bad enough to be worth worrying about.  For this purpose fine details are not important, but being broadly robust is.

 

Addendum 2:  Early Covid growth in the UK

The scenario used to calculate the 50K figure used the precise values of R=3 and a 3.5 day inter-infection period, which means that cases can increase by roughly 10 times each week.  As noted, the results are not sensitive to these figures and much smaller values still lead to the same overall answer.
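The step from R and inter-infection period to weekly growth is simply R^(7/τ): R new cases per case, compounded every τ days.  A two-line check, using the values from the text (the 5.5 day figure is my assumed midpoint of the 5-6 day range mentioned earlier):

```python
def weekly_growth(r, inter_infection_days):
    """Week-on-week case multiplication: R new cases per case every tau days."""
    return r ** (7 / inter_infection_days)

print(weekly_growth(3, 3.5))  # 3^2 = 9.0, i.e. roughly tenfold per week
print(weekly_growth(2, 5.5))  # ~2.4x per week with the conservative values
```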

The main reason for using this scenario is that it felt relatively conservative to assume that students post lockdown might have rates similar to overall population before awareness of Covid precautions – they would be more careful in terms of their overall hygiene, but would also have the higher risk social situations associated with being a student.

I was a little surprised therefore that, on ‘More or Less’, Kit Yates suggested that this was an unreasonably high figure because the week-on-week growth had never been more than 5 times.  I did wonder whether I had misremembered the 10x figure from the early days of the crisis unfolding in February and March.

In fact, having rechecked the figures, they are as I remember.  I’ll refer to the data and graphs on the Wikipedia page for UK Covid data.  These use the official UK government data, but are visualised better than on Gov.UK.

UK Cases:  https://en.wikipedia.org/wiki/COVID-19_pandemic_in_the_United_Kingdom#New_cases_by_week_reported

I’m focusing on the early days of both sets of data.  Note that both new cases and deaths ‘lag’ behind actual infections, hence the peaks after lockdown had been imposed.  New cases at that point typically meant people showing serious enough symptoms to be admitted to hospital, so lag infection by say a week or more.  Deaths lag by around 2-3 weeks (indeed deaths more than 28 days after a positive test are not included, to avoid over-counting).

The two data sets are quite similar during the first month or so of the crisis, as at that point testing was only being done for very severe cases that were being identified as potential Covid.  So, let’s just look at the death figures (most reliable) in detail for the first few weeks, until the lockdown kicks in and the numbers peak.

week                 deaths   growth (rounded)
29 Feb – 6 March          1   —
7 – 13 March              8   ×8
14 – 20 March           181   ×22
21 – 27 March           978   ×5
28 March – 3 April     3346   ×3.5
4 – 10 April           6295   ×2

Note how there is an initial very fast growth, followed by pre-lockdown slowing as people became aware of the virus and started to take additional voluntary precautions, and then peaking due to lockdown.  The numbers for the initial fast phase are small, but this pattern reflects the early stages in Wuhan: initial doubling approximately every two days before the public became aware of the virus, followed by a slow down to around 3 day doubling, followed by lockdown.

Indeed in the early stages of the pandemic it was common to see country-vs-country graphs of early growth with straight lines for 2 and 3 day doubling drawn on log-log axes. Countries varied on where they started on this graph, but typically lay between the two lines.  The UK effectively started at the higher end and rapidly dropped to the lower one, before more dramatic reduction post-lockdown.
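Converting the week-on-week factors quoted above back into doubling times is a one-line calculation (7 / log2(weekly factor)), which makes the link to those 2 and 3 day doubling lines explicit:

```python
import math

def doubling_time_days(weekly_factor):
    """Days for cases to double, given week-on-week multiplication."""
    return 7 / math.log2(weekly_factor)

print(round(doubling_time_days(8), 1))  # x8 per week -> ~2.3 day doubling
print(round(doubling_time_days(5), 1))  # x5 per week -> ~3.0 day doubling
```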

It may be that Kit recalled the x5 figure (3 day doubling) as it was the figure once the case numbers became larger and hence more reliable.  However, there is also an additional reason, which I think might be why early growth was often underestimated.  In some of the first countries infected outside China the initial growth rate was closer to the 3 day doubling line.  However, this was before community infection, when cases were driven by international travellers from China.  These early international growth rates reflected post-public-precautions, but pre-lockdown, growth rates in China, not community transmission within the relevant countries.

This last point is educated guesswork, and the only reason I am aware of it is because early on a colleague asked me to look at data as he thought China might be underreporting cases due to the drop in growth rate there.  The international figures were the way it was possible to confirm the overall growth figures in China were reasonably accurate.

References

[AMS] Preparing for a challenging winter 2020-21. The Academy of Medical Sciences. 14th July 2020. https://acmedsci.ac.uk/policy/policy-projects/coronavirus-preparing-for-challenges-this-winter

[BBC1] Schools and coronavirus, test and trace, maths and reality. More or Less, BBC Radio 4. 2nd September 2020.  https://www.bbc.co.uk/programmes/m000m5j9

[BBC2] Coronavirus: ‘Critical moment’ as students return to university.  BBC News.  5 September 2020.  https://www.bbc.co.uk/news/uk-54040421

[BPG] Are we underestimating seroprevalence of SARS-CoV-2? Burgess Stephen, Ponsford Mark J, Gill Dipender. BMJ 2020; 370 :m3364  https://www.bmj.com/content/370/bmj.m3364

[DW] Would student social bubbles cut deaths from Covid-19?  Jim Dickinson on Wonkhe.  28 July 2020.  https://wonkhe.com/wonk-corner/would-student-social-bubbles-cut-deaths-from-covid-19/

[DW1] Could higher education ruin the UK’s Christmas?  Jim Dickinson on Wonkhe.  4 Sept 2020.  https://wonkhe.com/blogs/could-higher-education-ruin-the-uks-christmas/

[Dx1] Working paper: Covid-19 – Impact of a small number of large bubbles on University return. Working Paper, Alan Dix. created 10 July 2020. arXiv:2008.08147 stable version at arXiv |additional information

[Dx2] Why pandemics and climate change are hard to understand, and can we help?  Alan Dix. North Lab Talks, 22nd April 2020 and Why It Matters, 30 April 2020.  http://alandix.com/academic/talks/Covid-April-2020/

[Dx3] Covid-19, the impact of university return.  Alan Dix. 9th August 2020. https://alandix.com/blog/2020/08/09/covid-19-the-impact-of-university-return/

[Dx4] More than R – how we underestimate the impact of Covid-19 infection. Alan Dix.  2nd August 2020. https://alandix.com/blog/2020/08/02/more-than-r-how-we-underestimate-the-impact-of-covid-19-infection/

[LP] Why are US coronavirus deaths going down as covid-19 cases soar? Michael Le Page. New Scientist.  14 July 2020. https://www.newscientist.com/article/2248813-why-are-us-coronavirus-deaths-going-down-as-covid-19-cases-soar/

[MOH] Declining death rate from COVID-19 in hospitals in England. Mahon J, Oke J, Heneghan C. The Centre for Evidence-Based Medicine. June 24, 2020. https://www.cebm.net/covid-19/declining-death-rate-from-covid-19-in-hospitals-in-england/

[SAGE] Principles for managing SARS-CoV-2 transmission associated with higher education, 3 September 2020.  Task and Finish Group on Higher Education/Further Education. Scientific Advisory Group for Emergencies. 4 September 2020. https://www.gov.uk/government/publications/principles-for-managing-sars-cov-2-transmission-associated-with-higher-education-3-september-2020