

As many parts of the United Kingdom begin to lift lockdown restrictions, it is crucial for people to be able to estimate the risks involved with resuming activities, particularly those involving mixing with other people, and make decisions based upon the available evidence. This COVID-19 Risk Assessment Planning tool can be used to explore the risk that at least one person at an event of a certain size is currently infected with COVID-19, given a certain number of circulating infections in the specified region.

These risk calculations tell you only how likely it is that at least one person at any event of a given size is infectious. This is not the same as the risk of any person being exposed to or infected with COVID-19 at the event.

The number of circulating cases (people who are currently infectious) is defined as the number of cases reported in the past ten (10) days. This duration is consistent with the isolation period imposed by the UK government on people returning from foreign countries and with the isolation instructions given to people who test positive for coronavirus.

Under-reporting is corrected for by multiplying the reported figure by an ascertainment bias. UK Government Coronavirus (COVID-19) Infection Survey pilot data shows that there are many more cases circulating in the community than are reported through testing. For example, for the week 4th – 10th September the survey estimated that there were between 46,900 and 75,200 cases in the community, whereas the official test results for that period report 19,193 cases. The estimated number of cases in the community is therefore somewhere between 2.44 and 3.92 times the number of recorded cases.

In addition, published data is constantly revised, primarily because of delays in the reporting loop. For example, the data for England issued in late August shows a total of 15,346 cases for the period 2nd August to 21st August; data issued in late September records a total of 18,092 cases for the same period, an increase of nearly 18%. Cases may be under-reported for a number of reasons: testing shortages, asymptomatic “silent spreaders”, lags in informing people of their test results, and reporting lags. There is also significant evidence of a shortfall in testing capacity (Sept 2020), with a large number of people unable to get access to a test when required. It is therefore not unreasonable to assume that the number of active cases in the community could be somewhere between 2 and 5 times the reported number of cases.

The population data is also somewhat imperfect; the last census was performed in 2011 and values available from the UK Government are always identified as estimates. The graph therefore includes 3 horizontal lines: one based on the number of cases reported in the last 10 days (circle markers), a 2nd line (diamond markers) with an ascertainment bias of 2, and a 3rd line (triangle markers) with an ascertainment bias of 5.
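
As a rough illustration of the arithmetic above, the following Python sketch reproduces the survey-to-reported ratio from the figures quoted and derives the three incidence levels (reported, bias of 2, bias of 5) from a 10 day window of daily case counts. The function name `circulating_cases` and the daily counts are hypothetical, invented purely for illustration; this is not the website's actual code.

```python
def circulating_cases(daily_reported, bias=1.0, window=10):
    """Estimate the number of currently infectious people.

    Sums the most recent `window` days of reported cases and scales
    the total by the ascertainment bias (1 = no correction).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = daily_reported[-window:]
    return bias * sum(recent)


# Ratio of survey-estimated to reported cases, using the figures quoted in the text.
reported = 19_193
survey_low, survey_high = 46_900, 75_200
ratio_low = survey_low / reported    # roughly 2.44
ratio_high = survey_high / reported  # roughly 3.92

# Hypothetical daily case counts (made up for illustration only).
daily = [100, 120, 90, 110, 130, 150, 140, 160, 170, 180, 200, 210]

# One incidence level per line on the graph: reported, bias 2, bias 5.
levels = {bias: circulating_cases(daily, bias) for bias in (1, 2, 5)}
```

Only the last 10 entries of `daily` contribute, so the two oldest counts are ignored in the same way that cases older than 10 days drop out of the tool's definition of circulating cases.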

The tool generates 2 different types of graphs:

a) Initial load view

This graph plots the number of event attendees (horizontal x-axis) against the risk of an infected attendee being present (vertical y-axis); please note that the horizontal axis is on a logarithmic scale. The graph is overlaid with curved lines for the levels of reported incidence taken from the latest real-time COVID-19 surveillance data: reported cases over a 10 day period (circle), an ascertainment bias of 2 times the current incidence (diamond), and an ascertainment bias of 5 times the current incidence (triangle). From the graph it can be seen that as the size of the event and the number of infections in the community increase then, unsurprisingly, the % chance of an infected person attending that event also increases. You can mouse over the graph to see the exact data relating to any position on the graph, including the level of infection if it is below 250:1.

b) The alternative view, based upon the Georgia Institute of Technology website graphs

Please note that both axes are on a logarithmic scale. The diagonal lines divide the chart into risk levels. For example, all scenarios between the green and purple lines involve a 20 – 50% risk that someone with COVID-19 is present at the event. The grey region in the bottom left hand corner indicates scenarios with a less than 1% chance that someone with COVID-19 is present. The area in the top right hand corner, above the red line, is where the probability exceeds 90%. The graph provides exact values for a few pre-set scenarios, highlighted by the red circular, diamond and triangular markers where the horizontal infected-cases lines cross the vertical lines defined by the number of attendees. Mouse over the markers to see the exact values of probability for any combination of event size and number of cases of infection.

The horizontal dotted lines with indicated risk estimates are based on the latest real-time COVID-19 surveillance data. They represent the current reported incidence: reported cases over a 10 day period (circle), an ascertainment bias of 2 times the current incidence (diamond), and an ascertainment bias of 5 times the current incidence (triangle). The original model developed by the Georgia Institute of Technology used ascertainment biases of 5 and 10 for the US, but as the number of tests in the UK has increased it is clear that these biases are not applicable to the UK. Data from the Covid Symptom Study App, which currently has over 4.4m contributors (early Dec 2020), produces similar estimates to the UK Government study mentioned earlier and suggests that a much lower ascertainment bias of between 2 and 5 is probably closer to reality for the UK. However, the Covid Symptom Study App only covers symptomatic cases, and a study reported by Imperial College London in June 2020 suggests that 40% of infections are asymptomatic, which possibly points to a slightly higher upper bias level. A report from the Nuffield Trust dated 09-Dec-20 indicates similar numbers to this tool for the middle weeks of November 2020, although it uses weekly data rather than data accumulated over 10 day periods. That does not mean that the results should be scaled by 7/10: people are infected and contagious for 10 days, i.e. longer than the weekly sampling period, so they roll over from one sampling period to the next until they are no longer considered infected. Random checks on the data provided by the C-19 by ZOE app (throughout 2021) show that the ascertainment bias has regularly sat in the range 4 to 5.

Note that the colours of the lines and markers on the website are now (Oct-21) different, to ensure that they can be easily differentiated by people with colour blindness. The circles are now blue, the diamonds yellow and the triangles red, as recommended by Paul Tol’s Notes.

Notes on Usage and Interpretation:

All of the calculations are necessarily estimates based on imperfect data, and it is not possible to determine the probability that someone at an event will get infected. It’s important to remember that a certain amount of chance is involved in these outcomes. Large event planners are encouraged to exercise caution in the coming months, especially given the potential for one infected person to transmit the virus to many others in a single super-spreading event (Cheltenham Gold Cup, European Champions League showpiece between Liverpool and Atletico Madrid, England v Wales rugby international at Twickenham on 7 March, Stereophonics concerts in Cardiff, the Crufts Dog Show, the All England Badminton Tournament in Birmingham, and many others), and given that the virus has a reported transmission rate of between 0.5 and 6; see Wikipedia: Basic Reproduction Number.

As a final note, there is a high risk of being exposed to COVID-19 in many parts of the UK right now (late 2021, just as in late 2020 / early 2021). You can reduce your risk of getting infected or infecting someone else by practising social distancing, wearing masks when out of your home, washing your hands regularly, and staying home when you feel sick. Learn more about how to minimise your individual risk at NHS: How to avoid spreading coronavirus to people you live with. There is some controversy over mask wearing, but these reports from reliable sources support it: The Guardian Newspaper website, www.nature.com, US Centers for Disease Control and Prevention, and www.nature.com.

The Maths:

The following is an anglicised version of text taken from the Georgia Institute of Technology website to make the numbers and events more relevant to the people living in the United Kingdom.

What is the chance that one person at an event will be infected with COVID-19? To answer this kind of question, the tool attacks the problem by calculating the opposite. Take the case of a football match, the European Champions League showpiece between Liverpool and Atletico Madrid, which 54,000 people attended in March 2020: what is the chance that none of the 54,000 attendees were infected?

Let’s start by thinking about just one of them. If, for example, 15,000 of the 66 million people in the United Kingdom are sick, then the odds against being sick are 4,400:1 (66,000,000/15,000). This equates to each person having a 99.977% chance of being disease-free. In betting terms, odds of 4,400:1 in your favour sound pretty good from an individual perspective; however, the collective risk is very different. In the scenario of 25,000 fans attending a football match, the probability that all 25,000 attendees entered the stadium disease-free is like placing 25,000 bets, each at nearly certain odds. Sure, you’ll win most of the bets, but the probability that you will win every single one of them is extremely low. To calculate it, we multiply the winning probability (1 – 1/4400) by itself 25,000 times and find that there is approximately a 0.341% chance that you win every time. In other words, the chance that one or more attendees arrived at the event infected with SARS-CoV-2 is 99.659% (100 – 0.341). Increase the size of the crowd to 54,000 and the chance increases to 99.9995%.

Increasing the number of sick people in the country to 20,000 changes the individual odds to 3,300:1, and the collective chance of an infected person attending in a crowd of 54,000 increases to 99.999992%.
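
The worked figures above can be checked with a few lines of Python. This is a minimal sketch, assuming the simple model described in the text (every attendee is independently infected with probability infected/population); the function name `event_risk` is a hypothetical name chosen for this example and is not taken from the website's code.

```python
import math


def event_risk(infected, population, attendees):
    """Probability that at least one of `attendees` people is infected.

    Assumes 0 <= infected < population and that attendees are drawn
    independently from the population.
    """
    p_infected = infected / population
    # 1 - (1 - p)^n, written with log1p/expm1 so that precision is kept
    # when p is tiny and the result is very close to 1.
    return -math.expm1(attendees * math.log1p(-p_infected))


UK_POPULATION = 66_000_000  # population figure used in the text

# The three scenarios discussed in the text.
for infected, attendees in [(15_000, 25_000), (15_000, 54_000), (20_000, 54_000)]:
    risk = event_risk(infected, UK_POPULATION, attendees)
    print(f"{infected:>6,} infected, {attendees:>6,} attendees: {risk:.6%}")
```

The three scenarios come out at roughly 99.66%, 99.9995% and 99.999992%, in line with the figures quoted above. The `log1p`/`expm1` form is a numerical nicety rather than a change to the method; computing `1 - (1 - p) ** n` directly gives the same answers to the precision shown.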

The Maths (part 2):

For those of you more interested in the Maths, the following equations apply:

$$R = 1 - (1 - P_I)^n \qquad (1)$$

$R$ = risk

$P_I$ = probability of infection = $I / \mathrm{popn}$

$I$ = incidence level of infection, number of infected people

$n$ = number of attendees at an event

$\mathrm{popn}$ = population of country or region of interest

In order to plot the graph on the home page it is necessary to transpose eq.(1) to derive an equation for the incidence level in terms of risk.

$$R = 1 - (1 - P_I)^n$$

$$(1 - P_I)^n = 1 - R$$

$$1 - P_I = (1 - R)^{1/n}$$

$$1 - (1 - R)^{1/n} = P_I$$

As $P_I = I / \mathrm{popn}$

$$I = \left(1 - (1 - R)^{1/n}\right) \times \mathrm{popn} \qquad (2)$$

Using eq.(2) with different risk (R) values (0.01 to 0.90), it is possible to plot the diagonal risk lines seen on the above graph. Using eq.(1), it is possible to derive the risk levels associated with the indicated incidence levels at different ascertainment values (2, 3, 5, 10).
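
The diagonal risk lines can be sketched in Python by evaluating eq.(2) over a range of event sizes. The function names, the event sizes and the particular risk levels below are assumptions made for illustration (the risk levels are picked from those mentioned in the text); the website's own plotting code is not shown here and may differ.

```python
def incidence_for_risk(risk, attendees, population):
    """Eq.(2): number of infected people that gives `risk` at an event."""
    return (1.0 - (1.0 - risk) ** (1.0 / attendees)) * population


def event_risk(infected, population, attendees):
    """Eq.(1): risk that at least one attendee is infected."""
    return 1.0 - (1.0 - infected / population) ** attendees


POPULATION = 66_000_000                          # figure used in the text
RISK_LEVELS = [0.01, 0.20, 0.50, 0.90]           # illustrative subset of 0.01 to 0.90
EVENT_SIZES = [10, 100, 1_000, 10_000, 100_000]  # hypothetical x-axis sample points

# One list of incidence values per risk level; plotted against EVENT_SIZES
# on log-log axes, each list traces one diagonal line.
risk_lines = {
    risk: [incidence_for_risk(risk, n, POPULATION) for n in EVENT_SIZES]
    for risk in RISK_LEVELS
}

# Feeding a point on a line back into eq.(1) should recover the risk level.
round_trip = event_risk(risk_lines[0.50][2], POPULATION, EVENT_SIZES[2])
```

Each line falls as the event size grows: the larger the event, the fewer circulating infections are needed to reach the same risk level.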

Website History (change log):

Apart from updating the database nearly every day, the website has undergone the following changes and developments since its launch just prior to Christmas 2020. (Note that from the beginning of Mar-22, Covid-19 case data is no longer available from the UK Gov website at weekends.)

  • Jan-21: Introduction of plot of % chance against number of event attendees (the graph with the curved lines), making it the initial graph displayed instead of the plot with diagonal and horizontal lines mirroring that from the Georgia Institute of Technology website
  • Feb-21: Introduction of Timeline plot
  • Feb-21: Introduction of My Event page
  • Jun-21: Introduction of postcode search to location identification
  • Aug-21: Addition of top 10 cases locations to Home page and Timeline page
  • Sep-21: Addition of display of time and date of last database update
  • Sep-21: Addition of ability to email website link for a specific location to somebody
  • Oct-21: Change of palette of colours used on graphs to one that is colour blind friendly – see Paul Tol’s Notes using his High-contrast qualitative colour scheme.
  • Dec-21: Addition of lowest 10 cases locations to Home page and Timeline page
  • Dec-21: Addition of extended timeline plot to Timeline page showing changes in risk since May-20 overlaid with cases per 100,000
  • Dec-21: Addition of “1 Week ago” button to the home page due to the virulence of the Omicron variant
  • Dec-21: RH scale on Timeline plot doubled from 2,400 to 4,800 to allow the surge of Omicron cases to be correctly displayed on the graph (and increased again to 6,000 in mid Jan-22 to cope with the surge in Omicron cases in NI: Derry City and Strabane).
  • Jan-22: “1 week ago”…”2 months ago” buttons on home page replaced with 2 month slider to allow people to see the surge of Omicron cases over the Christmas / New Year period 2021/22 more accurately.