Stephen Burch's Birding & Dragonfly Website
Covid-19 total deaths and fitted curves for different countries
Monthly summary (16 January 2022)
For the current Covid-19 pandemic, death statistics by country are widely available, with the Worldometers website being one of the most convenient sources.
There have also been many plots given in the media, including in the Guardian, the FT on-line, and elsewhere, of death rates from day of first death compared between countries (credited to Johns Hopkins University).
I thought I'd see if I could reproduce these for myself, using a different selection of countries. This was easy enough to do using Excel and the Worldometers data.
Deaths per head of population
As politicians were keen to highlight, it is difficult to make valid comparisons between countries, but an approximate way of doing this is to compare the death rates per head of population, which I show in the graph below. These are the figures for total deaths to date divided by the population for each country.
The curves above show that even now the totals are still comparatively small as a percentage of each country's population, with the highest (Brazil) now exceeding 0.29% (which corresponds to about 1 in 340). To put these figures into perspective, in the UK (and I presume elsewhere) the percentage of the population dying each year is just under 1% or 620,000 (2018 figure) in its population of 67 million.
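The per-head figures quoted above are simple ratios. A small worked check, using only the numbers given in the text (the function names are just for illustration):

```python
def percent_of_population(deaths, population):
    """Deaths expressed as a percentage of the population."""
    return 100.0 * deaths / population


def one_in_n(percent):
    """Convert a percentage into a '1 in N' figure."""
    return 100.0 / percent


# Figures quoted in the text above
uk_annual = percent_of_population(620_000, 67_000_000)
brazil_one_in = one_in_n(0.29)

print(f"UK deaths per year: {uk_annual:.2f}% of the population")
print(f"0.29% corresponds to about 1 in {brazil_one_in:.0f}")
```

This gives a little over 0.9% for the normal annual UK figure, and roughly 1 in 345 for a rate of 0.29%, in line with the rounded values quoted.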
As a percentage of its population, having led the field for some time, the UK is now in fourth place, having been "overtaken" by Italy, Brazil, and the USA.
For further comments, see the sections below on each country in turn.
In the early stages of the pandemic, many sources were showing plots of log(total deaths) vs linear time, as the initial expected exponential growth phase then appears linear. However, although we have seen many of these plots in the media, forward projections of these figures are much rarer.
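The straight line on a log plot follows directly from taking logarithms of exponential growth. Here N_0 (the starting level) and r (the growth rate per day) are symbols introduced only for this illustration:

```latex
N(t) = N_0\, e^{r t}
\quad\Longrightarrow\quad
\ln N(t) = \ln N_0 + r\,t
```

So the slope of the line on the log plot is the growth rate r, and any bending away from the line indicates that the growth is slowing.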
To project forward, an assumption is needed about how the death rates will change with time. There is a whole science of pandemic modelling about which I know very little. What I have seen involves a complex approach based on a large number of parameters and multiple differential equations. As I have no idea what assumptions are used in these models and how they work in practice, I've looked at a much simpler approach based entirely on the available data to date for death rates. I have ignored all the information on number of cases on the basis that these are entirely dependent on the amount of testing done, which at present is very limited in the UK at least.
Basis of approach
My approach is as follows. It seems to me that a plausible formula for the total deaths to a particular date is as follows:
where t is the time since first death in days. For small t this curve is linear, which is equivalent to the initial expected exponential increase. For large t, this curve eventually saturates at a value = D, where the total number of deaths in the pandemic = exp(D). The unknown parameters D and a are found by least squares fitting to the reported total deaths to date.
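As an illustration, one curve with exactly the properties just described is a Gompertz-type form for the logarithm of the total deaths. This is an assumed form, chosen only because it matches the description (linear for small t, saturating at D), and N(t) is a symbol introduced here for the total deaths to day t:

```latex
\begin{align*}
\ln N(t) &= D\left(1 - e^{-a t}\right) \\
\ln N(t) &\approx D\,a\,t \quad (a t \ll 1) \\
\ln N(t) &\to D, \qquad N(t) \to e^{D} \quad (t \to \infty)
\end{align*}
```

With this form N(0) = 1, which is consistent with measuring t from the day of the first death.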
A slight generalisation of this formula involves three unknown parameters instead of two:
where t0 allows an adjustment to the (uncertain) date of first death. The constant t0 also allows further waves to be added to the deaths from the first wave, and displaced in time relative to it.
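A minimal sketch of how several waves, each displaced by its own t0, could be added together. It assumes a Gompertz-type form ln N = D(1 - exp(-a(t - t0))) for each wave, which matches the description of the curve but is not necessarily the exact expression used. It also assumes each wave contributes nothing before its own t0. The parameter values are hypothetical and not fitted to any country.

```python
import math


def wave_total(t, D, a, t0):
    """Cumulative deaths for one wave under the ASSUMED form
    ln N = D * (1 - exp(-a * (t - t0))).
    Assumption: the wave contributes nothing before day t0."""
    if t < t0:
        return 0.0
    return math.exp(D * (1.0 - math.exp(-a * (t - t0))))


def wave_daily(t, D, a, t0):
    """New deaths on day t for one wave (difference of consecutive totals)."""
    return wave_total(t, D, a, t0) - wave_total(t - 1, D, a, t0)


def total_daily(t, waves):
    """Modelled daily deaths: the sum over all waves."""
    return sum(wave_daily(t, D, a, t0) for (D, a, t0) in waves)


# Hypothetical (D, a, t0) for two waves, for illustration only
waves = [(10.0, 0.06, 0.0), (10.5, 0.04, 200.0)]

for day in (50, 150, 250):
    print(day, round(total_daily(day, waves), 1))
```

Before day 200 only the first wave contributes. After that the second wave is simply added on top of the tail of the first.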
To fit the above curves to the available data for a country, I compare the modelled and actual reported new deaths for each day, and sum the squares of the differences between them. The Solver in Excel then allows the sum of the squares of the differences to be minimised by changing the unknown parameters D, a and t0 in the above formulae.
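The same least squares idea can be sketched outside Excel. In the example below a coarse grid search stands in for the Solver, and the model is an assumed Gompertz-type form, ln N = D(1 - exp(-a t)), with t0 fixed at zero to keep things short. The "reported" data are synthetic, generated from known parameters, so this only shows the mechanics of the fit.

```python
import math


def total_deaths(t, D, a):
    """ASSUMED model: ln N(t) = D * (1 - exp(-a * t)), t in days since first death."""
    return math.exp(D * (1.0 - math.exp(-a * t)))


def daily_deaths(t, D, a):
    """Modelled new deaths on day t (difference of consecutive totals)."""
    return total_deaths(t, D, a) - total_deaths(t - 1, D, a)


def sum_sq(params, observed):
    """Sum of squared differences between modelled and reported daily deaths.
    observed[i] is the reported figure for day i + 1."""
    D, a = params
    return sum((daily_deaths(i + 1, D, a) - y) ** 2
               for i, y in enumerate(observed))


def fit(observed, D_grid, a_grid):
    """Return the (D, a) pair on the grid with the smallest sum of squares."""
    candidates = ((D, a) for D in D_grid for a in a_grid)
    return min(candidates, key=lambda p: sum_sq(p, observed))


# Synthetic "reported" data from known parameters (not real figures)
true_D, true_a = 10.0, 0.05
observed = [daily_deaths(t, true_D, true_a) for t in range(1, 121)]

D_grid = [8.0 + 0.25 * i for i in range(17)]   # 8.0 to 12.0
a_grid = [0.01 * i for i in range(1, 11)]      # 0.01 to 0.10
best_D, best_a = fit(observed, D_grid, a_grid)
print(best_D, best_a, round(math.exp(best_D)))
```

A grid search is crude but easy to follow. A proper optimiser would be needed for real data, where the minimum will not sit exactly on a grid point.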
In the early stages of the pandemic in each country, the numbers of reported deaths tend to rise exponentially with time (which appears as a straight line on a logarithmic plot). In this case, it is impossible to derive with any confidence a value for the "curvature" parameter, a. Without that, any projections of the numbers into the future are meaningless. Even as the epidemic progresses, small changes in the value of the parameter a can have a huge effect on the modelled prediction for the total number of people that will die.
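To illustrate this sensitivity, suppose the curve has a Gompertz-type form, ln N = D(1 - exp(-a t)). This is an assumption that matches the description above, not necessarily the exact formula. Early on, the data then fix only the slope of the log plot, which is the product D × a. For a given slope, D = slope / a, so the projected final total exp(D) depends very strongly on a. The slope value below is hypothetical.

```python
import math


def implied_total(initial_slope, a):
    """Final total exp(D) when only the early log-slope D * a is known.
    Relies on the ASSUMED form ln N = D * (1 - exp(-a * t))."""
    D = initial_slope / a
    return math.exp(D)


slope = 0.3  # hypothetical early growth rate of ln(deaths) per day
for a in (0.030, 0.028, 0.025):
    print(f"a = {a:.3f}  ->  projected total ~ {implied_total(slope, a):,.0f}")
```

Here reducing a by about a sixth, from 0.030 to 0.025, raises the projected total by a factor of about seven.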
It is important to note that I am not claiming any accuracy for these predictions. This approach is just one way of estimating future trends based on available data.
However, for the first wave in China, a study, given at the bottom of this page, showed that this approach was giving estimates of total deaths within a factor of two of the final total, even at the early stage of the pandemic. This has not necessarily been the case with all the other countries featured here. Nevertheless, it is often remarkable how closely the observed deaths follow the fitted curves, which have only 3 adjustable parameters.
I first give here the results for the UK, followed by those for other selected countries. In all cases I have used the Worldometer data for total deaths to date, as a function of time.
For the UK, these are the daily figures from the DHSC that appear widely in the media. In these figures, for a death to be attributed to Covid-19, there must now have been a positive Covid-19 test within the last 28 days. These figures hence do not include deaths from Covid-19 where a test had not been performed, which may have occurred most often in care homes and the community. Conversely, any death, from whatever cause, within 28 days of a positive Covid-19 test is now included, presumably including obviously unrelated deaths such as car accidents.
The plot below shows the Worldometer data for daily deaths. It also shows the results of fitting curves of the form given in the second equation above to each wave. As the daily figures show considerable fluctuations, with lower values at weekends, I am now following many others in showing values averaged over a week (i.e. a rolling average).
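A rolling weekly average can be computed along these lines. This sketch uses a trailing window (the current day plus the six before it), which is an assumption, as a centred window is equally common. The daily figures are made up, with a deliberate weekend dip.

```python
def rolling_mean(values, window=7):
    """Trailing rolling average: each output is the mean of the current day
    and the (window - 1) days before it. Days without a full window are skipped."""
    out = []
    for i in range(window - 1, len(values)):
        out.append(sum(values[i - window + 1:i + 1]) / window)
    return out


# Made-up daily figures with a weekend dip, for illustration only
daily = [100, 110, 105, 108, 102, 60, 55] * 3
smoothed = rolling_mean(daily)
print([round(s, 1) for s in smoothed])
```

Because every 7-day window contains exactly one weekend, the weekly reporting pattern cancels out and only the underlying trend remains.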
In the plot below I show the results from models for different waves. The total is the sum of all waves and is a remarkably close fit to the actual values.
In the summer of 2021, the increases in cases due to the new Delta variant were followed, from about day 480, by a modest rise in daily deaths. Since around day 570 there have been some fluctuations around the fitted curve, but the overall trend still follows the curve quite closely. It is notable that the predicted total of over 30,000 deaths in this wave is only slightly less than those in the other peaks, which had much higher maximum daily death rates but were also much less prolonged.
The new Omicron variant is now causing a steep increase in the daily deaths since about day 660. The fitted curve is very uncertain in the current early stages of this wave, but it suggests it will be much shorter lived than the broad peak caused by Delta. The fitted curve suggests a modest total number of Omicron related deaths of less than 10,000, but these are in addition to the much longer lived Delta fuelled wave.
The plot below shows the daily deaths for Spain. After a strong first wave, Spain then experienced a smaller more gradual second wave, which was closely followed by a third wave (which peaked around day 340). The decline in the third wave deaths was briefly reversed to form a very short lived fourth peak around day 390 (not modelled). There was then another decline followed by a slow and moderate increase which peaked around day 550 (again not modelled) and then subsided.
Spain is now also experiencing a further rise in deaths due to the Omicron variant, not currently modelled.
The plot below shows the daily deaths for Italy together with the fitted curve. The curve is a pretty good fit to the actual deaths. Back in March, the pandemic gripped Italy a few weeks before other countries, and the deaths rose very sharply before falling back quite quickly, possibly due to the strict lockdown measures imposed. Deaths then stayed low, until about day 220 when an increase was first apparent.
The daily deaths in this second wave reached a very similar peak to the maximum in the first wave and a decline then began. The model 2 fitted to these values gives a total number of deaths at about 50,000 which is about 40% higher than the first wave total. Around day 305 the decrease in the deaths abruptly halted and a small further increase occurred which has now peaked and started dropping off again. To account for this, I have now added a third wave into the modelling. For Italy this third wave is clearly evident but is weaker than in other countries such as France and the UK. The sum of all three modelled waves gives an estimate of 100,000 for the total deaths.
A fourth wave started around day 370 which is included in the modelling. The peak on this wave occurred around day 410. A decline then took place and deaths reached a low level at around day 510.
There is now a new strong wave, which started around day 600. It is presumably caused by the Omicron variant and the fitted curve is showing an unconstrained almost exponential rise with no plausible estimate yet of the total number of deaths likely.
For France, the modelling now has four curves which gave a good fit to the daily deaths up to about day 520. The relatively small upturn which started around day 520 has not been modelled. After a small decline, the rates are now rising again more steeply, due to Omicron, in a very similar way to neighbouring Italy. I have not yet modelled this latest wave, which is the sixth for France.
As can be seen below, the USA numbers now show four distinct waves of the pandemic. After the first wave, and having been in decline for some time, the daily figures started increasing again around day 120. A second wave then occurred which appeared to peak around day 170. A slow second decline then took place before the latest upturn in deaths from about day 240.
I have modelled the USA figures using the sum of four different curves fitted to the daily death figures. The first curve is intended to match (approximately) the initial stages of the epidemic while the second curve is to account for the rise in deaths after around day 120. The third wave modelling is to account for the large increase in deaths from about day 240 onwards, with the fourth wave starting around day 500.
Since about day 500, the daily deaths which were at a low level increased again rapidly, following a large upturn in case numbers presumably due to the Delta variant. Of the countries featured here, this was the largest increase due to the Delta variant. It is being attributed mostly to infections in those who have not been vaccinated. The daily deaths peaked at about 2000/day and then went into decline. The total deaths in this wave is now modelled to be about 240,000.
From about day 640, a further surge in deaths started, this time due to Omicron. As this is still in its early stages, I have not yet modelled it.
Unlike all the other countries featured on this page, Sweden has not had a major lockdown. Hence a comparison with other countries that have much higher levels of restriction is interesting. Sweden has a significantly smaller population (about 10 million) and a lower population density than any of the other countries given here. Sweden has experienced the familiar pattern of a first wave starting in March, followed by a decline and then an autumn 2020 second wave.
The modelling of this second wave is now shown below within the total modelled deaths curve (I have not shown the separate model 1 and model 2 components). The second wave peaked around day 300, and since then there has been a strong decline, with a couple of minor upturns. Since around day 640 there has been another upturn, presumably due to Omicron (not yet modelled).
In Germany, the approach was completely different from Sweden with a huge amount of testing and follow-up of those infected, combined with a major lockdown. At present, the fitted curve has a final total death toll of about 9,000, in stark contrast to the UK, Italy, Spain and France. The fitted curve shows the peak in the daily numbers was reached on about day 35 and then tailed off well, despite some easing of lockdown rules.
The second wave deaths have dwarfed those in the first wave, although of course the first wave peak in Germany was comparatively very low. The second wave deaths peaked around day 300 and a strong decline followed. This then slowed and abruptly rose somewhat around day 400. This mini third wave (not modelled) has now also subsided and deaths were at a low level until about day 520. Since then there has been an erratic upturn, which is now rising sharply, following a very rapid increase in cases.
This latest wave, which may be mainly due to Delta variant, peaked around day 640 with the modelled total deaths in this wave now being around 30,000. Another wave may occur soon due to Omicron.
For Brazil, it is clear that the modelled curves do not fit the actual deaths as well as in most of the other countries shown here. In the first wave, following the initial rise, there was an unusual plateau, with almost constant values from day 80 to about day 160. There was then a progressive decrease until around day 240. In the days since then, a second wave has developed despite it not being autumn in this southern hemisphere country.
The modelling is now based on two separate curves for the first and second waves respectively. The actual second wave deaths increased strongly for many weeks but then abruptly declined. The data for this country show large fluctuations and it is unclear how reliable the figures are.
Very recently the deaths have shown a small upturn, possibly due to either of the Delta or Omicron variants.
For China the present phase of the pandemic is apparently over. It can therefore be used as a useful test case for the modelling approach described above. The plot below shows the available data for China up to 50 days since the first death. Note immediately how closely the 2-parameter equation given above fits the measurements, over almost the entire course of the pandemic. In this country at least, the equation seems to be more than plausible - it provides a good fit to the entire pandemic.
I have then tested how accurately the modelling approach predicts the final death toll, based on different amounts of the available data. The results are shown below. This graph shows how the predicted total deaths vary depending on the date at which the modelling is performed. All predictions were within a factor of two of the final value. This is unlike my experience with any of the other countries modelled to date. In all these cases, the predictions for the final death toll have been varying by large amounts, indicating the approach is not providing meaningful values.
It is however surely a coincidence that the prediction, performed on only the first six days of data (I excluded the values below 4 days from first death), is almost spot on the final figure.
© All plots copyright Stephen Burch