Stephen Burch's Birding & Dragonfly Website
Covid-19 total deaths and fitted curves for different countries
Last updated: 29 November
For some of the Western European countries, the first signs of a downturn in second wave deaths, first reported last week, have continued. By this I mean the fitted second wave curves are no longer showing purely exponential increases and consequently the predicted total numbers of deaths in these second waves are no longer implausibly high. Unconstrained exponential increases are now only being shown by the USA's third wave and by Germany's second wave.
For the current Covid-19 pandemic, death statistics by country are widely available, with the Worldometers website being one of the most convenient sources.
There have also been many plots given in the media, including in the Guardian, the FT online, and elsewhere, of death rates from day of first death compared between countries (credited to Johns Hopkins University).
I thought I'd see if I could reproduce these for myself, using a different selection of countries. This was easy enough to do using Excel and the Worldometers data.
As politicians are keen to highlight, it is difficult to make valid comparisons between countries, but it seems that an approximate way of doing this is to compare the death rates per head of population, which I show in the graph below. These are the figures for total deaths to date divided by the population for each country.
The curves above show that even now the totals are comparatively small, as a percentage of each country's population, with the highest approaching 0.1% (which corresponds to about 1 in 1000). To put these figures into perspective, in the UK the percentage of the population dying each year is just under 1% or 620,000 (2018 figure) in its population of 67 M.
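The arithmetic behind these percentages can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the only numbers used are the UK figures quoted above.

```python
def percent_of_population(deaths, population):
    """Return deaths as a percentage of the population."""
    return 100.0 * deaths / population

# UK deaths from all causes (2018 figure) in a population of 67 M
uk_annual = percent_of_population(620_000, 67_000_000)
print(f"UK annual deaths: {uk_annual:.2f}% of population")

# A total of 0.1% of the population corresponds to about 1 person in 1000
one_in = 100.0 / 0.1
print(f"0.1% is 1 in {one_in:.0f}")
```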
As a percentage of population, the total deaths in the UK are now below only Spain and Italy, and are higher than those in the other countries shown here. Regrettably the recent increase in cases is now leading to a significant upturn in deaths as well.
Second waves are now evident for all the European countries shown here. Remarkably however, and almost simultaneously in all countries, in the last two weeks the rates of increases in the deaths in these second waves have stopped rising exponentially and are showing signs of leveling off. In the last week this trend has been confirmed.
In Spain, the death rate per head of population continues to be the highest of those countries featured here. After the first wave, daily deaths were low for a considerable time, but then the recent increases in new cases fed through to the death rate which is now increasing significantly as well.
In Italy, there has been the familiar pattern of a first wave in the spring, followed by a steep reduction in the summer. The second wave is now well developed and may be starting to peak.
The USA has by far the highest number of Covid-19 deaths, but when expressed as a percentage of its (large) population, it currently amounts to just under 0.08%. This is continuing to rise appreciably (more on which below) and the percentage deaths in the USA are now higher than in any of the other countries shown here apart from Spain, Brazil and, just this week, the UK. Uniquely, the USA is now in the grip of an accelerating third wave of deaths.
Sweden, which hasn't had a strict lockdown, is currently showing a total of 0.06%, and its deaths had leveled off. However here there are now signs of the second wave sweeping through most other European countries.
India has a huge population so although the deaths have now exceeded 100,000 (the third highest in the world, behind the USA and Brazil), as a percentage of the population the figure is still low (<0.01%).
Brazil now has the second highest number of total deaths in the world and is the second highest in terms of a percentage of its population, just below Spain.
In the early stages of the pandemic, many sources were showing plots of log(total deaths) vs linear time as the initial expected exponential growth phase then appears linear. However although we have seen many of these plots in the media, forward projections of these figures are much rarer.
To project forward, an assumption is needed about how the death rates will change with time. There is a whole science of pandemic modelling about which I know very little. What I have seen involves a complex approach based on a large number of parameters and multiple differential equations. As I have no idea what assumptions are used in these models and how they work in practice, I've looked at a much simpler approach based entirely on the available data to date for death rates. I have ignored all the information on number of cases on the basis that these are entirely dependent on the amount of testing done, which at present is very limited in the UK at least.
Basis of approach
My approach is as follows. It seems to me that a plausible formula for the total deaths to a particular date is:
where t is the time since first death in days. For small t this curve is linear, which is equivalent to the initial expected exponential increase. For large t, this curve eventually saturates at a value = D, where the total number of deaths in the pandemic = exp(D). The unknown parameters D and a are found by least squares fitting to the reported total deaths to date.
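The formula itself is not reproduced in this text. As a hedged illustration only, one form that matches the description (log of total deaths linear for small t, saturating at D for large t) is ln N(t) = D(1 - exp(-a t)). This is an assumption made for the sketch below, not necessarily the equation actually used, and the function names and parameter values are hypothetical.

```python
import math

def log_total_deaths(t, D, a):
    """Assumed form: ln N(t) = D * (1 - exp(-a * t)), t in days since first death."""
    return D * (1.0 - math.exp(-a * t))

def total_deaths(t, D, a):
    """Modelled cumulative deaths N(t) = exp(ln N(t))."""
    return math.exp(log_total_deaths(t, D, a))

# Illustrative parameter values only (not fitted to any country)
D, a = 10.0, 0.05

print(total_deaths(0, D, a))       # a single death at t = 0 by construction
print(log_total_deaths(1, D, a))   # close to D * a while t is small
print(total_deaths(1000, D, a))    # approaches exp(D) for large t
```

With this assumed form, the slope of the log curve at small t is D times a, so the early growth is exponential, and the final toll exp(D) depends very strongly on the fitted value of D.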
A slight generalisation of this formula involves three unknown parameters instead of two:
where t0 allows an adjustment to the (uncertain) date of first death.
After having tried these two equations, I have found they do not seem to fit the reported death tolls that closely when the numbers start to drop off, probably due to the belated effects of the lockdowns that have been in force in many countries for some time now. Hence I have tried a further elaboration of the above which adds a t squared parameter as well:
I am now using whichever of the above formulae appears to best fit the reported total deaths for each country.
There is a further complication in that there are two different ways of fitting the above curves to the available data for a country. The first is to look at the differences between the modelled and reported total deaths to date, and then square and sum these differences. The Solver in Excel then allows the sum of the squares of the differences to be minimised by changing the unknown parameters D, a, t0 and b (if used) in the above formulae. Alternatively, the modelled and actual reported new deaths for each day can be compared, and again the sum of the squares of the differences found.
The weakness of this modelling approach is that these two methods of fitting the same equations to the data produce significantly different results. I think this is because the first method, which uses the logarithmic approach, gives almost equal weight to the early small numbers of deaths and the later higher numbers. However the second method is looking at the (linear) deaths per day - this gives much more weight to the larger values occurring later in the epidemic.
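The two fitting methods can be compared in a small Python sketch. Several things here are assumptions rather than the method actually used: the saturating form ln N(t) = D(1 - exp(-a t)), a crude grid search standing in for the Excel Solver, and made-up "reported" totals in place of real data. All function names are hypothetical.

```python
import math

def model_log_total(t, D, a):
    # Assumed saturating form (hypothetical): ln N(t) = D * (1 - exp(-a * t))
    return D * (1.0 - math.exp(-a * t))

def sse_log_totals(params, days, totals):
    """Method 1: sum of squared differences in log(total deaths)."""
    D, a = params
    return sum((model_log_total(t, D, a) - math.log(n)) ** 2
               for t, n in zip(days, totals))

def sse_daily(params, days, totals):
    """Method 2: sum of squared differences in new deaths per day (linear scale)."""
    D, a = params
    model = [math.exp(model_log_total(t, D, a)) for t in days]
    return sum(((model[i] - model[i - 1]) - (totals[i] - totals[i - 1])) ** 2
               for i in range(1, len(days)))

def grid_fit(objective, days, totals):
    """Crude stand-in for the Solver: try every (D, a) pair on a fixed grid."""
    grid = [(D / 10.0, a / 1000.0)
            for D in range(80, 121) for a in range(10, 101)]
    return min(grid, key=lambda p: objective(p, days, totals))

# Made-up totals: the assumed model plus a deterministic wobble (not real data)
days = list(range(60))
totals = [math.exp(model_log_total(t, 10.0, 0.05)) * (1.0 + 0.05 * math.sin(t))
          for t in days]

fit_log = grid_fit(sse_log_totals, days, totals)
fit_daily = grid_fit(sse_daily, days, totals)
print("fit on log totals  (D, a):", fit_log)
print("fit on daily deaths (D, a):", fit_daily)
```

Each fit is the best on the grid for its own objective, so any difference between the two printed parameter pairs comes purely from how the two objectives weight the early and late parts of the data.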
Also in the early stages of the epidemic the numbers of reported deaths tend to rise exponentially with time (which appears as a straight line on a logarithmic plot). In this case, it is impossible to derive with any confidence a value for the "curvature" parameter, a. Without that, any projections of the numbers into the future are very uncertain. Even as the epidemic progresses, small changes in the value of the parameter a can have a huge effect on the modelled number for the total number of people that will die in the epidemic.
It is important to note that I am not claiming any accuracy for these predictions. This approach is just one way of estimating future trends based on available data.
However, in China it appears the present phase of the pandemic is over. At the bottom of this page there is a study which shows that this approach was giving estimates of total deaths within a factor of two of the final total, even at the early stage of the pandemic.
I first give here the results for the UK, followed by those for other selected countries. In all cases I have used the Worldometer data for total deaths to date, as a function of time.
For the UK, these are the daily figures from the DHSC that appear widely in the media. In these figures, for a death to be attributed to Covid-19, there must now have been a positive Covid-19 test within the last 28 days. These figures hence do not include deaths from Covid-19 where a test had not been performed, which may have occurred most often in care homes and the community. Conversely, any death, from whatever cause, within 28 days of a positive Covid-19 test is now included, including presumably obviously unrelated deaths such as car accidents.
The plot below shows the Worldometer data for the UK in terms of deaths to date (left logarithmic axis) and daily deaths (right linear axis). It also shows the results of fitting the second equation given above to the daily death values. As the daily figures show considerable fluctuations, with lower values at weekends, I am now following many others in showing values averaged over the previous week (i.e. a rolling average).
On the plot below, also shown are the separate weekly registered death figures reported by the ONS (England) added to the numbers from the corresponding separate bodies for Scotland and Northern Ireland. These are all deaths where Covid-19 is mentioned on the death certificate, and are counted by date of registration of the death. These numbers are higher than those reported by the DHSC, which only counts those with positive Covid-19 tests. The registered death figures are also accompanied by information on all deaths and how they compare with the long-term average. These show a significant number of unaccounted for excess deaths where Covid-19 is not on the death certificate. These could be from other causes, e.g. heart attacks, cancers and strokes, which may have increased due to the reallocation of NHS resources to handling Covid-19, to the detriment of other forms of care. Alternatively, or additionally, there may well have been some deaths caused by Covid-19 but not mentioned on the death certificate. On 14 June, there had been about 65,000 excess deaths in the UK since the start of March 2020 - substantially higher than the 42,000 Covid-19 deaths reported by DHSC on that date. Since then the total number of deaths has been slightly lower than the 5-year average, so the excess deaths are declining slightly. This all goes to highlight, even today, the difficulty in being sure of how many people have died during a pandemic.
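A rolling average over the previous week can be sketched as follows. This is a minimal Python illustration with a hypothetical function name and made-up daily counts, not the spreadsheet formula actually used for the plots.

```python
def rolling_mean(values, window=7):
    """Average each value with up to `window - 1` values preceding it."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Made-up daily counts with a dip at weekends (not real data)
daily = [100, 110, 120, 115, 105, 60, 50] * 2
smoothed = rolling_mean(daily)
print(smoothed[6:])
```

Once a full week of values is available, the weekend dips in this made-up series are averaged out completely, because every 7-day window contains exactly one of each weekday.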
Note added 9 August: I have now stopped updating the ONS figures on the plot below, as they take a considerable time to work out. This is because the information, all in different formats, needs to be downloaded separately for England, Scotland and Northern Ireland and then added together!
In the UK, the first wave was met by a national lockdown that appeared to be sufficient to bring down cases and death rates after about 2 months. These measures were then eased during the summer, only for a second wave to emerge. Initially this was tackled by a piecemeal series of local measures, few of which had the intended effect of driving cases down.
At present, the second wave still appears to be increasing exponentially with the possibility of many more deaths occurring than in the first wave. In England this has resulted in the re-imposition of a national lockdown from 5 November, with Scotland, Wales and Northern Ireland also having major restrictions. With cases now appearing to be increasing less rapidly, hopefully there will soon be a corresponding downward curvature to the death rates, but as of 15 November there is still no sign of it and daily deaths are approaching 50% of the first wave's peak. If the deaths continue to grow exponentially, the daily deaths will exceed 1000 in only a few weeks' time.
I have introduced this second wave into the modelling (called Model 2) and the total is the sum of the first wave and the second wave. This second wave is currently at an early stage so it is impossible to know how it will develop. Until only 2 weeks ago, the fitted curve was showing an exponential increase with no sensible maximum value. Now however there is a sign of a leveling off, and the fitted curve is now showing a more reasonable total number of deaths (60,000). This week this figure hasn't changed significantly and the 4 week lockdown is due to end shortly.
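The idea of summing separate waves can be sketched in Python. As before, the saturating form ln N = D(1 - exp(-a (t - t0))) is an assumption made for illustration, with t0 taken as the start day of each wave; the function names and parameter values are hypothetical and are not the fitted UK values.

```python
import math

def wave_total(t, D, a, t0=0.0):
    """Cumulative deaths for one wave; zero before the wave starts at t0.
    Assumed form (hypothetical): ln N = D * (1 - exp(-a * (t - t0)))."""
    if t < t0:
        return 0.0
    return math.exp(D * (1.0 - math.exp(-a * (t - t0))))

def combined_total(t, waves):
    """Sum the cumulative deaths of every wave (Model 1 + Model 2 + ...)."""
    return sum(wave_total(t, D, a, t0) for D, a, t0 in waves)

def combined_daily(t, waves):
    """Modelled new deaths on day t as the change in the combined total."""
    return combined_total(t, waves) - combined_total(t - 1, waves)

# Illustrative parameters only: (D, a, t0) for a first and a second wave
waves = [(10.0, 0.05, 0.0), (9.0, 0.03, 200.0)]
for day in (50, 150, 250):
    print(day, round(combined_total(day, waves)), round(combined_daily(day, waves)))
```

Keeping each wave as its own (D, a, t0) triple means a third wave, as needed for the USA, is just one more entry in the list.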
The fitted curves fit the actual deaths in the UK quite closely, both for the first and now second waves. There was however a small excess of actual deaths between about day 70 and 130.
I show below the Worldometer data for deaths to date for various different countries as a function of time. The curves show the fits to the available data using the principle of least squares, as for the approach used for the UK above.
The plot below shows the daily deaths for Spain together with the fitted curves for both the first and second waves of the pandemic (Model 1 & Model 2), and the total (Model 1 + Model 2). After a strong first wave, Spain is now suffering from a second wave that is increasing more slowly than in most of the other countries shown here. This fitted curve is still almost exponential with an implausibly high prediction for the total number of deaths from the model. The sum of the two models fits the actual daily deaths quite closely.
The plot below shows the daily deaths for Italy together with the fitted curve. The curve is a pretty good fit to the actual deaths. Back in March, the pandemic gripped Italy a few weeks before other countries, and the deaths rose very sharply before falling back quite quickly, possibly due to the strict lockdown measures imposed. Since then deaths stayed low, until about day 220 when an increase was first apparent.
The deaths in this second wave have now reached the maximum in the first wave but the rate of increase has decreased substantially in the last week.
The model 2 fitted to these values was showing an exponential rise with no sensible maximum value. However there is now a slight downturn in the rate of the daily increases, with the fitted total number of deaths now more plausible at about 60,000 (down from 80,000 last week).
For France, the numbers were very erratic; on the plot below an average over the last 7 days is shown (as they are in most countries). Even with this averaging there are many fluctuations and the initial rise in deaths doesn't fit the modelled curve very well. Nevertheless, looking at the plot as a whole, the fitted curve again is quite close to the actual deaths as in the other countries already shown above.
Recently, there has been a large rise in the reported cases in France, almost on a par with the figures for the start of the pandemic back in March. A large increase in deaths is now appearing in the reported figures, on a higher level than Spain. The daily deaths now exceed 60% of the peak of the first wave, but very recently have begun to level off and now turn down, presumably as a result of the second lockdown.
The modelling now has two curves for the different waves. As in the UK and Spain, the increases in deaths in the second wave are slower than in the first wave and there is now some downward curvature. The fitted curve is no longer showing an exponential rise with a very large total number of deaths, and is now indicating a total of about 60,000 deaths in the second wave - about double the first wave.
As can be seen below, the USA numbers do not fit well to a single curve. After the first wave, and having been in decline for some time, the daily figures started increasing again around day 120, no doubt following the well publicised large rise in reported cases in many states. This is presumably connected to the premature easing of the lockdown in many states. A second wave then occurred which appeared to peak around day 170. A slow second decline then took place before the latest upturn in deaths.
Uniquely in the countries featured here, the USA now appears to be experiencing a third wave in its national death rates.
Hence I have modelled the USA figures using the sum of three different curves fitted to the daily death figures. The first curve is intended to match (approximately) the initial stages of the epidemic while the second curve is to account for the rise in deaths after around day 120. For the first and second waves, the Excel Solver was fully up to the task of automatically finding the parameters for these curves. The second curve currently has a much slower time constant than the first, with a slightly lower predicted number of total deaths (115,000 for the second wave; 130,000 for the first). This would mean the final death toll from these two waves would be around 245,000 in the USA.
I've now incorporated a third wave into the modelling to account for the latest increase in deaths. These are currently showing fluctuations up and down, but the fitted curve is still showing an almost exponential rise with no sensible prediction for the total number of deaths. Unlike in the European countries, this is not being met by a further national lockdown.
Unlike all the other countries featured on this page, Sweden has not had a major lockdown. Hence a comparison with other countries that have much higher levels of restriction is interesting. Sweden has a significantly smaller population (about 10 million) and a lower population density than any of the other countries given here. The daily figures (3-day smoothing) show a strong weekly cycle. The fitted curve suggested that the peak in the daily figures was reached on about day 45 and that a slow decline is now underway. The modelled total epidemic deaths is around 5,000. There is currently concern in Sweden over the current relatively high death rate per head of population and extended period of the slow decline in daily numbers. The Swedish figures were showing a slight 'fat' tail effect (i.e. above the modelled curve) but this effect declined and the rates were very low from around day 160.
However from day 230, it appears that a second wave is starting here as well, following the number of cases which had been rising for some time prior to that date. The modelling of this second wave is now shown below within the total modelled deaths curve (I have not shown the separate model 1 and model 2 components). With the daily deaths still rising there must be substantial uncertainties in the predicted total deaths in the second wave, but for what it is worth the curve suggests this total will be modest at only around 1500.
In Germany, the approach has been completely different from Sweden with a huge amount of testing and follow-up of those infected, combined with a major lockdown. At present, the fitted curve has a final total death toll of about 9,000, in stark contrast to the UK, Italy, Spain and France. The fitted curve shows the peak in the daily numbers was reached on about day 35, with the numbers then tailing off well, despite some easing of lockdown rules. There is however now a strong second wave.
The second wave modelling still shows a continuing almost exponential rise, and the daily deaths now substantially exceed those in the first wave - the only country shown here where this is the case, although of course the first wave peak in Germany was comparatively very low.
With the reportedly dire Covid-19 situation in Brazil much in the news in early summer, I took a look at their figures, which are shown in the plot below. For Brazil, unlike most of the countries looked at here, it is clear the actual deaths do not fit at all closely to the modelled curve.
Following the initial rise, there was an unusual plateau, with almost constant values from day 80 to about day 160. Since then there has been an apparent progressive decrease until around day 240. In the days since then, there is some evidence of a second wave although not as strong as in the European countries shown here.
There must also be some caveats about the accuracy of the official numbers for this country, which may be significantly under-reported.
For China the present phase of the pandemic is apparently over. It can therefore be used as a useful test case for the modelling approach described above. The plot below shows the available data for China up to 50 days since the first death. Note immediately how closely the 2-parameter equation given above fits to the measurements, over almost the entire course of the pandemic. In this country at least, the equation seems to be more than plausible - it provides a good fit to the entire pandemic.
I have then tested how accurately the modelling approach predicts the final death toll, based on different amounts of the available data. The results are shown below. This graph shows how the predicted total deaths vary depending on the date at which the modelling is performed. All predictions were within a factor of two of the final value. This is unlike my experience with any of the other countries modelled to date. In all these cases, the predictions for the final death toll have been varying by large amounts, indicating the approach is not providing meaningful values.
It is however surely a coincidence that the prediction, performed on only the first six days of data (I excluded the values below 4 days from first death), is almost spot on the final figure.
© All plots copyright Stephen Burch