(5/2/20) There are probably as many different opinions on models as there are models themselves. None can forecast with total accuracy and if one could you wouldn’t know it until it is too late. Still we need some forecasting tools to guide policy and responses.
As evident throughout this blog, we have been developing a model based primarily on the most reliable data available, deaths. From death rate, cumulative deaths, and other measurable variables, we can compute the number of cases (prevalence) and new cases (incidence) and forecast these values, along with deaths, into the future. A useful outcome of these tools is the ability to forecast dates when it is safe to resume some level of work and social activity.
However, how do we know if a model is performing well? That usually means running it on historical data, even though, as with trying to predict the stock market, past performance is no guarantee of future results. We are in the middle of an epidemic, so we can gain some indication from the most recent historical performance. We can also compare ourselves to other models, although all models are different and give different results. We have been benchmarking against what is emerging as the gold standard and is now clearly the model most widely followed by the press and government agencies: the model by the University of Washington’s (UW) Institute for Health Metrics and Evaluation (IHME). The details of this model have not yet been published and are not well known. We believe we have many similarities, such as depending strongly on death statistics.
The IHME and our model also make updates to forecasts based on new data. This would seem to be a sensible approach, but many “expert” modelers and epidemiologists consider this strategy a shortcoming because the forecasts can bounce around. Well, I began my model for the very reason that these so-called experts were pontificating things that seemed far-fetched, and their models seemed too complicated, with many variables that are hard to measure or define. So these models may be good for ultimately understanding pandemics better (after the fact), but we are in the middle of a pandemic and need real-time guidance and forecasts, so I believe the strategy that I, and apparently IHME, are taking is valuable. There is a need for many models, as they all make different assumptions or try to measure or forecast different properties.
So, to cut to the chase the plots below show the forecasted total final deaths for the various hotbed countries and U.S. states that my model is following alongside the IHME model forecasts over the same time period.
The following are key observations:
The two models would appear to be more similar than dissimilar based on the relative qualitative agreement on death forecasts, e.g., the order of severity for different countries and states.
Both models fluctuate, but not excessively relative to an epidemiological model that is calculating based on first principles.
Both models seem to be on a slight rise in forecasting deaths. We think that is because both models a priori assume that the rise and fall behavior is symmetric, but evidence is now showing the decline is slower. We know how to modify our model to do this, but will not be able to do so before the pandemic mostly plays out. Instead we are applying an asymmetry factor to the downside to compensate. We do not know if IHME is doing anything similar.
Our latest forecasts are all above those of IHME, but not greatly, making us think that the difference is due to our including the asymmetry factor for the first time in the last update.
Not shown in these plots but evident in the Weekly Update postings is that we tend to read the death rate data such that we believe countries and states are not as far past the peak as IHME indicates. We think this is due to our using a Gaussian function to monitor peaking, whereas they use an error function (integral of a Gaussian) to do so, which we believe is less sensitive to detecting changes, e.g., peaking. We both apparently use the error function to forecast total deaths but from different parts of the curve depending on how we read displacement from the peak.
(4/29/20) The world continues its recovery and the U.S., including the hot-bed states, appears to have reached its peak with the possible exception of NJ. Our advisories for the earliest date to implement easing of social restrictions range from 5/14/20 for WA to 6/2/20 for NJ.
Be sure to read Post 11. Recommended Guidelines for Easing of Social Distancing
We maintain the same format as our previous weekly updates. We’ll continue our routine of presenting the death rates internationally and domestically and from that and other conditions and assumptions detailed in our Model from previous postings we forecast total deaths, prevalence and incidence of cases and easing dates. We also compare our forecasts with that of the heralded UW IHME model.
The plots below show the familiar death rate curves for hot-bed countries and for U.S. states. We dropped China and Korea as now being “uninteresting.” But we may add Sweden as a country that has forgone government-mandated restrictions in favor of voluntary behavioral changes. Though they claim it to be a success, they are in my estimation in trouble, with a rising death rate and already a cumulative death count of 222/million, which exceeds the U.S. at 172/million and is closing in on Italy, Spain, France, and the U.K., which range from 311/million (U.K.) to 503/million (Spain).
We did not upgrade any countries (3-color ranking), though the U.S., France and the U.K. are on the cusp of one and we anticipate that, if trends continue, we will do so next week. We did upgrade NY from red to orange, and MI and LA should get an upgrade next week.
We make the following comments:
Most countries and states have clearly advanced past the peak of the death rate curve. NJ appeared to reach that summit, but today had its greatest daily death count at 398.
We have noticed that most U.S. states and some European countries oscillate in their daily reported deaths with low numbers on the weekends and high numbers on Mondays and Tuesdays. So, we now see that some of the daily fluctuations may be due to reporting or data processing diligence.
The symmetric Gaussian model is starting to break down on the downside of the death rate curve, where the decline is progressing slower than anticipated. Of course, nothing is really anticipated, as pandemics and how the population responds are not predictable. Consequently, we are planning to build in an asymmetry factor.
Next is our familiar table for forecasted total deaths, prevalence (current cases), and incidence (new cases) along with their values per capita (per million people) as well as dates we consider to be the earliest to begin a gradual easing of social distancing. We will continue to call this an easing date and not a safe date to dampen excessive hopefulness.
These values assume an asymmetry factor such that a slower than modeled decline in the death rate can be adjusted. We will look for better functional forms than a Gaussian in the future (after this all blows over), but for now we simply scale the weeks from peak proportionately back by where the last death points lie on the symmetric Gaussian. The true weeks from peak, however, are shown in the table.
We repeat from last week that as a rough rule of thumb the easing date cannot be before the point when the prevalence count drops to less than what it was when the death rate took off. Roughly the easing date should be about 4-5 weeks after the death peak, the range depending on how severe the outbreak was for a particular population. Our date is based on when the total active case count (prevalence) is forecasted to drop to 100/million. This should be adjusted for population density, as this number is probably safe for sparse populations, but for dense populations, e.g., NY, it would be prudent to add a few days.
Continued public and political pressure will surely lead to premature easing, which as Fauci has said “could backfire.” So please read my posting: “11. Recommended Guidelines for Easing of Social Distancing” proposing a three-phased easing with check-gates at each step. And of course, please give me your thoughts.
We now close by comparing our results to that of the Institute for Health Metrics and Evaluation (IHME) at the University of Washington (UW), which has emerged as perhaps the leading model for informing our nation on the state of COVID-19 (http://www.healthdata.org/covid/).
We continue to track closely to the IHME model indicating that there must be components of each model that are similar. We tend to forecast a little earlier from peak rates and as a result we forecast somewhat higher total deaths. Whereas the accuracy of forecasts should improve as we traverse the peak in the death rate curve, we are learning that the downward side of the curve is not symmetric but stretched out in time, which will elevate the actual statistics relative to a symmetric forecast. As noted above we have added an asymmetry factor to account for this effect.
(4/22/20) The world continues to recover with all hot-bed countries showing declining death rates. The U.S. has turned the corner, but appears about 2 weeks behind the rest of the world, with the U.K. being the other laggard. Our estimates for when gradual easing of social restrictions can begin have pushed out a few days since last week’s forecast.
Remember to check my post called “Daily Rumblings” for late breaking updates.
We’ll continue our routine of presenting the death rates internationally and domestically and from that and other conditions and assumptions detailed in our Model from previous postings we forecast total deaths, prevalence and incidence of cases and easing dates. We also compare our forecasts with that of the heralded UW IHME model.
Before we go on, please see the previous posting (just posted): “11. Recommended Guidelines for Easing of Social Distancing.”
The plots below show the familiar death rate curves for hot-bed countries (we may drop S. Korea and China in the future) and for U.S. states. We have upgraded the severity scale (3-color ranking) for some of these. The U.S. remains the only country still ranked as serious with a red spot. Several states are still in that category as well.
The qualitative red, yellow, and green rankings reflect death rates that are accelerating, rolling over to a peak, and well into decline, respectively.
We make the following comments:
Most countries and states have advanced past the peak of the death rate curve. Some that we have called at the top still need more data to strengthen that assessment.
The Gaussian model is holding up reasonably well, but we might expect a slower decline than rise as new, but lower density outbreaks are triggered. We will look at final data before adjusting the model.
Next is our familiar table for forecasted total deaths, prevalence (current cases), and incidence (new cases) along with their values per capita (per million people) as well as dates we consider to be the earliest to begin a gradual easing of social distancing. We will continue to call this an easing date and not a safe date to dampen excessive hopefulness.
Assumptions: Mortality factor is estimated as 1.0% for most favorable populations (not yet strained health care system) and up to 2.0% for least favorable populations (strained healthcare system).
We repeat from last week that as a rough rule of thumb the easing date cannot be before the point when the prevalence count drops to less than what it was when the death rate took off. This is because we don’t have a vaccine nor is there sufficient herd immunity (those who have had the disease and developed antibodies) to change the vulnerability to new outbreaks. Roughly the easing date should be about 4-5 weeks after the death peak, the range depending on how severe the outbreak was for a particular population.
Now that the push to ease restrictions is gaining momentum and we are sure to initiate this prematurely, we must have a phased approach. The Administration has proposed something that includes many common sense recommendations, e.g., continue to practice good hygiene and advising sick people to stay at home. However, the three-phased approach is lacking in specifics, e.g., “bars may operate with diminished standing-room occupancy,” without defining density or distance requirements. So please read my posting: “11. Recommended Guidelines for Easing of Social Distancing” and please give me your thoughts.
Finally, we provide a new update on the comparison of our forecast of critical values to that of the Institute for Health Metrics and Evaluation (IHME) at the University of Washington (UW), which has emerged as perhaps the leading model for informing our nation on the state of COVID-19 (http://www.healthdata.org/covid/).
We appear to be tracking very closely, indicating that there must be components of each model that are similar. We tend to forecast a little earlier from peak rates. As the death rate curve flattens, the accuracy of forecasts improves greatly because it is more evident where in the rise and fall cycle a given population is. This is evident as our forecast total deaths are starting to converge.
(4/21/20) Relaxation of social restrictions will happen and probably prematurely. Given that political reality, it is imperative to put into place a phased approach where each next phase is based on metrics to minimize new outbreaks.
Remember to check my post called “Daily Rumblings” for late breaking updates.
The U.S. Administration released guidelines for relaxing social distancing. There are many common sense recommendations, e.g., continue to practice good hygiene and advising sick people to stay at home. However, the three phased approach is lacking in specifics, e.g., “bars may operate with diminished standing-room occupancy,” without defining density or distance requirements.
The following suggested guidelines are a phased approach to relaxing social restrictions. In all cases it is still mandatory that people who are sick or feel sick should stay home. These recommendations are for healthy individuals only. I post this in hopes that it gets noticed and if found reasonable contributes to a collective set of guidelines that can be followed, monitored, and allow maximum freedom and economic opportunity without exposing the public to excessive COVID-19 risk.
Phase I – At recommended easing date
Phase II – 3 weeks later if no negative trends
Phase III – 3 weeks later if no negative trends
Work
Phase I: – Employees work at office 2 days a week on two different cycles, e.g., Mon/Wed, Tue/Thu – Maximum density of employees is 1 per 100 square feet – Mandate use of hand sanitizer entering and exiting work – Place hand sanitizers in many locations (at least 1 per 1,000 square feet) and every bathroom or breakout room – Minimize contact and exercise caution
Phase II: – Employees may return to work full time – No other changes
Phase III: – No other changes
Schools
Phase I: – Healthy students may return to classes – Maintain 6-foot distance in class rooms or lecture halls – Mandate use of hand sanitizer entering and exiting class rooms – Minimize contact and exercise caution
Phase II: – No other changes
Phase III: – 6-foot distance restriction may be relaxed – No other changes
Restaurants
Phase I: – Open with every other table unoccupied – Bar area closed to customers – Mandate use of hand sanitizer entering and exiting establishment – Hand sanitizer at every table and bathroom – Servers must wear masks and gloves
Phase II: – All tables may be served – Bar area open only for sitting and stools spaced 6 feet apart – Hand sanitizers on bar every 12 ft apart – No other changes
Phase III: – Remove all restrictions, but: – Maintain signs to minimize contact – Maintain hand sanitizers
Retail stores (e.g., groceries, mall, etc.)
Phase I: – Open all stores except those requiring contact (e.g., hair dressers, manicure, etc.) – Place signs stating rules against congregations – Place hand sanitizers in many locations (at least 1 per 1,000 square feet) and every bathroom or breakout room – Assign security to resolve unlawful congregations or rule violations – Employees must wear masks and gloves
Phase II: – Open all stores – No other changes
Phase III: – Remove all restrictions, but: – Maintain signs to minimize contact – Maintain hand sanitizers
Stadiums (sports, concerts)
Phase I: – Not allowed
Phase II: – Allow ¼ capacity and space seating for different groups – Mandate use of hand sanitizer entering and exiting – Hand sanitizers in every bathroom – Cone off every other parking space (this may be difficult), at the very least place signage to forbid adjacent parking.
Phase III: – Remove all restrictions, but: – Maintain signs to minimize contact
Beaches
Phase I: – Open beaches and parking lots. – Cone off every other parking space (this may be difficult), at the very least place signage to forbid adjacent parking. – Place signs stating rules against congregations – Station police officers every half mile to resolve unlawful congregations or rule violations
Phase II: – All parking acceptable – No other changes
Phase III: – Remove all restrictions, but: – Maintain signs to minimize contact
Gyms
Phase I: – Aerobics or cycling classes forbidden – Limit density of attendees to 1 per 100 square feet – Mandate use of hand sanitizer entering and exiting gym – Place hand sanitizers in many locations (at least 1 per 1,000 square feet) and every bathroom or breakout room – Minimize contact, and exercise caution
Phase II: – Aerobics and cycling classes allowed, but with 6-foot distancing – No other changes
Phase III: – No other changes
Other
Phase I: – All service people must still wear masks and gloves – In general, when out avoid handshakes and any touching with strangers
Phase II: – No other changes
Phase III: – Service people may dispense with masks and gloves – No other changes
Ironically, the safest time for an individual to socially ease is just before it becomes practiced by all as cases will surely go up at that point.
Doing the math on 22M new unemployment claims and 45,000 deaths we come up with a ratio of about 500 unemployed per death. So, what is the ratio that can be tolerated? Would doubling the deaths to halve the unemployment be worth it? That would be then sacrificing a life for 250 jobs. Then we get into moral issues such as what if it is an elderly death. Should the figure of merit be how many ‘years’ of life do we sacrifice? That’s a question that someone must be answering somewhere.
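For readers who want to check the arithmetic, here is a short sketch using only the two figures quoted above (22M claims and 45,000 deaths). The variable names are illustrative, and the "doubling deaths to halve unemployment" trade-off is the hypothetical posed in the text, not a forecast.

```python
# Figures quoted in the post.
unemployment_claims = 22_000_000
deaths = 45_000

# Ratio of unemployed to deaths, "about 500".
claims_per_death = unemployment_claims / deaths
print(round(claims_per_death))  # 489

# Hypothetical trade-off: doubling deaths adds another 45,000 deaths,
# while halving unemployment restores 11M jobs.
extra_deaths = deaths
jobs_restored = unemployment_claims / 2
jobs_per_extra_death = jobs_restored / extra_deaths
print(round(jobs_per_extra_death))  # 244
```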
I would say whatever date one considers safe for social easing add 2 weeks just to be safer and less risky of an infection rebound; as Fauci says if we relax too soon it “will backfire”. Further our testing capacity is still woefully inadequate to test for small outbreaks that could lead to large outbreaks. However, the country will succumb to economic pressures to re-open so we need a thoughtful prudent plan.
(4/15/20) Nearly all hot-spot countries and U.S. states are near or past the peak for death rate. That means the number of active cases (prevalence) is on the decline and some amount of social easing can be reasonably considered. However, until our population is largely immunized by having had the disease or by vaccination, social interactions cannot return to normal.
Remember to check my post called “Daily Rumblings” for late breaking updates.
Our model has been described in previous posts so we can spare that detail for now! Let’s put it to further practice and use it to help inform us on when we can start to relax social distancing and to what extent. We begin by showing the latest death rate plots for hot-spot countries and U.S. states below. Now that these curves have developed toward and past their peaks we superimpose a Gaussian function to visualize the progress made by these populations. Note that we have de-rated the severity of several of these populations as represented by the colored circles. Major recovery is evident, and as predicted, about 3 weeks after serious social distancing was implemented.
The qualitative red, yellow, and green rankings reflect death rates that are accelerating, rolling over to a peak, and well into decline, respectively.
We make the following comments:
The Gaussian dependence in most cases represents well the reported death rate data (even while fixing the width at half max to 4 weeks). This dependence, and the symmetry assumed by the Gaussian, is expected to get distorted due to interventions to reduce the rate, e.g., social distancing. The breakdown in the fit is evident for those populations who were most aggressive, such as China and S. Korea, the latter never reaching exponential growth and the death rate instead being more constant.
Once a country reaches the peak in death rate, it is about half way to its final death count.
Italy and Spain are making remarkable progress and we (perhaps prematurely) have upgraded their situation to green (out of trouble). How they manage their retreat from social isolation will determine how successful they will be in the long term. This is the topic for below.
The U.S. is lagging these other countries in reaching a peak; however, it does appear that the peak is imminent.
Regarding the U.S. states: Washington is past its death rate peak and has been upgraded to a green. California never saw high death counts, just high death rate that appears to be peaking so we upgrade that to yellow (warning).
We have not followed MI, NJ, and LA from the beginning so we have limited data to fit to the Gaussian, but each of these states is making progress and if the current trends continue over the next week, they will all be upgraded as well.
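The fitting procedure behind the superimposed Gaussians is not spelled out here, so the following is only a sketch of one way a Gaussian with its width fixed at 4 weeks could be fitted to death rate data. It relies on the fact that, once the width is fixed, the logarithm of a Gaussian becomes a straight line in time after adding back the known quadratic term, so ordinary linear regression gives the peak week and peak height. The function name and the synthetic data are illustrative assumptions, and a log-scale fit gives more weight to small counts than a direct fit would.

```python
import math

FWHM_WEEKS = 4.0  # width at half maximum, fixed as stated in the post
SIGMA = FWHM_WEEKS / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # Gaussian std dev in weeks


def fit_fixed_width_gaussian(weeks, death_rates):
    """Return (peak_week, peak_rate) of a Gaussian with fixed width SIGMA.

    With sigma fixed, z = ln(rate) + t^2 / (2 sigma^2) is linear in t:
        slope     = peak_week / sigma^2
        intercept = ln(peak_rate) - peak_week^2 / (2 sigma^2)
    """
    points = [
        (t, math.log(y) + t * t / (2.0 * SIGMA ** 2))
        for t, y in zip(weeks, death_rates)
        if y > 0  # the logarithm needs positive counts
    ]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (z - mean_z) for t, z in points)
    slope = sxy / sxx
    intercept = mean_z - slope * mean_t
    peak_week = slope * SIGMA ** 2
    peak_rate = math.exp(intercept + peak_week ** 2 / (2.0 * SIGMA ** 2))
    return peak_week, peak_rate


# Synthetic, noise-free data on the rising side only (hypothetical numbers):
# true peak at week 5 with 800 deaths/day.
weeks = [0.5 * i for i in range(9)]  # weeks 0.0 through 4.0
rates = [800.0 * math.exp(-((t - 5.0) ** 2) / (2.0 * SIGMA ** 2)) for t in weeks]

peak_week, peak_rate = fit_fixed_width_gaussian(weeks, rates)
print(f"peak at week {peak_week:.2f}, about {peak_rate:.0f} deaths/day")
```

On real data the daily counts are noisy, so the fitted peak will move from one update to the next, which is consistent with the forecast revisions described in these posts.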
We now present our familiar table for forecasted total deaths, prevalence (current cases), and incidence (new cases) along with their values per capita (per million people). We also add a new column for the date we consider to be the earliest each population base can begin relaxing social distancing. We will tentatively call this an easing date and not a safe date so as not to conjure up excessive hopefulness.
Assumptions: Mortality factor is estimated as 1.0% for most favorable populations (not yet strained health care system) and up to 2.0% for least favorable populations (strained healthcare system).
A rough rule of thumb is that the easing date cannot be before the point when the prevalence count drops to less than what it was when the death rate took off. If that takeoff was about 4-6 weeks before the death rate peak, then one might think easing should come about 4-6 weeks after the peak, since the rise and the fall are approximately symmetric. However, as I showed in the previous Post #9, the incidence and prevalence curves precede the death rate curve by about 2.5 and 1.25 weeks, respectively. As a result, the prevalence count comes back down to its, say, minus-5-week level at about 4 weeks after the death rate peak (this time accounts for decreasing incidence and recovery from the disease). Now this date would not be safe because it corresponds to a prevalence that previously set off the exponential growth in deaths. However, if we exercise some precautions, then sometime soon after 4 weeks may be considered safe.
For our purposes, we assume that as an absolute minimum condition to consider some social relaxation, that the prevalence must drop below 100 active cases per million (i.e., 1/10,000 people). One would still have to ensure that close contact with strangers is minimized and voluminous testing must be continued among other moderation. Frankly, we can never return to normal until some high percentage of a population is immunized either by having had the disease or by vaccine. A ballpark figure is about 50%, but no population yet has had more than 10% infected (Projected by 6/1/20: Italy ~ 4%, Spain ~ 5%, U.S. ~ 2%, NY ~ 10%).
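The 100-per-million threshold can be turned into a rough timing estimate. The sketch below is not necessarily the calculation behind the Table; it assumes that new cases follow a Gaussian with a 4-week width at half maximum that leads the death rate by 2.5 weeks, and that active cases are those infected within the previous 2.5 weeks. The function names and the peak prevalence inputs are hypothetical.

```python
import math

FWHM_WEEKS = 4.0  # assumed Gaussian width at half maximum
SIGMA = FWHM_WEEKS / (2.0 * math.sqrt(2.0 * math.log(2.0)))
ILLNESS_WEEKS = 2.5  # assumed duration of an active case
PREVALENCE_PEAK_WEEK = -ILLNESS_WEEKS / 2.0  # relative to the death rate peak


def upper_tail(z: float) -> float:
    """Standard normal survival function (erfc keeps accuracy far in the tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def prevalence_shape(weeks_from_death_peak: float) -> float:
    """Unnormalized active cases: infections during the last ILLNESS_WEEKS."""
    t = weeks_from_death_peak
    return upper_tail(t / SIGMA) - upper_tail((t + ILLNESS_WEEKS) / SIGMA)


def weeks_to_threshold(peak_per_million: float, threshold_per_million: float = 100.0) -> float:
    """Weeks after the death rate peak at which prevalence falls to the threshold."""
    if peak_per_million <= threshold_per_million:
        raise ValueError("peak prevalence never exceeds the threshold")
    target = threshold_per_million / peak_per_million  # fraction of peak prevalence
    peak_shape = prevalence_shape(PREVALENCE_PEAK_WEEK)
    lo, hi = PREVALENCE_PEAK_WEEK, 20.0  # prevalence only falls after its peak
    for _ in range(60):  # bisection
        mid = 0.5 * (lo + hi)
        if prevalence_shape(mid) / peak_shape > target:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical peak prevalences of 2% and 5% of the population.
for peak in (20_000, 50_000):
    print(peak, "per million at peak ->", round(weeks_to_threshold(peak), 1), "weeks after death peak")
```

Under these assumptions a more severe outbreak reaches the threshold later, in line with the rule of thumb that the wait depends on how severe the outbreak was.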
Main comments are:
Without making a pretense about safety, our forecasts for when it is reasonable to consider social relaxation are given in the right most column in the above Table.
China and S. Korea are already relaxing social isolation so the rest of the world has many weeks to observe the prudence of their approach and decide on some combination of emulating and modifying.
Iran would appear to be the next country to drop below the 100 per million prevalence threshold (4/29/20). Italy and Spain follow next in the beginning of May and the U.S. and U.K. not until mid May.
Regarding the U.S. states it is not prudent to consider social relaxation for any of the hot-spot states before mid May except for CA and WA for which early May may suffice. NY will not reach an easing date until about 5/19/20.
Finally, we provide an update on the comparison of our forecast of critical values to that of the Institute for Health Metrics and Evaluation (IHME) at the University of Washington (UW), which has emerged as perhaps the leading model for informing our nation on the state of COVID-19 (http://www.healthdata.org/covid/).
We appear to be tracking very closely, indicating that there must be components of each model that are similar. We tend to forecast a little earlier from peak rates. As the death rate curve flattens, the accuracy of forecasts improves greatly because it is more evident where in the rise and fall cycle a given population is. This also accounts for the better agreement between the two models relative to previous weeks.
(4/8/20) A recovery is in sight for most of the world but the U.S. continues to lag. False hope for NY. We are probably at near maximum active cases nearly everywhere and must exercise strictest social isolation.
Remember to check my post called “Daily Rumblings” for late breaking updates.
This is not intended to be a technical discussion, but I do need to present the underlying model for you nerds out there. It really is not complicated, but you can get to the main conclusions by just looking at the Figures and reading the bullet points.
We present our weekly update and introduce an extension of our Gaussian model, which was previously shown (Post 8. A Simple Model for Forecasting Final Fatalities) to be a useful working model for forecasting total deaths following recovery and is now extended here to forecast present and future active cases (prevalence) and new cases (incidence). First, we show below the latest death rate plots for hot-bed countries and U.S. states.
The qualitative red, yellow, and green rankings reflect death rates that are accelerating, rolling over to a peak, and well into decline, respectively.
Internationally, the good news is that Italy and Spain are now joining Iran as showing strong evidence for reaching a peak in the death rate. Our previous model showed that at the peak of the death rate curve a population has reached half of its terminal death count (assuming a symmetric path down). Iran, oddly, seems stuck at its peak and we hope to see a decrease in death rate soon. The U.S., France and U.K. continue on an accelerated death rate and it is hard to tell if it has transitioned from exponential to linear, but certainly no evidence of rolling over on its way to a peak.
Domestically there is not much to be hopeful about. At best Washington and Louisiana may be showing a rolling over in death rate and hopefully will reach a peak soon. New York is horrific; after Governor Cuomo hesitantly stated that the situation was improving with two consecutive days of lower death rate, yesterday the death rate spiked again to a new high. New Jersey and Michigan are growing fast (though NJ may be slowing). California still has a steep death rate, but the total number of deaths is significantly less than in other states ranked red in severity.
Based on these reported death rate data we can apply our Gaussian model (Post #8) to update our forecast for terminal (total) deaths and present-day active cases and new cases. The general premise as described previously is to use death rate data and an assumed rise and fall curve (Gaussian distribution) to forecast total deaths by estimating where on the Gaussian curve the actual death rate curve lies. The remainder of the curve can then be integrated (area under the curve) relative to the current date/position and the current death count. We make a couple of other assumptions to compute the number of current active cases and rate of new cases. We assume COVID-19 lasts about 2.5 weeks, resulting in either recovery or death (ample reported studies show this). We also assume that the Gaussian rise and fall has a width at half its maximum of 4 weeks, which is consistent with the results for China (plot above) and for the 1918 Spanish Flu. So, then, the number of deaths on a given day reflects the number of new cases 2.5 weeks earlier multiplied by the mortality factor; equivalently, new cases 2.5 weeks earlier equal that day's deaths divided by the mortality factor.
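A minimal sketch of this premise, assuming a perfectly symmetric Gaussian death rate with a 4-week width at half maximum: the fraction of the final death count reached so far is the area under the curve up to the current position, and the forecast is the current count divided by that fraction. The function names and the example numbers are illustrative, not values from the Table.

```python
import math

FWHM_WEEKS = 4.0  # width at half maximum stated in the post
SIGMA = FWHM_WEEKS / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # about 1.7 weeks


def fraction_of_deaths_so_far(weeks_from_peak: float) -> float:
    """Area under the Gaussian death rate curve up to now, as a fraction of the total.

    Negative values mean the population has not yet reached its death rate peak.
    """
    return 0.5 * (1.0 + math.erf(weeks_from_peak / (SIGMA * math.sqrt(2.0))))


def forecast_total_deaths(deaths_to_date: float, weeks_from_peak: float) -> float:
    """Terminal deaths implied by the current count and position on the curve."""
    return deaths_to_date / fraction_of_deaths_so_far(weeks_from_peak)


def new_cases_from_deaths(daily_deaths: float, mortality_factor: float = 0.01) -> float:
    """New cases 2.5 weeks earlier implied by today's deaths."""
    return daily_deaths / mortality_factor


# Hypothetical population with 10,000 deaths to date, exactly at its peak:
print(forecast_total_deaths(10_000, 0.0))  # 20000.0

# The same count one week before the peak implies a much larger final toll.
print(round(forecast_total_deaths(10_000, -1.0)))
```

The second example shows why forecasts are so sensitive to the judgment of where a population sits relative to its peak: a one-week change in that reading changes the multiplier substantially.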
Left: Plots of new events below (rate, e.g., per day) and total events above for deaths and COVID-19 active cases. Right: The Table shows how to calculate terminal fatality, prevalence, and incidence by determining where the real death rate data lies on the rate curve relative to the peak. Then one multiplies the current total deaths by the factors shown to get the other values.
The lower plot below shows the model curves for daily deaths and new cases. The Gaussians are plotted on a logarithmic intensity scale. In the plots below we assume a mortality factor of 1.0%, but this can be varied easily as we show soon. The units for the axis are relative, but we have chosen conditions so that the Relative time is in units of weeks and the Relative counts scale with the number of deaths per unit time (e.g., daily).
The upper plot is an integration (area under the curve) of the lower plots. For total deaths the integration reaches a plateau after death rate approaches zero. For total cases, the recovery of patients means that this count goes to zero as the rate of new cases approaches zero. The plot of total cases with time is very important for forecasting when the prevalence of cases decreases to a safe enough level to allow the full or partial relaxation of social restrictions.
Some key observations regarding the above plots:
Prevalence (current active cases) peaks about half-way between the peaks for death rate and incidence (new cases).
At the peak of death rates, the prevalence is only down about 20% from its peak, so when the death rate is rolling over and reaching a peak, the prevalence is still near its maximum and the population needs to be exercising its greatest social restraint!
We recommend strict adherence to social distancing beyond 2 weeks after a population reaches its peak death rate. At that time the prevalence is at about 22% of its peak value. This value goes down to about 7%, 2%, and 0.3% at weeks 3, 4, and 5, respectively. Relaxing of social distancing may be acceptable in less severe states, e.g., CA at week 3 after the peak, but for severe states, e.g., NY, week 5 would be more prudent.
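The percentages in these observations can be checked with a short calculation. This is a sketch under stated assumptions rather than the exact spreadsheet behind the plots: new cases follow a Gaussian with a 4-week width at half maximum that leads the death rate by 2.5 weeks, and active cases are those infected within the previous 2.5 weeks. The function names are illustrative.

```python
import math

FWHM_WEEKS = 4.0  # assumed Gaussian width at half maximum
SIGMA = FWHM_WEEKS / (2.0 * math.sqrt(2.0 * math.log(2.0)))
ILLNESS_WEEKS = 2.5  # assumed duration of an active case


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prevalence_shape(weeks_from_death_peak: float) -> float:
    """Unnormalized active cases: infections during the last ILLNESS_WEEKS.

    The incidence Gaussian is centered ILLNESS_WEEKS before the death rate peak.
    """
    t = weeks_from_death_peak
    return normal_cdf((t + ILLNESS_WEEKS) / SIGMA) - normal_cdf(t / SIGMA)


# Prevalence peaks midway between the incidence peak and the death rate peak.
PEAK_SHAPE = prevalence_shape(-ILLNESS_WEEKS / 2.0)


def prevalence_vs_peak(weeks_from_death_peak: float) -> float:
    """Active cases as a fraction of their peak value."""
    return prevalence_shape(weeks_from_death_peak) / PEAK_SHAPE


for week in (0, 2, 3, 4, 5):
    print(f"week {week}: {prevalence_vs_peak(week):.1%} of peak prevalence")
```

Under these assumptions the results come out close to the figures quoted above: roughly 80% of peak prevalence at the death rate peak, falling to roughly a fifth two weeks later and to a fraction of a percent by week 5.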
Now we look at the forecasted values for terminal deaths, current active cases (prevalence) and current new cases (incidence). The method for computing these values is described in the caption to the Plots above.
Assumptions: Mortality factor is estimated as 1.0% for most favorable populations (not yet strained health care system) and up to 2.0% for least favorable populations (strained healthcare system). The uncertainty in these values is greatest for populations whose death rate is furthest from its peak. For example, the U.S. uncertainty range is about 30,000 – 120,000 (factor of 2x), whereas Italy and Iran are about ± 25-50%.
Key international observations from the above Table are:
The total (terminal) fatalities are consistent with last week’s forecast, though for the most part at the lower limits, as evidence of approaching or reaching a peak emerged.
The U.S. is forecasted to exceed all other nations in total deaths.
France is forecasted to exceed all other nations in total deaths per capita.
According to the model the current number of active cases in the U.S. is about 1% of the population in the U.S. (10,833 per million), in France about 3%, and in Italy surprisingly <1% as they have progressed past the death rate peak. The peak prevalence for Italy is computed to have been about 2-4% of the population.
Key domestic and general observations from the above Table are:
New York is projected to lead any nation in total deaths per capita. Its current prevalence is about 5% of the population.
California, which appears in news reports as a hot-bed state, does not appear to be so relative to the other five states highlighted. CA is forecasted to have significantly fewer total deaths per capita than the other states.
The current number of active cases in CA is about 0.3% of the population or about 1 in 330 as contrasted with 1 in 18 in NY.
Current prevalence (active cases) is calculated in the above Table to be typically about a factor of 10x greater than confirmed cases. Again, as I’ve explained many times before, confirmed cases are a poor indicator of progress because they depend strongly on the rate of testing, which has been insufficient at best. If one is surprised or dubious about these high prevalence numbers, please consider the following facts:
Once we are well on our way to recovery, new antibody tests will enable a determination of the percentage of the population who had COVID-19 by detecting immunizing antibodies in COVID-19 recovered individuals.
We now compare our total deaths and time relative to the peak death rate with those from the highly regarded University of Washington (UW) model (http://www.healthdata.org/covid/).
The comparison is reassuringly more in relative agreement than disagreement (within a factor of 2x in all but two cases). Noticeable differences include:
We believe that France has not reached its death rate peak. Referring to the plots at the top of this post, we believe that the spike a couple of days ago and the subsequent lowering in the death rate was an aberration. A couple more days will tell. We therefore forecast about 3.5x as many total deaths as UW. [Update: We learned that the spike to 2,000 deaths/day for France was due to a lump addition for deaths unaccounted for in nursing homes. These deaths should therefore be redistributed to earlier days, giving a curve that may be rolling over closer to the peak. We expect to revise our estimate downward in our next report.]
We also believe, for the same reason, that NY experienced a false peak, and we therefore forecast about 2x as many total deaths as UW.
Interestingly, a week ago when we did our first comparison to UW, they forecasted 5,068 total deaths for CA, well above our range of 1,122 – 3,829. They have since considerably lowered that forecast to 1,611, which is now more in line with, and even below, our latest forecast of 2,209.
Finally, I show the interesting log-log plot first shown at the end of Post #7 (Weekly Update: Grim News). This is now showing some deviation from the line for Italy and Spain as also evident in the death rate plots at the top of this posting. Hopefully these two most serious nations are on their way to recovery.
(4/3/20) I believe models should be as simple as possible and rely as much as possible on hard data, e.g., deaths
Always check my post called “Daily Rumblings” for late breaking updates.
The Gaussian Model introduced in my last blog (#7) can be extended to forecast the number of fatalities that will occur as the epidemic in a particular population reaches recovery. If one is monitoring the death rate per unit time (days, weeks, etc.) then one can match the shape of that curve to a Gaussian growth and recovery curve and determine how far up or down the curve the actual data lies. Based on the number of fatalities that has occurred on the latest date, one can extrapolate how many more deaths will occur after traversing the entire curve to recovery. The Figure below shows how this works.
Gaussian fatality model using the observed death rate data for Italy. The horizontal axis numbers represent weeks for convenience, but in fact this model is not dependent on time.
This model assumes that the rate of deaths (and case prevalence also, if one could only measure that well) will follow a rise and then a fall. A Gaussian model works well because it begins to rise exponentially, then becomes relatively linear before rolling over and peaking. The recovery is then assumed to follow a similar trend in reverse, as shown in the bottom plot above. The death rate data for China (Post #7) bears this out. The total number of deaths up to a particular point on the Gaussian rate curve is obtained by integrating all the deaths to that point and is shown by the middle plot above. If one knows where on the Gaussian rate curve a particular population lies, then the final death count can be extrapolated from the current death count. The factors that convert current deaths to final deaths are shown in the top plot above and the Table on the right.
We show by example the case for Italy. The death rate in Italy has been rising exponentially, but is beginning to show a perceptible slowing from pure exponential growth (pseudo-linear region). These daily death counts are overlaid on the Gaussian rate curve as best as we can visualize. There is a large uncertainty particularly in the near-linear region of the Gaussian such that we could easily place the Italy data such that the last date overlays with week 4.5 rather than week 5.5 as shown. We therefore define this as the uncertainty boundaries for extrapolating to final fatality forecasts. The dotted lines represent these two limits and by tracing up to the multiplicative factor on the top plot we can calculate a final fatality based on the current total deaths.
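The extrapolation just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculation behind the Figure: the peak position (`peak_week`) and width (`sigma`) are hypothetical values picked for the example, the 10,000 current deaths is a made-up input, and the function names are ours. The multiplicative factor is simply the total area under the Gaussian rate curve divided by the area accumulated so far.

```python
import math

def fraction_elapsed(week, peak_week, sigma):
    """Share of the total area under a Gaussian rate curve reached by `week`."""
    z = (week - peak_week) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def final_death_factor(week, peak_week, sigma):
    """Multiplier that converts current total deaths into final total deaths."""
    return 1.0 / fraction_elapsed(week, peak_week, sigma)

def forecast_range(current_deaths, week_early, week_late, peak_week, sigma):
    """Low and high final-death forecasts for two candidate curve positions."""
    # An earlier position on the curve means more deaths still to come.
    high = current_deaths * final_death_factor(week_early, peak_week, sigma)
    low = current_deaths * final_death_factor(week_late, peak_week, sigma)
    return low, high

# Hypothetical curve: peak at week 7, sigma of 2 weeks (assumed, not fitted).
PEAK_WEEK, SIGMA = 7.0, 2.0
low, high = forecast_range(10_000, 4.5, 5.5, PEAK_WEEK, SIGMA)
print(f"final deaths between {low:,.0f} and {high:,.0f}")
print(f"factor at the peak: {final_death_factor(PEAK_WEEK, PEAK_WEEK, SIGMA):.1f}")
```

At the peak the factor is exactly 2, because half of a symmetric curve still lies ahead. The factor grows quickly for positions earlier on the curve, which is why a one-week placement uncertainty translates into a wide forecast range.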
The results of this model for our highlighted countries and U.S. states are tabulated in the Table below for observed death rate and total deaths as of 04/01/2020 (See Post #7 for these results).
For a particular population, the lower the number of the week on the curve the further from recovery is that population and the greater is the fatality factor relative to the current total deaths. The following observations can be made:
The U.S. total fatality is projected to be between 74,365 and 391,502. The large uncertainty is because the current death rate is still on the steep part of the Gaussian curve.
China is already near full recovery so the 1-week uncertainty is literally about 12 deaths out of over 3,000.
Iran appears at the top of the death rate curve, which projects to a doubling of the current deaths as it progresses down the rate curve.
Regarding the severity in different countries, the U.S. is projected to have the largest final death count in the world, though Italy, Spain and France are projected to have greater death counts per capita (expressed as per million in the above Table).
Regarding the U.S. states, Washington is furthest along the fatality (Gaussian) rate curve and should peak shortly. New York remains in dangerous territory, still exhibiting an exponential death rate. California is progressing further along, but is still near exponential. New York is projected to have a final per capita fatality count more than 10x that of Washington and California.
There have been a number of reports of projected deaths in the news, some outlandish as they do not assume any social isolation reductions and many that include a host of variables. Our U.S. administration is now projecting 100,000 to 240,000 total deaths, which fits between our uncertainty limits. The University of Washington updates their projections nearly daily and currently forecasts the following (https://covid19.healthdata.org/projections):
U.S.: 93,531 people and 13 days from the peak death rate. This lies at the bottom end of our range and we forecast about 3 weeks from the peak.
Washington: 978 people and 7 days from the peak. This lies at the bottom end of our range and we forecast about 1.5 weeks from the peak.
New York: 16,261 people and 8 days from the peak. This is below our bottom estimate and we forecast about 2.5 weeks from the peak.
California: 5,068 people and 24 days from the peak. This is higher than our upper forecast. They apparently believe that CA is further from a peak than our estimate of 2 weeks.
There are several caveats and assumptions to this model:
Death rates may not follow a Gaussian nor do they necessarily follow a symmetric rise and fall. However, historical data, such as China for the current epidemic and data from the 1918 Spanish Flu appear to follow near Gaussian behavior.
We make no assumptions regarding social distancing, other interventions, anti-viral treatment, etc. We assume these are all embedded in the reported death rate data.
We assume reported death rates and totals are accurate. They are certainly more accurate than reported case prevalence and incidence, which are heavily dependent on testing and generally vastly understated relative to the real numbers.
This model does not have a time component to it. In fact, virus epidemics know no time. However, for convenience we have expressed the Plots above in terms of numbers that as best as we can deduce from observed data represent weeks.
The utility of this model is that it is based solely on hard data, namely deaths, and doesn’t rely on less certain variables. As each country and state moves up and over the curve, we will be able to refine the final fatality projections and reduce the uncertainty.
(4/1/20) I’m a little tardy with my weekly update for a number of reasons: (i) I was hoping to see some rolling over of the near exponential growth in the death rate so far still evident in almost every country and U.S. state last week, except China and S. Korea, and was waiting to report that, and (ii) I have a day job!
Unfortunately, the news remains grim for nearly all of the countries that we follow.
The Model for China
The Figure below shows the pattern of daily and total deaths for China. This pattern is what we hope to see in all countries. If social distancing is initiated soon enough, then one should see a peaking in daily deaths about 3 weeks later (the time from infection to death). This is a big if, and many countries are failing at it.
The spike in the data is due to a redefinition in the reporting of deaths between 2/12 and 2/14/2020.
Our Death Rate Model
The Plots below represent a model for death rate and accumulated death. For our model we assume a Gaussian (bell-curve) distribution for the pattern of increasing followed by decreasing deaths per day, and from it the total deaths (which for you math nerds is the integral of the death rate). Other distributions can be used and some may be more appropriate, but a Gaussian starts rising exponentially, followed by a near-linear region, and then rolls over to a peak, which is the behavior we expect from an epidemic. My model is primarily for tutorial purposes, but might also have some predictive capability. This should look familiar to some as it resembles the so-called “flattening the curve” discussions you have read about. Frankly, I’m not sure I subscribe to the currently trending version of the flattening-the-curve theory, which says that if you reduce human exposure you peak later but at a lower death rate. I think you must peak earlier, not later, if social distancing is working. Regardless, both theories predict a much greater peak death rate if social distancing is delayed.
According to our model and the discussion above, if social distancing is working then we should see the peak in the death rate about 3 weeks later. Sadly, once a population reaches the peak in death rate, it is only at the halfway point for total deaths, as there will be just as many deaths on the way down the curve as going up it (assuming the behavior is symmetric). You will notice that our plots resemble what was observed in China (Figure above).
One reality that needs to be stressed is that if a population delays social distancing by just a couple of weeks, it can have a profound effect on the total deaths. In our model we show two cases: social distancing starting on 3/1/20 vs. 3/15/20. We allow for the growth rate to be similar in these two cases, but the former case will show a “flattening” of the curve sooner than the latter case, which can be seen occurring about a month after 3/1/20. Assuming this model has some validity, it predicts that delaying social distancing by just two weeks will lead to a death rate peaking two weeks later and at a 5-fold higher value. The accumulated deaths are plotted on the right plot. For these two cases the total deaths are 50,000 and 250,000, respectively. [Note: This model assumes a width at half max of 6 weeks, when in fact historical trends suggest a narrower distribution. This would make the severity of a two-week delay even greater than a 5-fold factor.]
So where does the U.S., which waited until 3/16/20 to implement social distancing, fall on these plots? Well, over the past week the death rate was 3,280/week, and on 4/1/20 the number of deaths exceeded 1,000/day, now the highest in the world. That projects to at least 7,000/week by this coming week. Sadly, since the blue curve peaks at 7,000 deaths/week, and the U.S. death rate trend (Plot below) is nowhere near a peak, we are probably more closely following the orange curves. This would then project a total death count falling between the 50,000 and 250,000 levels, but much closer to the latter. We are in grave danger. If there is any solace in this conjecture, it is that we will reach a peak in early May and pronounce a recovery by June/July.
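For readers who want to check the arithmetic, here is a rough sketch of a Gaussian rate curve defined by its total deaths and its width at half max. The 6-week width and the 50,000 and 250,000 totals come from the text; everything else (function names, the use of a pure Gaussian with identical width in both cases) is an assumption for illustration and may not match how the plots were actually generated.

```python
import math

def sigma_from_fwhm(fwhm_weeks):
    """Convert full width at half maximum to the Gaussian standard deviation."""
    return fwhm_weeks / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def peak_weekly_deaths(total_deaths, fwhm_weeks):
    """Height of a Gaussian rate curve whose area equals total_deaths."""
    sigma = sigma_from_fwhm(fwhm_weeks)
    return total_deaths / (sigma * math.sqrt(2.0 * math.pi))

def weekly_deaths(week, peak_week, total_deaths, fwhm_weeks):
    """Deaths per week at a given position on the curve."""
    sigma = sigma_from_fwhm(fwhm_weeks)
    peak = peak_weekly_deaths(total_deaths, fwhm_weeks)
    return peak * math.exp(-((week - peak_week) ** 2) / (2.0 * sigma ** 2))

FWHM_WEEKS = 6.0  # width at half max stated in the note above
early_peak = peak_weekly_deaths(50_000, FWHM_WEEKS)
late_peak = peak_weekly_deaths(250_000, FWHM_WEEKS)
print(f"early distancing peak: {early_peak:,.0f} deaths/week")
print(f"late distancing peak:  {late_peak:,.0f} deaths/week")
print(f"ratio of peaks: {late_peak / early_peak:.1f}")
```

Because both curves share the same width, the peak height scales directly with the total, so a 5-fold higher peak and a 5-fold higher total are the same statement. Under these assumptions the early case peaks at roughly 7,800 deaths per week; exact agreement with the plotted curves is not expected since their parameters are not given.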
I solicit critiques on this model and/or referrals to other models by the presumed many experts out there.
Weekly Statistics Around the World
OK, now for more grim statistics. My blog of 3/23/20 (Blog #6) made forecasts for deaths and new cases using not an exponential, but a binomial, function, which I believed would be more reasonable for allowing for some slowing from social distancing. I was very wrong, and most countries continued on their exponential death growth rate. New cases may already be tailing off, but we won’t know that for sure until it shows up in the recorded deaths 2-3 weeks later. Given that we are about 2-3 weeks into social distancing in most countries (ours, at least, started 3/16), I had expected some tailing and maybe peaking by 4/1/20. This is not generally seen, certainly not yet in the U.S., but there are some encouraging signs in other countries. Below are plots of growth rates for the most seriously affected countries.
Key observations are:
The U.S. death rate continues to grow at an alarming rate. Other countries not showing a reduction in the growth rate include France and the U.K.
Spain may be showing signs of a rate decrease, but it is too soon to tell.
Iran particularly and to a lesser extent Italy are showing signs of rolling over on the death rate curve.
S. Korea and China are of course models for us all in being well past their death rate peaks. We should be keeping an eye on whether they have a death rebound at some point so we can learn from that.
In the Bar Chart below we look at growth rate statistics in a visual way by showing how countries are doing in reducing their death rate by plotting the percentage change for one week over the next. We also place a qualitative symbol for countries in control (green), starting to control (yellow), and still no evidence of control (red).
Late breaking: Here is a better way to visualize trends in death rates. (I owe this to a great video that you should watch: https://www.youtube.com/watch?v=54XLXg4fYsc, though it plots cases, a less precise measure, rather than deaths.) By plotting death rate vs. cumulative deaths on a log-log plot, one can immediately see deviations from exponential growth, which mark the start of recovery.
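Why this works: under steady exponential growth the daily deaths are a fixed fraction of the cumulative deaths, so the points fall on a straight line of slope 1 on log-log axes, and any slowing bends the points below that line. The sketch below illustrates this with two synthetic series (not real data); the function names are ours and a simple least-squares slope stands in for reading the plot by eye.

```python
import math

def loglog_points(cumulative):
    """Pair each day's cumulative deaths with that day's new deaths."""
    points = []
    for prev, curr in zip(cumulative, cumulative[1:]):
        daily = curr - prev
        if daily > 0 and curr > 0:  # logarithms need positive values
            points.append((curr, daily))
    return points

def loglog_slope(points):
    """Least-squares slope of log(daily deaths) against log(cumulative deaths)."""
    xs = [math.log(c) for c, _ in points]
    ys = [math.log(d) for _, d in points]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic series for illustration only: steady doubling vs. slowing growth.
exponential = [2 ** day for day in range(12)]
slowing = [day ** 2 for day in range(1, 13)]
slope_exponential = loglog_slope(loglog_points(exponential))
slope_slowing = loglog_slope(loglog_points(slowing))
print(f"exponential series slope: {slope_exponential:.2f}")
print(f"slowing series slope:     {slope_slowing:.2f}")
```

A slope near 1 means growth is still exponential; a slope clearly below 1 is the deviation from the line that signals a population starting to turn the corner.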
Key observations are:
Similar conclusions are reached compared to the bar chart above.
China and S. Korea have significantly decelerated their death rates. S. Korea, as noted in earlier blogs, never really reached exponential growth, having gotten well ahead of the epidemic’s usual pattern of growth.
Iran seems clearly on the path to recovery (also evident in the daily death rate plots above) and there is a hint now over a week that Italy may be starting to roll over on the death rate curve.
The U.S., Spain, France, and U.K. are still showing approximately exponential growth.
(3/24/20) Still no evidence of a turnover in the death and infection rate, but we’re only one week into serious social distancing in the U.S.
In this post we update our death and prevalence/incidence statistics of a week ago (Post 3). This post has a lot of data so don’t get intimidated or bogged down in the detail. I will summarize the key points and refer you to where in the Tables and Figures to look. At the end of this post are plots of the cumulative reported deaths each day for eight countries and a description of how they lead to the values in the Table below.
– Reported values are highlighted in light yellow.
– Forecast is based on low and high estimates from binomial (low) and trinomial (high) functions fitted to the cumulative death plots (below).
– Prevalence for 3/23/20 accounts for recovery, which is approximated by prevalence 3 weeks ago (3/2/20).
– Mortality rate is assumed to be 1% for the U.S., China, and S. Korea; 1.5% for France, U.K. and WW; and 2% for Italy, Spain, and Iran. These values are estimated based on the quality of healthcare at this time.
Before we begin our worldwide assessment from the above Table, let’s look at the conditions in the U.S. as exemplified by the plots below.
The binomial curve fit (dotted lines) may understate the death forecast, so we use this as a lower limit.
In just one week the cumulative deaths in the U.S. have increased from 69 (3/16/20) to 471 (3/23/20). This is nearly a 7-fold increase and represents a doubling about every 2.5 days. Further the daily rate is increasing at a staggering clip with a daily death rate now exceeding 100. With these trends in mind we can now summarize the above Table. Key observations are:
The number of deaths in the U.S. is forecasted to exceed 1,000 over the next week and if no downtrend in the incidence of COVID-19 occurs due to social distancing and other government interventions, the death rate will double each week for another two weeks.
Acceleration of death rates is still occurring for the 8 countries tracked here except for China and S. Korea, who appear to have successfully contained the outbreak.
The above Table also shows deaths per million people, with numbers ranging from 90.6 for Italy down to 1.4 for the U.S.
Spain is accelerating at the rate of Italy, but delayed by 11 days. This is cause for great concern.
Because social distancing started in earnest about a week ago, we expect to see a deceleration of deaths in 1-2 weeks (which assumes that infection precedes death by 2-3 weeks; we use 3 weeks in our models).
The Table above gives prevalence calculated for 3 weeks ago from the cumulative deaths today, assuming the mortality rates in the footnote to the Table. Based on the death growth rates we then forecast prevalence today, correcting for recoveries, which are assumed to be close to the 3-week-old prevalence numbers. Most troubling are Italy and Spain, where we forecast an infection rate of about 1 per 60-70 people. This high density of infections will make social isolation less effective than for other countries.
Also shown is the reported prevalence of active cases. It can be seen that our forecasts based on reasonable assumptions are typically about a factor of 10x greater than reported, except for China and S. Korea. This indicates that lack of testing is a serious shortcoming to understanding the true extent of the epidemic, and reported case values should simply be ignored as not being connected with reality. All forecasts need to be connected to hard data, namely deaths.
The incidence forecast similarly is a few to >10x greater than reported new cases for the same reason described in the above bullet.
The big question is whether social distancing and other government interventions are working outside of China and S. Korea. Depending on the fortitude of each nation to adhere to these strict measures, we should see improvement. Turnover in new cases, due to social distancing, may already be occurring, but it won’t show up in the data for new cases because of the backlog of existing cases that have yet to be confirmed by tests. In fact, the term new cases is a misnomer because they mostly represent new detection of old cases. There are no reliable leading indicators or current measures to tell us whether we are succeeding. We must wait 2-3 weeks for the death statistics to show this; we are about 1 week into that wait for the U.S. and maybe a little more for Italy.
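As a quick check on the doubling time quoted above for the U.S. (69 cumulative deaths on 3/16/20 rising to 471 on 3/23/20), the calculation can be written out as follows. This sketch assumes steady exponential growth between the two dates; the function name is ours.

```python
import math

def doubling_time(start_count, end_count, days):
    """Days needed to double, assuming steady exponential growth between the two counts."""
    return days * math.log(2.0) / math.log(end_count / start_count)

fold_increase = 471 / 69          # cumulative U.S. deaths, 3/16/20 to 3/23/20
us_doubling = doubling_time(69, 471, 7)
print(f"fold increase over the week: {fold_increase:.1f}x")
print(f"doubling time: {us_doubling:.1f} days")
```

The result is a little under a 7-fold increase and a doubling time of about 2.5 days, in line with the figures stated in the text.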
Conclusions:
Our analysis, based on death statistics and trend analysis, provides a more realistic assessment of the scope of the COVID-19 epidemic than reported cases, which vastly understate the true prevalence.
At this time, we do not yet see any evidence of deceleration of deaths and therefore incidence. It may be happening, but we are still 1-2 weeks too early to see it in the death statistics.
The success achieved by China and S. Korea gives us hope that containment will ensue throughout most of the world.
The dotted curve is the extrapolation of the reported deaths up to three weeks ahead of present. The function shown is a binomial equation, which accelerates less severely than an exponential function. Our eyes tell us that the binomial may not be fully capturing the acceleration and exponential may be better, but we are confident the rate will start to subside due to extensive government intervention and isolation. Still we consider the binomial extrapolation to represent a lower limit and we use a trinomial equation (steeper acceleration) for the upper limit and then take the average [geometric mean, sqrt(low x high)].
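The low/high/average procedure can be sketched as below. Two things here are assumptions rather than statements about the actual spreadsheet: we read "binomial" as a second-order polynomial and "trinomial" as a third-order polynomial, and the death series is synthetic, generated only to have something to fit. The small least-squares solver is written out so the example has no dependencies.

```python
import math

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients c[0] + c[1]*x + ... via normal equations."""
    n = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def polyval(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def forecast(days, deaths, target_day):
    """Low (2nd order), high (3rd order) and geometric-mean forecasts."""
    low = polyval(polyfit(days, deaths, 2), target_day)
    high = polyval(polyfit(days, deaths, 3), target_day)
    return low, high, math.sqrt(low * high)  # assumes both forecasts are positive

# Synthetic cumulative deaths (illustration only): 30% growth per day.
days = list(range(10))
deaths = [10 * 1.3 ** d for d in days]
low, high, mid = forecast(days, deaths, target_day=16)
print(f"low {low:,.0f}  high {high:,.0f}  geometric mean {mid:,.0f}")
```

The geometric mean is a natural choice when the low and high estimates differ by a multiplicative factor, since it sits the same factor above the low estimate as it sits below the high one.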
(3/22/20) Washington State has gotten control, California is making progress, but New York is concerning
The United States has the 6th largest death rate among countries in the world due to the corona virus, but how are we doing? Let me say again that deaths are a lagging indicator since these events represent infections 2-3 weeks ago. But it is the only hard data we have, so our strategy is to monitor the trend in deaths in order to extrapolate the death and the prevalence from 2-3 weeks ago to today and further into the future. The U.S. looks to be spiraling out of control in terms of accelerating number of deaths (which again reflects incidence of infections 2-3 weeks ago) and number of confirmed cases (which is meaningless because it mostly represents the amount of testing and not new cases). Let’s look at the Plots below for what we know in terms of cumulative deaths and death rates (daily) for the three most affected states, Washington, New York, and California.
Total death and death rates with binomial fits representing high end of forecast.
These data show accelerating (increasing curves) deaths for NY and CA, but less so for WA. The equations that are fitted to the data, in order to forecast to the present and the future with regard to actual prevalence of cases (vs. reported confirmed cases), are binomial equations. This represents our upper limit for forecasting and a linear fit (not shown) represents our lower limit. The Table below summarizes death totals and rates today and what we forecast for up to 3 weeks from now. We also present a calculation of prevalence and incidence 3 weeks ago reflecting death statistics today and then calculate prevalence and incidence today.
The model forecast for prevalence and incidence does not include recoveries, which are small for an accelerating epidemic, but significant for a leveling or declining epidemic.
The key observations are as follows:
Total deaths are calculated as the geometric mean of the low and high estimates discussed above [sqrt(low x high)]. The trend is increasing approximately linearly in WA, but accelerating significantly in NY and CA, with NY totaling about a factor of 4-5 greater than CA.
The margins of uncertainty for total deaths are also given in the Table, and you can see they are smallest for WA and greatest for NY and, not surprisingly, increase with time into the future.
The calculated prevalence is significantly greater than the reported confirmed cases that we all read about. Our calculation of prevalence is based on a 1% mortality rate in NY and CA, but 3% in WA given that we know the vast majority of deaths there were for the elderly.
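The prevalence calculation behind these observations amounts to dividing deaths to date by an assumed mortality rate, which gives the number of infections that must have existed about 3 weeks earlier. A minimal sketch follows; the 1% and 3% rates are from the text, while the death count and population are hypothetical round numbers chosen for illustration and the function names are ours.

```python
def implied_infections(cumulative_deaths, mortality_rate):
    """Infections about three weeks ago implied by deaths to date."""
    return cumulative_deaths / mortality_rate

def one_in_n(infections, population):
    """Express infections as '1 in N' people."""
    return population / infections

# Hypothetical inputs for illustration: 100 deaths in a population of 7.5 million.
DEATHS, POPULATION = 100, 7_500_000
for rate in (0.01, 0.03):
    infected = implied_infections(DEATHS, rate)
    print(f"mortality {rate:.0%}: about {infected:,.0f} infections, "
          f"or 1 in {one_in_n(infected, POPULATION):,.0f} people")
```

The same death count implies three times fewer infections at a 3% mortality rate than at 1%, which is why the choice of rate for a population with mostly elderly deaths matters so much to the calculated prevalence.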
Conclusions:
Reported confirmed cases of prevalence and incidence are misleading indicators as they represent a fraction of total and new cases.
WA appears to be containing their epidemic.
NY and CA are increasing at similar rates; however, CA is at a level of about 20-25% of NY and therefore will have an easier time containing the epidemic than NY because the probability of exposure is proportionately less.