This coronavirus model keeps being wrong. Why are we still listening to it?

White House Coronavirus Task Force coordinator Deborah Birx points to a model that estimates coronavirus cases and deaths in the US on March 31. | Mandel Ngan/AFP via Getty Images

A model that the White House has relied on has come under fire for its flawed projections.

How many people are likely to die in the United States of Covid-19? How many hospital beds is the country going to need? When will case numbers peak?

To answer those questions, many hospital planners, media outlets, and government bodies — including the White House — relied heavily on one particular model out of the many that have been published in the past two months: the University of Washington’s Institute for Health Metrics and Evaluation (IHME).

The model first estimated in late March that there’d be fewer than 161,000 deaths total in the US; in early April, it revised its projections to say that the total death toll through August was “projected to be 60,415” (though it acknowledged the range could be between 31,221 and 126,703).

The model has been cited often by the White House and has informed its policymaking. But it may have led the administration astray: The IHME has consistently forecast many fewer deaths than most other models, largely because the IHME model projects that deaths will decline rapidly after the peak — an assumption that has not been borne out.

On Wednesday, the US death count passed the 60,000 mark that the IHME model had said was the likely total cumulative death toll. The IHME on April 29 released a new update raising its estimates for total deaths to 72,433, but that, too, looks likely to be proved an underestimate as soon as next week. Even its upper bound on deaths — now listed as 114,228 by August — is questionable, as some other models expect the US will hit that milestone by the end of May, and most project it will in June.

One analysis of the IHME model found that its next-day death predictions for each state fell outside its 95 percent confidence interval 70 percent of the time; a well-calibrated 95 percent interval should miss only about 5 percent of the time. That’s not great! (A recent revision by IHME fixed that issue; more on this below.)

This track record has led some experts to criticize the model. “It’s not a model that most of us in the infectious disease epidemiology field think is well suited” to making projections about Covid-19, Harvard epidemiologist Marc Lipsitch told reporters.

But if that’s the case, how has it risen to such prominence among policymakers? Other models have done better than IHME at predicting the course of the epidemic, and many of them use approaches that epidemiologists believe are more promising. Yet it’s the IHME model that has generally guided policymakers, for the most part in the direction of focusing on a return to normal.

One potential explanation for its outsize influence: Some of the factors that make the IHME model unreliable at predicting the virus may also have made people pay attention to it. For one thing, it’s simpler than other models. That simplicity means it can be applied in ways more complicated models cannot, such as providing state-level projections (something state officials really wanted) that other modelers acknowledged they didn’t have enough data to offer.

Meanwhile, its narrow confidence intervals for state-by-state estimates meant it had quotable (and optimistic) topline numbers. A confidence interval is the range within which the model is highly confident the true value will fall. A narrow range that gives “an appearance of certainty is seductive when the world is desperate to know what lies ahead,” a criticism of the IHME model published in the Annals of Internal Medicine argued. But publishing such precise numbers and curves “suggests greater precision than the model is able to offer.”

The criticism of the IHME model, and an emerging debate over epidemiology models more broadly, has brought to light important challenges in the fight against the coronavirus. Good planning requires good projections. We’ll need models to help predict resurgences and spot a potential second wave. Dissecting what the IHME model got wrong, what other models got right, and how the public and policymakers read these models is essential work if we want to create the best pandemic plans possible.

What’s wrong with the IHME model of the coronavirus?

Models of disease spread are meant to help decision-makers in situations where math outperforms intuition. Most people have a hard time thinking about exponential growth, which is why many were taken by surprise in March as the virus spread. That is exactly the kind of blind spot models can correct for.
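
To see why exponential growth trips up intuition, consider a minimal sketch in Python, using made-up numbers rather than real case data, of an outbreak whose case count doubles every three days:

```python
# Illustrative only: a hypothetical outbreak doubling every 3 days.
cases = 100          # starting case count (assumed)
doubling_time = 3    # days per doubling (assumed)

for day in range(0, 31, 3):
    print(f"day {day:2d}: ~{int(cases * 2 ** (day / doubling_time)):,} cases")
# 100 cases on day 0 grows to roughly 102,400 cases by day 30.
```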

Models usually lay out some foundational assumptions and offer projections based on those assumptions. The IHME model seeks to project death rates and hospitalization rates assuming widespread social distancing and strong measures to prevent the spread of the virus. Since testing is so unreliable at identifying the infection rate, the model uses only death rates as data (though it’s worth noting that evidence from all-cause mortality data suggests we’re undercounting coronavirus-caused deaths, too).

That projected hospitalization rate was one of the things that set it apart, IHME researcher Ali Mokdad told me. It was one of the few models that offered that projection, and it was actionable information that governments could grab on to and plan around.

Another reason it rose in prominence was IHME’s decision to model the effects of strong social distancing measures. That choice proved correct — strong measures were indeed taken across the United States — and the lower death numbers the model churned out as a result reportedly led to it being received favorably by the Trump administration.

But as the weeks have passed, it has become clear that the IHME’s projections have been too optimistic and slow to adjust to the fact that deaths have plateaued rather than rapidly declined toward zero. The IHME has been regularly updating its model as new data comes in, but the updates have often lagged enough that the numbers are absurd by the time they’re changed. For example, in late April the model still stated that the expected total death toll was 60,000, even as the US was clearly only a few days from that milestone.

Mokdad told me when we talked that a fix was in the works, and it went up a few days later: The model now projects 72,433 deaths by August. That, too, is probably an underestimate — most other models project that total will be reached next week.

In the IHME’s defense, it does offer a 95 percent confidence interval that is more accurate than the topline numbers. That range goes from 59,343 — fewer than the number of people who have already died of the virus — to 114,228. That might sound like a wide range, but it’s still optimistic, and the actual toll is on track to land outside it entirely. MIT projects the US will surpass that upper bound in mid-June, and the Los Alamos forecast, like the MIT one, expects the country to burst through the IHME’s 95 percent confidence interval in around six weeks.

That’s one of the core complaints about the IHME model: Its confidence intervals seem too narrow, both for its next-day and its several-months-ahead predictions. For instance, when the model was used to predict the number of deaths in a state the very next day (which doesn’t require any complex modeling of the long-term effects of uncertain policies), researchers found that the true death counts fell outside its 95 percent confidence interval 70 percent of the time. Those are embarrassingly poor results.
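
To make that statistic concrete, here is a minimal sketch of the calibration check the researchers describe, written in Python with made-up placeholder numbers rather than actual IHME output: count how often the observed next-day death tally lands inside the published 95 percent interval. A well-calibrated interval should capture the truth roughly 95 percent of the time.

```python
# Hypothetical (lower bound, upper bound, observed deaths) triples,
# one per state-day; placeholder values, not real IHME predictions.
predictions = [
    (40, 55, 62),
    (10, 18, 14),
    (70, 90, 110),
    (5, 9, 12),
]

inside = sum(lo <= obs <= hi for lo, hi, obs in predictions)
print(f"empirical coverage: {inside / len(predictions):.0%}")  # well calibrated would be ~95%
```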

The day-to-day numbers are jumpy because some counties don’t report numbers every day, Mokdad told me. “The model is kind of confused because states are not reporting deaths consistently,” he said, adding, “The model assumes deaths will increase and then come back down,” and so it reacts poorly when deaths instead vary day to day. Smoothing over the course of a week — the latest update the IHME team made — means the model should be more predictive.
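
As a rough illustration of the kind of weekly smoothing Mokdad describes (my own sketch with synthetic numbers, not the IHME’s code), a seven-day moving average damps the sawtooth created by uneven weekend and weekday reporting:

```python
import numpy as np

# Synthetic daily death reports with weekend reporting dips (not real data).
reported = np.array([30, 5, 0, 80, 45, 40, 10, 35, 3, 0, 90, 50, 42, 12])

# Average each full 7-day window of reports.
window = 7
smoothed = np.convolve(reported, np.ones(window) / window, mode="valid")
print(np.round(smoothed, 1))  # a much flatter series for the model to fit
```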

Indeed, the latest update to the model solves the problem of next-day deaths usually falling outside the confidence interval by making those intervals extremely wide. That’s commendable. It is better to be honest about your extremely high uncertainty than to claim certainty you don’t have.

That might lead to unsatisfying models with extremely wide ranges — 176 to 3,885 deaths today — but if that’s an accurate reflection of the state of uncertainty, then so be it. It’s better to have a wide confidence interval, acknowledging your uncertainty, than a narrow one that is usually wrong. And if a model’s next-day predictions are usually wrong, as IHME’s were, that undermines confidence in its long-term predictive value.

(But while the IHME has made its day-to-day confidence intervals larger, it still has very, very narrow confidence intervals for its projections several weeks into the future — which is odd, as we should be even more uncertain about those.)

An even bigger issue with the IHME model is that the way it is published can obscure its problems. When it’s updated, it can be hard to see what its old, superseded predictions were. A website, covid-projections.com, has been set up so that you can look at the predictions made by old versions of the IHME model (and at the history of other models). The IHME’s past projections are frequently fairly far off.

The IHME model is unusual compared to other epidemiological models in its design, too. While most models use standard epidemiological tools like SEIR (susceptible-exposed-infected-recovered) modeling or computer simulations, the IHME model is effectively an exercise in fitting a curve derived from early data in China and Italy to the disease’s trajectory elsewhere.
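
To illustrate what that kind of curve fitting looks like, here is a minimal sketch (my own, not the IHME’s actual code) that assumes cumulative deaths follow a symmetric S-shaped curve, a scaled Gaussian error function, and fits its parameters to an observed series. The assumed shape, not the epidemiology, then dictates the forecast: daily deaths fall after the peak roughly as fast as they rose, which is exactly the rapid post-peak decline that has not been borne out.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

# Assumed functional form: a symmetric S-curve for cumulative deaths.
# total = eventual death toll, peak_day = inflection point, spread = curve width.
def cumulative_deaths(t, total, peak_day, spread):
    return 0.5 * total * (1 + erf((t - peak_day) / spread))

# Synthetic "observed" data generated from the same curve plus noise.
np.random.seed(0)
days = np.arange(60)
observed = cumulative_deaths(days, 60000, 40, 12) + np.random.normal(0, 500, 60)

params, _ = curve_fit(cumulative_deaths, days, observed, p0=(50000, 35, 10))
total, peak_day, spread = params
print(f"fitted total deaths ~ {total:,.0f}, peak around day {peak_day:.0f}")
```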

The IHME model is based “on a statistical model with no epidemiologic basis,” the Annals of Internal Medicine critique argues.

The IHME team has defended its model against those complaints.

“We’re willing to make a forecast. Most academics want to hedge their bets and not be found to ever be wrong,” IHME director Christopher Murray told Politico. “That’s not useful for a planner — you can’t go to a hospital and say you might need 1,000 ventilators, or you might need 5,000.”

The model has gotten better, but flaws remain

The change to the model has certainly improved it, but some problems with the confidence intervals remain. The new model acknowledges extremely high uncertainty about what will happen tomorrow — in my state, California, it says there will be between 5 and 103 deaths tomorrow. That’s a really wide range.

But on May 20, the model is entirely sure there will be zero deaths. The 95 percent confidence interval runs from zero deaths to … zero deaths.

The IHME’s model of deaths in California reports lots of uncertainty about what will happen tomorrow but extremely high confidence about what will happen in late May.

That’s an extremely strong claim. I pressed the IHME team about whether they were sure of it. Mokdad said that the model’s zero-deaths predictions were correct: “Based on the graph, in certain states, yes — in California, May 17, zero. The virus is not circulating anymore; you would expect it to go to zero.”

But the virus is still circulating in California. On April 28, there were 1,187 new cases reported in the state. Even if all infections in California stopped instantly today, it is likely that some already infected people would die in mid-May or later.

California’s case numbers (like those of many states) are declining, but only slowly, as social distancing has limited the spread of the virus but not fully stopped it. That ties into another problem with the IHME model: It assumes that social distancing measures, once put in place, are always sufficient to rapidly decrease case numbers to zero.

In the report explaining the model, the researchers write that they look at four measures: “School closures, non-essential business closures including bars and restaurants, stay-at-home recommendations, and travel restrictions including public transport closures. Days with 1 measure were counted as 0.67 equivalents, days with 2 measures as 0.334 equivalents and with 3 or 4 measures as 0.”

In other words, the model has a built-in assumption that once three of those measures have been put into place, cases will rapidly fall to zero. No new data can change that assumption, which is why the model continues to project zero deaths by mid-May in any area that hasn’t lifted social distancing restrictions, even though case numbers have only plateaued rather than declined in many areas.
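
Rendered as code, that scoring rule looks something like the sketch below (my own reading of the quoted passage; the value of 1 for days with no measures is my assumption, implied but not stated in the quote). The key point is that once three measures are in place the covariate drops to zero and stays there, so no amount of contrary case data can pull it back up.

```python
# A sketch of the quoted scoring rule for social distancing measures.
def distancing_covariate(measures_in_place: int) -> float:
    if measures_in_place >= 3:
        return 0.0                      # treated as "full" distancing
    # Value for zero measures is assumed; 1 and 2 come from the quoted report.
    return {0: 1.0, 1: 0.67, 2: 0.334}[measures_in_place]

print([distancing_covariate(n) for n in range(5)])  # [1.0, 0.67, 0.334, 0.0, 0.0]
```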

There are many similar frustratingly confident predictions: On July 12, the model says, we’ll need between 0 and 12 hospital beds in the whole US for coronavirus patients, and there will be 0 deaths at any point after July 12.

Why the IHME model’s problems shouldn’t be used as an indictment of epidemiology more broadly

All these concerns make it frustrating to many epidemiologists that the IHME model is the one being widely used and cited. “That the IHME model keeps changing is evidence of its lack of reliability as a predictive tool,” epidemiologist Ruth Etzioni told Stat News in an article about the problems with the model. “That it is being used for policy decisions and its results interpreted wrongly is a travesty unfolding before our eyes.”

Alex Merz, a microbiologist at the University of Washington’s School of Medicine, has written that “@IHME_UW’s overly optimistic modeling projections contributed to this debacle” of the coronavirus’s rapid spread in the US, and condemned their “amazing shrinking error band that, preposterously, constricts to zero uncertainty in mid-June. That is not only bad science communication — it is bad science.”

The IHME model has also brought the discipline under fire from other fields. Tyler Cowen, whose widely read blog Marginal Revolution has linked to many of the papers demonstrating how poorly the IHME model works, argued that “now really is the time to be asking tough questions about epidemiology, and yes, epidemiologists.”

That’s not fair, many epidemiologists say — the model doesn’t use any of the standard tools of the discipline, and they hate it too.

“The IHME approach departs from classic epidemiological modeling,” epidemiological modelers at the University of Texas Austin argued. “Rather than using systems of equations to project the person-to-person transmission of the virus, the [IHME] model postulates that COVID-19 deaths will rise exponentially and then decline in a pattern that roughly resembles a bell curve (i.e., normal distribution).” (Their paper puts forward their own attempt at modeling the disease with a curve-fitting approach, but one tweaked to use better data and to better represent uncertainty.)
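
For contrast, here is a minimal sketch of the “systems of equations” approach the Texas modelers mention: a bare-bones SEIR model in which the epidemic’s downslope emerges from the transmission dynamics rather than from an assumed curve shape. The parameter values are illustrative, not calibrated to Covid-19.

```python
import numpy as np
from scipy.integrate import solve_ivp

# SEIR compartments: Susceptible, Exposed, Infectious, Recovered.
# beta = transmission rate, sigma = 1/incubation period, gamma = 1/infectious period.
def seir(t, y, beta, sigma, gamma):
    S, E, I, R = y
    N = S + E + I + R
    new_infections = beta * S * I / N
    return [-new_infections,                 # S: people becoming exposed
            new_infections - sigma * E,      # E: incubating
            sigma * E - gamma * I,           # I: infectious
            gamma * I]                       # R: recovered (or deceased)

# Illustrative run: one infectious person in a population of 10,000.
sol = solve_ivp(seir, (0, 180), [9999, 0, 1, 0],
                args=(0.4, 1 / 5, 1 / 10), t_eval=np.arange(181))
print(f"peak infectious count: {sol.y[2].max():.0f}")
```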

So it might not be fair to draw conclusions about the field as a whole from one model that mostly avoided using its standard tools.

Moreover, we should be aware that there are better models. The Imperial College model that the British government relied on to inform the country’s coronavirus strategy has held up reasonably well, with case numbers loosely tracking the model’s predictions for what would happen if social distancing were implemented (as it was shortly after the model was published). It, too, has been frequently revised in response to new data and has come under criticism for overconfidence, but the inaccuracies are smaller and less pervasive, and the initial numbers before any revisions weren’t that far off.

Other models, employing more standard epidemiological approaches, perform even better — though usually in narrower domains, like trying to project just the peak of the outbreak, or just the rate of new cases.

The IHME team says its model performed pretty well on key problems like predicting the peak in most states (it’ll be clear in coming weeks how true that is) and that it is continuing to revise the model to make it better. Meanwhile, as the debate over epidemiological models heats up, it’s worth keeping in mind that holding the shortcomings of the IHME model against epidemiology as a whole isn’t fair.

We don’t have all the answers

But that still leaves the question: Why did the IHME model become so popular?

The flaws I’ve noted above were also features that made the model appealing when it was first launched. It was optimistic, projecting lower deaths than other models. It was clear and precise, with narrow confidence intervals. It projected hospitalizations, which few others were doing — though those projections turned out to be wrong because we didn’t know enough to project hospitalizations well at that stage. In a time of uncertainty, the IHME model was compelling.

But it turns out the uncertainty being reflected in a lot of other, better models is showing up for a reason — there really is still a lot we don’t know about the course this disease will take.

As my colleague Matt Yglesias has written, “We need to value scientists and listen to experts, but part of listening means understanding that right now, what they’re saying is that they do not have all the answers.”

Given the enormous uncertainty we’re facing, responsible scientists are avoiding giving dramatic topline numbers that they’re unsure of, emphasizing the very wide confidence intervals on their estimates, and being careful not to publish results that the Trump administration or the public may interpret as definitive.

But people keep searching for definitive answers (understandably so!), and so any model that is presented more confidently will rise to prominence over models that are humbler and better reflect our confusion.

The IHME model is one of several released by researchers at the University of Washington, but it has been far more widely cited and discussed than the others — despite being less accurate.

Ultimately, the problem may not be that some models are inaccurate. With the situation as confusing as it is, it was predictable that some would be. There are dozens of models, and ideally we’d be doing something like aggregating them, weighting each according to how well it has performed so far at predicting the crisis.
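
A minimal sketch of what that aggregation could look like (my own illustration, with hypothetical numbers): weight each model’s projection by the inverse of its recent forecast error, so the models that have tracked reality best count the most.

```python
import numpy as np

# Hypothetical projections from three models and their recent relative errors.
forecasts = np.array([75_000, 110_000, 135_000])   # projected total deaths
recent_errors = np.array([0.40, 0.15, 0.20])       # lower = more accurate lately

weights = (1 / recent_errors) / (1 / recent_errors).sum()
print(f"weights: {np.round(weights, 2)}")
print(f"ensemble projection: {weights @ forecasts:,.0f} deaths")
```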

Our current process is almost exactly the opposite of that. Most models are nearly ignored, while a few are cited by the White House and widely referred to in the press. The process that makes some models prominent is not a process that picks out the best of the best — and in fact, it may actively be picking out worse models, by emphasizing ones with notable numbers and statistics to cite, by leaning into the seductive “appearance of certainty” that researchers have warned us about.

In other words, the fact that many epidemiological models are performing badly isn’t great, but it could be part of a productive process of arriving at better models. But the media, policymakers, and the public need to be conscious of what kind of models we gravitate to and why. If we elevate the ones we like and quote numbers from them as if they’re definitive, then the models will certainly end up shedding more heat than light.

