Two Errors I’ve Made Interpreting COVID-19 Data That Others Are Also Making

I’ve made two errors in reading COVID-19 numbers, and I see others making the same ones. I suspect many people aren’t noticing them, because I see few comments pointing them out, so I hope this helps people make better sense of the numbers we’re seeing. The errors I made were:

  • Mixing up the case fatality rate (CFR) and the infection fatality rate (IFR).
  • Misreading a plot: I saw a drop-off in deaths without realizing it was an artifact of a delay in reporting.

Case Fatality Rate and Infection Fatality Rate

The first error is confusing the CFR with the IFR and then comparing one to the other. This article, well worth reading, goes into estimating the IFR, and starts with a brief description of the IFR:

A systematic review and meta-analysis of published research data on COVID-19 infection-fatality rates

Here are two quotes from the original post, before it got pre-published:

“We can quite quickly get very good estimates of the case-fatality rate, which is the rate of death in people who have tested positive for coronavirus, but the one thing we are very sure of now is that we aren’t catching every case of the disease. It’s also unlikely that most places are capturing the true figure of people who have died from COVID-19, which means that both our denominator and numerator are suspect…

“All of this makes the infection-fatality rate very hard to know. This is the rate at which people die when they are infected with the disease, including all of those mild and/or asymptomatic cases that you may have heard about. Some people have prominently claimed that this number is likely to be similar to the rate of death due to influenza — about 0.1% — while others have said that it’s probably 10 or 20 times that.”

In a blog post of my own, I made both of the errors he describes: comparing the CFR for the flu to an estimated IFR for COVID-19, and estimating an IFR at all, based on the USC study. (The error has since been corrected.)
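To make the difference concrete, here is a minimal sketch with made-up numbers. The death, case, and infection counts are assumptions chosen only for illustration and do not come from any study. The same number of deaths gives very different rates depending on whether the denominator is confirmed cases (CFR) or estimated total infections (IFR).

```python
def fatality_rate(deaths: int, denominator: int) -> float:
    """Return deaths divided by the chosen denominator, as a fraction."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return deaths / denominator


# Hypothetical numbers, for illustration only.
deaths = 100
confirmed_cases = 2_000        # people who tested positive
estimated_infections = 20_000  # includes mild and asymptomatic infections

cfr = fatality_rate(deaths, confirmed_cases)
ifr = fatality_rate(deaths, estimated_infections)

print(f"CFR: {cfr:.1%}")  # CFR: 5.0%
print(f"IFR: {ifr:.1%}")  # IFR: 0.5%
```

With these assumed numbers the CFR is ten times the IFR, which is why setting a CFR for one disease next to an IFR for another can mislead.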

Misinterpreting a Chart and Seeing Good News When It Doesn’t Yet Exist

I often visit the LA County COVID19 Dashboard to see the latest stats.

One day, I was looking at the daily deaths, and saw “0” deaths. So I posted about it to Facebook. I was so happy to see that number!

I missed the text below the chart: “Recent dates are incomplete due to lags in reporting. The gray box corresponds to dates that are likely to not yet be reported completely.”

Look at that drop-off on May 7 through 10! Wow. Too bad it’s not real.

The gray background just shows a lack of data. The death and case counts are only low because the data hasn’t come in yet.
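A small simulation shows how a reporting lag alone can produce this kind of drop-off. This is only a sketch under invented assumptions: the true death count is held constant at 10 per day, and the reporting schedule in `REPORTED_AFTER_DAYS` is made up, not taken from the LA County dashboard.

```python
# Assumed for illustration: exactly 10 people die every day, and of each
# day's 10 deaths, this many have been reported after 0, 1, 2, ... days.
TRUE_DAILY_DEATHS = 10
REPORTED_AFTER_DAYS = [0, 2, 5, 8, 9, 10]


def observed_deaths(days_ago: int) -> int:
    """Deaths visible on the chart today for a date `days_ago` days back."""
    index = min(days_ago, len(REPORTED_AFTER_DAYS) - 1)
    return REPORTED_AFTER_DAYS[index]


# Oldest date first, today last.
series = [observed_deaths(d) for d in range(7, -1, -1)]
print(series)  # [10, 10, 10, 9, 8, 5, 2, 0]
```

The true count never changes, yet the chart as seen today slopes down to 0 for the most recent day.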

That Facebook post was pretty embarrassing.

As lame as I felt about that error, I doubt I felt as bad as this guy will. He has a more impressive resume, and makes a bolder claim.

The original chart is at https://adamaltmejd.se/covid/.

The chart appears to show a steep drop-off in deaths. You need to read the fine print above: the gray bars on the right are the predicted deaths.

The colored bars show how long it took to report the deaths.

Compare the values in the legend to the colors in the bars. You’ll notice that the lags shown correspond to the days before the final day in the chart.

For example, the yellow bar, which shows a 3-4 day lag, only shows up on dates at least 3 days before the final day. The light blue bar, which shows a 7-10 day lag, only shows up on dates at least 7 days before the final day.

Each bar shows the total deaths for one day, and its colored segments break that total down by how long it took for those deaths to be reported.
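The structure behind a chart like this can be sketched in a few lines. The records, the snapshot date, and the bucket edges below are all hypothetical and simplified. They are not the data or the legend of the original chart. Each death is counted on the day it happened and labeled by how long it took to be reported.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical records: (date of death, date the death was reported).
records = [
    (date(2020, 5, 1), date(2020, 5, 2)),
    (date(2020, 5, 1), date(2020, 5, 5)),
    (date(2020, 5, 1), date(2020, 5, 9)),
    (date(2020, 5, 6), date(2020, 5, 7)),
    (date(2020, 5, 6), date(2020, 5, 10)),
    (date(2020, 5, 9), date(2020, 5, 10)),
]
snapshot = date(2020, 5, 10)  # assumed final day shown on the chart


def lag_bucket(lag_days: int) -> str:
    """Map a reporting lag to a legend label (bucket edges are assumptions)."""
    if lag_days <= 2:
        return "0-2 days"
    if lag_days <= 6:
        return "3-6 days"
    return "7+ days"


# One stacked bar per date of death: bucket label -> number of deaths.
bars = defaultdict(Counter)
for died, reported in records:
    if reported <= snapshot:  # only deaths known by the snapshot date
        bars[died][lag_bucket((reported - died).days)] += 1

for died in sorted(bars):
    print(died.isoformat(), dict(bars[died]))
```

In this toy data, only the May 1 bar has a "7+ days" segment. The later bars cannot have one yet, because fewer than 7 days have passed between those dates and the snapshot.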

This chart is showing a lot of information in a single image. The only way to really grasp it is to spend some time to understand it, and it’s not that easy. That’s a pitfall, because everyone expects plots to make information easier to understand.

So, death rates in Sweden did not collapse. They just take time to get reported and recorded, which is probably totally normal. According to the gray bars, the deaths are holding steady.

Similarly, the EuroMOMO Maps Have Some Delay

This problem of delayed data is a real thing. The first time I saw it was when Off-Guardian and some other “virus denial” posts were referencing the EuroMOMO maps. EuroMOMO tracks deaths across many countries in Europe, and shows when deaths in one week exceed historic averages.

Back in March, they had a map that showed COVID-19 wasn’t killing that many people.

I was told that it was evidence that COVID-19 wasn’t that fatal. I didn’t believe that, but I did take the map seriously.

The problem was, it took a month or two to compile all the data and produce a map, so the February maps weren’t correct until April, and the March maps weren’t correct until some time in May. (The website blog said it takes up to 8 weeks to get all the data from the participating countries.)

The map that I was viewing was incomplete. So, I couldn’t really use it as evidence supporting or refuting any idea about COVID-19.
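One way to guard against this is to treat any week that is still inside the reporting window as provisional. The sketch below uses the "up to 8 weeks" figure mentioned above. The function name `is_provisional` and the example dates are my own assumptions, not anything EuroMOMO provides.

```python
from datetime import date, timedelta

# "Up to 8 weeks" is the delay mentioned in the text above.
MAX_REPORTING_DELAY = timedelta(weeks=8)


def is_provisional(week_ending: date, viewed_on: date) -> bool:
    """True if data for this week may still be incomplete on `viewed_on`."""
    return viewed_on - week_ending < MAX_REPORTING_DELAY


viewed_on = date(2020, 5, 15)  # hypothetical day of looking at the map
for week_ending in (date(2020, 2, 23), date(2020, 3, 29), date(2020, 5, 10)):
    status = "provisional" if is_provisional(week_ending, viewed_on) else "likely complete"
    print(week_ending.isoformat(), status)
```

Viewed in mid-May under this rule, the late-February week counts as likely complete, while the late-March and May weeks are still provisional.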

Take a look at this screenshot (edited) of today’s map. It looks like there are few excess deaths!

Okay, here’s the problem: not all the data is in. The gray areas indicate “no data”. You can go to the map and check it yourself. If you check it again in one month, the image should be different, and have more dark blue areas.

This website appears to have been significantly updated, and there’s less delay indicated than in the past, so as long as you ignore recent dates, it’s good data.

As with the previous chart, we’re experiencing a mismatch of expectations. We expect maps to be up to date, especially when there’s a date displayed with the map.

There’s no warning under the date that reads: “Map Incomplete, Data Pending”.

They offer a play button and a slider that animates the map, giving us a sense of control over time, and of reviewing the past. This obscures the fact that there are delays in reporting, so the maps are not complete for recent weeks.

The user experience gives me an illusion of information.

Conclusions

There’s a lot of data coming at us.

It’s not easy to understand, but you can learn to understand it.

Don’t expect data for at least the past week to be accurate. It takes time and effort to collect data, process it, and report on it.

