Charting COVID Cases

One thing that has been frustrating me about the pandemic is that the way data are being presented to the public is often confusing and incomplete. This seems like exactly the time to develop better science communication and graphic design skills, but results seem to vary by county, state, etc.

Our testing capacity has been…questionable at best for a lot of the pandemic. This hasn’t helped. It’s hard to compare statistics from today to those from three months ago when our ability to capture the whole picture has varied wildly over time, thanks to changes in testing volume, public awareness, the conflation of different types of tests, etc.

One thing I do think would help (especially for those of us who are not statistical wizards or inclined to dig through the data ourselves) is to stop using cumulative case count charts as a main source of information for the general public and replace them with % positive test charts. Reasons for this:

  1. Cumulative cases are always going to go up. The most we can hope for is a flat line, but outside of showing how well we’re “flattening the curve” this isn’t an incredibly useful way to see at a glance if conditions are getting worse in your area or not, especially when formatting makes it difficult to really interpret the slope of the line.
  2. Cumulative case counts do not take testing capacity into account. If you test more people in an area with active infections, you’re going to find more positive cases. This makes the cumulative case line spike up, even though the actual infection rate might have stayed the same.
  3. Graphing the % positive tests is a (somewhat crude) way to take testing capacity into account: it simply shows how likely it is that a COVID test on a given day comes back positive. This isn’t quite as useful when the only people getting tested are almost certainly sick (as happened early on, when people were being screened for symptoms before testing). But as testing capacity increases and a wider range of people can get tested (and ideally, as we do broader surveys of a community to seek out infections before they result in bigger local outbreaks), it becomes more representative of the community at large.
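
To make the difference between the two charts concrete, here is a minimal Python sketch. The daily numbers are invented purely for illustration (they are not real data), and the function names `cumulative_cases` and `percent_positive` are just hypothetical helpers. In this made-up scenario, testing doubles on day 4 while the share of tests coming back positive stays at 10%.

```python
from itertools import accumulate

# Hypothetical daily figures, invented for illustration only (not real data).
# Testing doubles on day 4; the share of positive tests stays at 10%.
daily_tests = [100, 100, 100, 200, 200, 200]
daily_positives = [10, 10, 10, 20, 20, 20]


def cumulative_cases(positives):
    """Running total of positive tests, as plotted on a cumulative chart."""
    return list(accumulate(positives))


def percent_positive(positives, tests):
    """Daily % of tests that came back positive (None on days with no tests)."""
    return [100.0 * p / t if t else None for p, t in zip(positives, tests)]


cumulative = cumulative_cases(daily_positives)
positivity = percent_positive(daily_positives, daily_tests)

for day, (total, pct) in enumerate(zip(cumulative, positivity), start=1):
    print(f"day {day}: cumulative cases = {total:>3}, positive = {pct:.1f}%")
```

With these made-up numbers, the cumulative line starts climbing twice as fast on day 4, while the % positive line stays flat. That is the "more testing finds more cases" effect described in point 2, and it is why the second chart is less sensitive to changes in testing volume.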

I like the way testing results are being reported in the UW Virology Twitter account (here): you see the number of tests, the number of positives, the percentage positive, and the cumulative total of positives all on one page.

Ultimately, though, however the data are presented, there will be issues. The important thing is to realize what is actually being conveyed and what information is missing from the picture. Personally I think that any time a graph of coronavirus cases is being presented in the news it should be accompanied by a graph of testing totals at minimum, if only to make local ability to handle new cases more transparent. We should be able to see if our area is ramping up testing or is still overwhelmed, because a greater ability to detect new cases means we have more reliable information to help navigate our personal choices to reduce risk. If my area is unable to test, how do I know that there isn’t a growing outbreak in my area making it unsafe to work, run errands, or walk around the park?

We can still take all the personal preventative measures, like wearing masks in public, avoiding large gatherings, and distancing as much as possible in indoor spaces, but as more places start to open up, it seems vital to have complete and easy-to-understand information available to everyone.