April 14, 2020

Covid-19 and data


Now that we are in the middle of a pandemic, the news is full of statistics about new cases, hospitalizations, recoveries, and, tragically, deaths. These numbers lead to impressive comparisons, visualizations, and predictions.

Country comparisons

It is good to keep in mind that, as is always the risk with data, the numbers can be misleading. Recently, people have been comparing the current Covid-19 statistics of the Netherlands with those of Italy, and concluding that the Dutch are heading in the same direction as hard-hit Italy. These conclusions were based on the left table in the image above, and there it truly looks like the Netherlands is just a few weeks behind Italy.

However, the left table is based on the progression of the number of deaths. You can also compare the statistics of both countries starting from the date of the first registered death, as shown in the middle table. Then the situation in the Netherlands looks a lot more optimistic, as the number of deaths increases more slowly than in Italy.

An even more optimistic comparison results from taking the first 100 registered cases as the starting point, as shown in the right table. But which of these comparisons is correct?

Actually, none of these comparisons is correct, as there are too many differences between the countries. Take, for example, the way cases are registered: the Netherlands is not testing as much as Italy. Or take the policies in place: in Italy it is less common to discuss death with patients.

This shows that for comparisons like these, the context and other factors need to be taken into account as well, and that however similar the patterns may seem, they cannot always be easily compared.
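To get a feeling for how much the chosen starting point shapes such a comparison, here is a minimal Python sketch. It is not the method used by the AD. The two countries, the daily numbers, and the thresholds (first death, tenth death) are all made up for illustration, and the helper name align_on_threshold is hypothetical.

```python
from typing import Dict, List

# Hypothetical cumulative death counts per day.
# These are made-up numbers for illustration only, not real Covid-19 data.
cumulative_deaths: Dict[str, List[int]] = {
    "Country A": [0, 1, 3, 7, 12, 25, 52, 110, 230],
    "Country B": [0, 0, 0, 1, 1, 2, 4, 12, 24, 43],
}


def align_on_threshold(series: List[int], threshold: int) -> List[int]:
    """Return the series from the first day its value reaches the threshold."""
    for day, value in enumerate(series):
        if value >= threshold:
            return series[day:]
    return []  # the threshold was never reached


def compare(data: Dict[str, List[int]], threshold: int) -> Dict[str, List[int]]:
    """Align every country's series on the same threshold."""
    return {name: align_on_threshold(s, threshold) for name, s in data.items()}


# Two different choices of "day 0" for the same underlying data.
aligned_first = compare(cumulative_deaths, threshold=1)   # first registered death
aligned_ten = compare(cumulative_deaths, threshold=10)    # at least 10 deaths

for label, aligned in [("From first death", aligned_first),
                       ("From 10 deaths", aligned_ten)]:
    print(label)
    for name, series in aligned.items():
        print(f"  {name}: {series}")
```

In this toy data, Country B follows Country A closely when both are aligned on ten deaths, but looks far better off when both are aligned on the first death. The data is the same in both cases; only the choice of starting point differs.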

Here is the original source from the AD (in Dutch).

Visualizations

Governments, institutions, and other parties increasingly use visualizations and videos to explain the situation and inform the public. An interesting video about the effect of social distancing can be found here. The video shows how connected the world is, and thus how easily the virus spreads across (in this case) America.

DIY analyses

Due to the pandemic, a lot of new data is being generated. You can rely on the plots and visualizations found online, but you can also create them yourself. For anyone interested in Tableau, here you can find everything you need to create your own analyses. You can also download this dataset and use it in any other visualization tool, such as PowerBI or QlikView.