Tuesday, January 05, 2021

Belarusian Covid death data are self-evidently rigged

They may still be doing great with Covid, but the data can't be right in their entirety

According to the tables and maps, Lukashenko's Belarus – which has wisely avoided Covid restrictions, both at the level of government orders and the personal behavior of most people – only has 15 Covid deaths per 100,000 people now while e.g. Czechia has 115 deaths per 100,000 people (about 7.7 times higher than Belarus).

It would be a great argument for the Belarusian approach. Too bad, I don't really believe their data now. Look at their national statistics page and find the chart of the daily deaths. Like other Belarusian charts, it's unusually smooth and stable. Can these daily numbers occur by chance? The answer is a resounding No.

Look at the chart:

Well, even by eyeballing, it's unnaturally smooth. It seems that the creator of this dataset believed that for long months, the daily numbers may change by \(\pm 1\) on every single day and that this is enough variability for the dataset to look natural. ;-) I will show you a calculation confirming that my intuition was damn correct and this tiny variability is impossible in practice.

The chart looks almost piecewise constant in many intervals – and each such interval could be used as a source of an additional argument of the type that you will see (additional arguments would arise from the chart of the number of cases). But it's enough to pick the recent deaths per day between November 15th, 2020 and January 4th, 2021. What's remarkable is that in each of these 52 days, the number of daily deaths was between 7 and 10. If you have worked with similar statistical data, you must know that it's just impossible. What is the probability?

Well, first, all (apparently?) legitimate national data show vastly greater variations, even "waves" on the graphs. You can't really sustain that "average expected number of deaths per day" to be this stable for 52 days. But let's assume that by some stabilization, this was achieved. However, what can't be achieved is the elimination of the statistical noise.
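To get a feeling for how large that unavoidable noise is, here is a minimal Python simulation sketch. It assumes independent daily counts drawn from a Poisson distribution with a mean of 8.5 deaths per day (the midpoint of the reported 7-10 range, and the model discussed next). The function names, the seed, and the number of runs are illustrative choices, not anything taken from the Belarusian data.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_spreads(lam=8.5, days=52, runs=2000, seed=2021):
    """Return max-min of the daily counts for each simulated run of `days` days."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is repeatable
    spreads = []
    for _ in range(runs):
        series = [sample_poisson(lam, rng) for _ in range(days)]
        spreads.append(max(series) - min(series))
    return spreads


spreads = simulate_spreads()
print("smallest max-min spread over all runs:", min(spreads))
print("median max-min spread:", sorted(spreads)[len(spreads) // 2])
# A spread of at most 3 is what a window like 7-10 would require.
print("runs confined to a 4-value window:", sum(s <= 3 for s in spreads))
```

Under these assumptions, no simulated 52-day run should come anywhere near staying inside a window of four consecutive values, even when the expected number of deaths per day is held perfectly constant.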

What can we assume about the underlying dynamics to get as uniform daily death data as possible? We must assume that it is a Poisson point process (one chooses a number of deaths and their moments are randomly, uniformly distributed over the interval of time, independently of all the other deaths) which leads to the Poisson distribution for the number of deaths per day. Because we want to fit the possible numbers of daily deaths 7,8,9,10 into our model, we will assume that we deal with a Poisson process with\[ \lambda = \frac{7+8+9+10}{4} = 8.5 \] which is the average number of deaths per day. With this underlying process, the probability that for a given random day, the number of deaths is actually \(k\in\{0,1,2,\dots\}\) is equal to\[ \frac{\lambda^k e^{-\lambda}}{k!}, \] the probability mass function. Here are the probabilities for \(k=0\) up to \(k=15\). The Wolfram Mathematica command and its output are:

lambda = 8.5; a = Table[lambda^k*Exp[-lambda]/Factorial[k], {k, 0, 15}]
{0.000203468, 0.00172948, 0.00735029, 0.0208258, 0.0442549, 0.0752333, 0.106581, 0.129419, 0.137508, 0.129869, 0.110388, 0.0853001, 0.0604209, 0.039506, 0.0239858, 0.0135919}
You see that it peaks at \(k=8\) with 13.75%, flanked by \(k=7\) and \(k=9\) at 12.94% and 12.99%, respectively. What is the probability that in this process, the number of deaths on a chosen random day will be 7-10? It is the sum of four numbers
Total[a[[7 + 1 ;; 10 + 1]]]
The probability is 0.507, just a bit over fifty percent! ;-) Now, what is the probability that each of the 52 recent days will have the number \(k\) between 7-10? It is simply 0.507 to the 52nd power:\[ P_{7,8,9,10} = 0.507184^{52} \approx 4.66 \times 10^{-16}. \] The probability is less than (one-half times) ten to the minus fifteenth power, less than half a quadrillionth. It can't occur by chance, especially because I could add additional factors from the stability in other intervals. It's not hard to see why the probability ended up being this tiny, basically zero: the interval 7-10 is close to the interval "the mean plus minus half a sigma": the one-sigma, 68% interval is between \(8.5-2.9\) and \(8.5 + 2.9\) where \(2.9\) was calculated as the square root of \(8.5\). That one-sigma interval is roughly 6-11. It would be impossible to keep the daily deaths even in the interval 6-11 for 52 long days (\(P\approx 0.7^{52} \approx 8.8 \times 10^{-9} \)). Keeping them in the narrower interval 7-10 is totally impossible, as the tiny probability above shows.
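For readers without Mathematica, the same arithmetic can be reproduced with a short Python sketch. It assumes the Poisson model with \(\lambda = 8.5\) and 52 independent days, exactly as above; the helper names `poisson_pmf` and `window_probability` are illustrative.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass function: lam^k * exp(-lam) / k!."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def window_probability(low, high, lam):
    """Probability that a Poisson(lam) count falls in low..high inclusive."""
    return sum(poisson_pmf(k, lam) for k in range(low, high + 1))


LAM = 8.5   # assumed mean number of deaths per day
DAYS = 52   # length of the period considered in the text

p_7_10 = window_probability(7, 10, LAM)
p_6_11 = window_probability(6, 11, LAM)

print(f"P(7 <= k <= 10) on one day   = {p_7_10:.6f}")
print(f"P(all {DAYS} days within 7-10) = {p_7_10 ** DAYS:.3g}")
print(f"P(6 <= k <= 11) on one day   = {p_6_11:.4f}")
print(f"P(all {DAYS} days within 6-11) = {p_6_11 ** DAYS:.3g}")
```

The exact single-day probability for 6-11 comes out a bit under 0.7, so the 52-day figure for that wider window lands slightly below the rounded estimate quoted above, at the same order of magnitude.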

So the daily death numbers were invented by someone who thinks that equality and uniformity lasting for 52 days is normal and natural – because he or she is illiterate in statistics. It doesn't prove that the overall number of deaths is completely wrong (and probably understated by a factor of 5-20) but it is strong evidence. It is hard to imagine how and why they would publish "streamlined", implausible, and clearly fake daily numbers with a nearly accurate overall sum. It's more likely that they invented the wrong total or the "desired" approximate number of deaths for every day in a month (a big underestimate, we may speculate for other reasons) and after that they invented the detailed daily numbers for every day.

Cheating is wrong, being illiterate in statistics is also wrong, but Belarus' approach of living with Covid is still fundamentally correct.

P.S. 1: It is really true that the daily deaths in the Belarusian chart never differ from the previous day by more than 1. So the daily increment has been \(-1,0,+1\) at least since April. When the typical numbers are 7-10 per day, the probability that the change belongs to this set with 3 elements is at most 3 × 13% or so, around 40%. So I can actually tighten the estimate for the probability of this miracle to at most about \(0.4^{52} \approx 2\times 10^{-21}\). Add comparable factors from the periods before November. We could probably get to \(10^{-100}\) if we organized the calculation cleverly.
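The bound in P.S. 1 can be checked with a small Python sketch. The first number is the argument from the text: whatever yesterday's count was, today's count must land in a set of three values, so the probability is at most the sum of the three largest Poisson probabilities. The second number goes one step further under an extra assumption that is not in the text, namely that consecutive days are independent Poisson counts with \(\lambda = 8.5\). The truncation point `K_MAX` is an arbitrary illustrative choice.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass function: lam^k * exp(-lam) / k!."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


LAM = 8.5    # assumed mean number of deaths per day
DAYS = 52    # exponent used in the text
K_MAX = 60   # truncation of the sums; the tail beyond it is negligible

pmf = [poisson_pmf(k, LAM) for k in range(K_MAX + 1)]

# Upper bound from the text: the three largest single-day probabilities.
upper_bound = sum(sorted(pmf, reverse=True)[:3])

# Extra assumption: two consecutive days are independent Poisson(LAM) counts.
# Probability that they differ by at most 1.
p_step = sum(
    pmf[k] * sum(pmf[j] for j in (k - 1, k, k + 1) if 0 <= j <= K_MAX)
    for k in range(K_MAX + 1)
)

print(f"upper bound per day           = {upper_bound:.4f}")
print(f"upper bound to the power {DAYS}   = {upper_bound ** DAYS:.3g}")
print(f"independent-days step prob.   = {p_step:.4f}")
print(f"step prob. to the power {DAYS}    = {p_step ** DAYS:.3g}")
```

The per-day bound comes out just under 40%, in line with the "around 40%" above, and the independent-days probability is smaller still, so the bound in the text is conservative under that assumption.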

P.S. 2: Here is a funny factoid about 2021, the number indicating the new year we just started to inhabit:

2021 is the year that has the two promising traits that we should use to fix the world and clean the world from the neo-Marxist and related toxic cr@p. If we fail to do it in 2021 AD, the next opportunity could be in 23073409469011482307340946901147 AD. And in that future year, things could be harder because in the concatenation, we start with the larger number and it is followed by the smaller one, not like now. This evil twin of 2021 is taken from this page, I have absolutely no idea how someone could have found such a huge number (about five quadrillion, everything squared) with this double trait. Does someone know the magic? Maybe testing five quadrillion candidates is doable in realistic times... OK, it is.
