## Sunday, January 20, 2013

### Statistics, laymen, and shuffling cards

On Sundays, we usually spend hours playing Canasta, a card game, with the extended family.

Most typically, there are 2 teams with 2 players each (the composition of the teams may change during the tournament, which consists of something like 25 games per afternoon).

Sometimes men win but it seems that more often, women win, and so on. I don't want to go into that. ;-) Instead, I want to discuss an unlikely event and people's reactions to it.

A week ago or so, one of the players – whom I will refer to as HM – was the dealer, who deals out 14 of the 108 Canasta cards to each of the 4 players. During the afternoon, there were two interesting games.

• In one of them, one player got 7 jokers-or-deuces.
• In the other, another player got 8 jokers-or-deuces.

I said those games were too unlikely – we therefore had strong statistical evidence that HM wasn't shuffling the cards properly and randomly. It seemed like common sense to me; I will go beyond common sense momentarily. Of course, your humble correspondent was surprised by what a scary can of worms he had opened! ;-)

Needless to say, I had to be nearly silent about this topic for hours. The remaining three players teamed up to laugh at me and claim that I was obviously a nutcase, that there could be no pattern and even if there were a pattern at some point, it will surely get lost, and so on, so the unlikely events above must be due to pure chance.

It's not hard to calculate the odds exactly – a particular combinatorial number – but it's faster to write a short Mathematica code and run a simulation. First, you should know the problem. There are 108 cards in the deck, 4 of them are "jokers" ("big jokers" in our terminology) and 8 of them are "2" or deuces ("small jokers" in our jargon). Big and small jokers are "wild cards" and comparably helpful to complete canastas. So in total, there are 12 wild cards in the deck of 108 cards. What is the probability that a random subset of 14 cards among the 108 cards contains at least 7 wild cards? And what about the odds for at least 8 wild cards?

Here's a quick piece of Mathematica code one may write down in a minute:

    attempts = 1000000;
    Dynamic[{j, successes}]  (* live progress display *)
    successes = 0;
    For[j = 1, j <= attempts, j++,
      (* draw 14 card IDs from 0..107; redraw until all 14 are distinct *)
      a = Table[RandomInteger[107], {i, 1, 14}];
      While[Length[Union[a]] < 14,
        a = Table[RandomInteger[107], {i, 1, 14}];];
      (* IDs 0..11 stand for the 12 wild cards *)
      zuliku = 0;
      For[k = 1, k <= 14, k++,
        zuliku = zuliku + If[a[[k]] <= 11, 1, 0];
      ];
      (* change 7 to 8 for the 8-wild-card case *)
      successes = successes + If[zuliku >= 7, 1, 0];
    ];
    Print[successes/attempts];

Sorry, "zuliku" stands for "number of wild cards". Now, a player gets at least 7 wild cards in 1 in 14,000 attempts; he or she gets at least 8 wild cards in 1 in 300,000 games or so. Please feel free to calculate the exact numbers (or see the comments for the exact formula).

During the afternoon, there are 25 games and each of the 4 players may receive the "surprisingly high" number of jokers, so there are effectively 100 opportunities per afternoon. The probability that someone gets at least 7 jokers sometime in the afternoon is 1 in 140 or so. The probability that someone gets at least 8 jokers sometime in the afternoon is 1 in 3,000. If you demand both of these things to occur in different games in the same afternoon, you must pretty much multiply the odds. So I think that the chances of the 7+8 miracle are 1 in 400,000 per afternoon or so.
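
The afternoon-level arithmetic can be sketched in a few lines of Python. This is only a rough check under stated assumptions: it takes the per-hand odds quoted in the post (1 in 14,000 and 1 in 300,000) as given, treats the 100 hands per afternoon as independent, and multiplies the two afternoon-level probabilities as if the two events were independent, which is the same approximation the text makes. All variable and function names are illustrative.

```python
# Per-hand odds quoted in the post (rounded simulation results).
p7_hand = 1 / 14_000     # at least 7 wild cards in one 14-card hand
p8_hand = 1 / 300_000    # at least 8 wild cards in one 14-card hand

# 25 games x 4 players = 100 hands per afternoon (assumed independent).
hands_per_afternoon = 25 * 4


def at_least_once(p, n):
    """Probability that an event with per-trial probability p
    happens at least once in n independent trials."""
    return 1 - (1 - p) ** n


p7_afternoon = at_least_once(p7_hand, hands_per_afternoon)
p8_afternoon = at_least_once(p8_hand, hands_per_afternoon)

# Crude product, as in the text: both events in the same afternoon.
p_both = p7_afternoon * p8_afternoon

print(f"7+ wild cards at some point: 1 in {1 / p7_afternoon:,.0f}")
print(f"8+ wild cards at some point: 1 in {1 / p8_afternoon:,.0f}")
print(f"both in one afternoon:       1 in {1 / p_both:,.0f}")
```

For probabilities this small, "at least once in 100 trials" is very close to simply multiplying the per-hand probability by 100, which is why the round numbers in the text work.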

Even before I tell you that the 7 or 8 jokers for a single player occurred 3 times in recent weeks and not just twice, this is an extremely low probability, close to a 4.5-sigma certainty that it couldn't have occurred by chance. (It seems to me that I am very close to 5 sigma if I refine the hypothesis so that only the 25 games in which HM is the dealer count.) Even if you account for the look-elsewhere effect by acknowledging that the game has been played on hundreds of afternoons, the chances are still low and they suggest that the null hypothesis – pure chance – is contrived, to say the least.
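
To relate odds such as "1 in 400,000" to a number of sigmas, one can invert the tail of a standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`. Whether the one-sided or the two-sided convention is meant is not spelled out in the post, so both are shown here as assumptions, and the function names are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sigma_one_sided(p):
    """Number of sigmas z such that P(Z > z) = p."""
    return -STANDARD_NORMAL.inv_cdf(p)


def sigma_two_sided(p):
    """Number of sigmas z such that P(|Z| > z) = p."""
    return -STANDARD_NORMAL.inv_cdf(p / 2)


# Rough per-afternoon probability of the 7+8 coincidence from the text.
p_afternoon = 1 / 400_000

z_one = sigma_one_sided(p_afternoon)
z_two = sigma_two_sided(p_afternoon)

# For comparison: the one-sided tail probability of exactly 5 sigma.
p_five_sigma = STANDARD_NORMAL.cdf(-5.0)

print(f"1 in 400,000 (one-sided): {z_one:.2f} sigma")
print(f"1 in 400,000 (two-sided): {z_two:.2f} sigma")
print(f"5 sigma (one-sided):      1 in {1 / p_five_sigma:,.0f}")
```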

Incidentally, I think that I also know the mechanism by which HM ends up with a higher success rate when it comes to her or his ability to deal out 7 or 8 jokers. She or he shuffles the deck of cards by combining two half-decks in such a way that (ideally) every even card is from the left half-deck and every odd card is from the right half-deck. They alternate. If your left half-deck has a large concentration of jokers, there will be many jokers among the cards with even (or odd) IDs. If she or he shuffles the cards by this method once again, the concentration of jokers gets diluted by another factor of two and when the shuffling is done regularly enough, which is not insanely unlikely, the positions of the (many) jokers will be equal modulo 4. Roughly speaking, every fourth card will be a joker. When HM deals out the cards to the players, one by one, and that's how she or he is doing it (just like myself), it's obvious that the same player may get all the jokers.
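
The mechanism can be illustrated with a deliberately idealized sketch in Python. The assumptions are mine, not facts about the actual games: the interleaving is perfect, the deck is not cut, and 8 wild cards start as a clump on top of the deck (for instance because they were collected together after the previous game). The helper names `perfect_riffle` and `deal` are hypothetical.

```python
def perfect_riffle(deck):
    """Split the deck into two halves and interleave them perfectly:
    left[0], right[0], left[1], right[1], ..."""
    half = len(deck) // 2
    left, right = deck[:half], deck[half:]
    shuffled = []
    for a, b in zip(left, right):
        shuffled.extend([a, b])
    return shuffled


def deal(deck, players=4, hand=14):
    """Deal one card at a time, round-robin, `hand` cards to each player."""
    dealt = players * hand
    return [deck[p:dealt:players] for p in range(players)]


def wild_counts(deck):
    """Number of wild cards ("W") each player receives from this deck."""
    return [hand.count("W") for hand in deal(deck)]


# 108 cards; "W" marks a wild card, "." any other card.
# Assumption: 8 wild cards sit together on top of the deck.
deck = ["W"] * 8 + ["."] * 100

counts_unshuffled = wild_counts(deck)

# Two perfect riffles: a card at position j near the top moves to 2j, then 4j.
for _ in range(2):
    deck = perfect_riffle(deck)

counts_after_two_riffles = wild_counts(deck)

print("no shuffle:  ", counts_unshuffled)
print("two riffles: ", counts_after_two_riffles)
```

In this toy setting the unshuffled clump is spread evenly over the four players, while two perfect riffles place the clumped cards at positions 0, 4, 8, ..., so a single player receives all of them. Real shuffles are less regular and the deck is cut, so this only shows the direction of the effect, not its size.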

Feel free to discuss the technical issues, pros and cons, of these arguments and mechanisms.

But my main surprise is the broader sociological question: Why do the other players feel so certain that they think that they don't even have to do any calculation – they don't have to know any maths beyond the fourth grade of the elementary school, to be more precise – to generate a sensible judgment on such an obviously subtle, quantitative, statistical question? They must be clearly convinced that mathematics doesn't operate anywhere in the world. They must clearly be convinced that mathematics can't possibly help you to judge such questions. Cards must be perfectly random from God or another supernatural or divine agent even if you don't shuffle them. ;-)

Why? How does the belief work? How does the world in their picture work? I have lots and lots of empathy – perhaps even more empathy than Sheldon Cooper ;-) – but I just can't understand it. As a person who sees equations behind everything and who feels very humble if not detached whenever I can't solve or even estimate the relevant maths in a situation, I simply can't understand where the skyscraping certainty of people who have almost no clue and who haven't even tried to think about the issue quantitatively comes from.

1. I think probably because at some time in their card playing career they got an unexpected hand, truly by chance.

Back in 1958 at a US college dorm I was convinced to make a fourth for bridge. It was the first time I had the game explained to me. I got a hand with high cards all in clubs and bid and won a grand slam. My partner across had all the rest of the clubs except one!

It convinces one of luck.

Of course it could also be bad shuffling, not on conscious purpose.

2. Right. One may get used to certain events' being rather frequent and miss the fact that the elevated frequency is and has to be due to something else than pure chance. In other words, they're just assuming everything is due to pure chance and they measure the frequency of rare events and interpret them as the frequency implied by pure chance even though the actual frequency predicted by pure chance would be much lower.

Of course, I am not saying that the patterns in the shuffling by HM were "deliberate" or even "malicious". Quite the contrary, I am saying that they went completely unnoticed but they were still responsible for increasing the chance of the unlikely events by several orders of magnitude.

I don't quite have the background to understand the card problem you described. But again, if you found that the probability was 1 in 3 million or so, i.e. five sigma, I would still be willing to say that it wasn't quite due to chance.

Unlikely events have to occur sometimes. That's one of the "laws of statistics". However, they probably occur only if the probability of the rare event increased by the look-elsewhere correction is still sufficiently reasonable, sufficiently comparable to one. It shouldn't be tiny.

3. I think that some people just like to have the feeling that they are "special" and the more rare the event the more special they are. Also, they do not want to show their mathematical ignorance so would rather just insist that it is random. And if you are in a minority because you are the only smart guy then they will gang up on you 'cause they don't really care and you do.

4. The exact expression for the probability of getting N or more wild cards can be written:

p(n>=N) = (sum from k=N to J of C(H, k) C(D-H, J-k)) / C(D, J)

where D = 108 is the number of cards in the deck, H = 14 is the number of cards in a hand, J = 12 is the number of "jokers" in the deck, and C(n,k) is a notation for the binomial coefficient (i.e., "n choose k").

The results of your Monte-Carlo-style calculation are not bad. They are within an order of magnitude of the exact numbers. The probability of at least 7 wild cards is about 1 in 14097. The probability of at least 8 wild cards is about 1 in 295706.
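
The exact numbers can be reproduced with integer arithmetic. The following Python sketch evaluates the hypergeometric tail with D = 108, H = 14, and J = 12; the function name and its parameters are illustrative, and the sum is written in the form that chooses k of the wild cards and the rest of the hand from the other cards.

```python
from fractions import Fraction
from math import comb


def prob_at_least(n_min, deck=108, hand=14, wild=12):
    """Exact probability that a random `hand`-card subset of a
    `deck`-card deck with `wild` wild cards holds at least n_min of them."""
    total = comb(deck, hand)
    favorable = sum(
        comb(wild, k) * comb(deck - wild, hand - k)
        for k in range(n_min, min(wild, hand) + 1)
    )
    return Fraction(favorable, total)


p7 = prob_at_least(7)
p8 = prob_at_least(8)

print(f"at least 7 wild cards: about 1 in {float(1 / p7):,.0f}")
print(f"at least 8 wild cards: about 1 in {float(1 / p8):,.0f}")
```

Using `Fraction` keeps the result exact until the final conversion to a float for printing.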

5. Hi Lubos, while I found your conjecture of the mechanism of the dealer's propensity for dealing out 7 or 8 jokers to individual players plausible, I am puzzled by the reasoning in the last two paragraphs of your reply to anna.

Unlikely events occur all the time, not just sometimes. Every single day, we consciously experience many thousands of separable events, and nearly all of them go down the memory hole soon afterward. It is only the unusual events that attract more than our usual attention, and only the extremely rare events get remembered long-term. This is true especially if they are associated in our mind with a particularly incisive moment in our lives.

But in the course of an ordinary person's life, there will be many one-in-3-million events purely due to chance, and some people who are particularly lucky (or unlucky) will experience multiple events that are even more unlikely singly or cumulatively. This guy got struck by lightning seven times; this lady has won the lottery jackpot four times.

6. Just for the sake of speed, this should make Mathematica about 20 times faster:

    attempts = 1000000;
    (* RandomSample draws 14 distinct card IDs; IDs 0..11 are the wild cards *)
    successes = Total[Table[
        If[Length[Select[RandomSample[Range[0, 107], 14], # <= 11 &]] >= 7, 1, 0],
        {i, attempts}]];
    successes/attempts // N

7. This is clearly a mistake somewhere in the case of 8 cards (probably in my program) - the accuracy of the Monte Carlo calculation should be about 10%, not an order of magnitude.
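
A rough way to judge how accurate a 1,000,000-attempt run can be is the binomial standard error of the success count. The Python sketch below is only an illustration and not part of the original comment; it assumes independent attempts and plugs in the exact probabilities quoted in the thread (1 in 14097 and 1 in 295706). The function name is hypothetical.

```python
from math import sqrt

attempts = 1_000_000

# Exact per-hand probabilities quoted earlier in the thread.
p7 = 1 / 14097    # at least 7 wild cards
p8 = 1 / 295706   # at least 8 wild cards


def relative_std_error(p, n):
    """Relative standard error of a Monte Carlo estimate of p
    obtained from n independent attempts (binomial counting)."""
    return sqrt((1 - p) / (n * p))


rel7 = relative_std_error(p7, attempts)
rel8 = relative_std_error(p8, attempts)

print(f"expected successes (7+): {attempts * p7:.1f}, relative error {rel7:.0%}")
print(f"expected successes (8+): {attempts * p8:.1f}, relative error {rel8:.0%}")
```

Under these assumptions the 7-card estimate rests on roughly 70 expected successes, while the 8-card estimate rests on only three or four, so the latter fluctuates much more from run to run.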

8. Thanks, maybe, Fred! If one believes he or she is special, very unlikely things just can occur without any need to challenge the null hypothesis.

9. Cool, it's also much shorter.

10. Dear Eugene, unlikely events don't occur all the time. They appear - assuming randomness - more or less randomly, with the average frequency calculable theoretically and verifiable by frequentist observations of very large ensembles.

If the frequency of a particular rare event is low, it's very likely that it will not occur on a particular afternoon, especially not twice.

If one can show that certain events, even with the look-elsewhere effect, had a chance smaller than 1 in 3 million, particle physicists - who demand the strongest evidence among all the scientists, in fact, all the people - may declare that the results aren't random and they have proved the existence of a new non-random effect. They have falsified the null hypothesis.

If you put in the numbers, none of the examples you mention is really unlikely if you incorporate the look-elsewhere effect, i.e. if you calculate the odds that the rare event could have occurred anywhere or at any time. However, having 7 wild cards in one game and 8 wild cards in another game with a fixed person+3 people around - me - simply *is* unlikely so that one may claim with an extremely high confidence that it's not due to chance.

11. Dear Lubos, thank you for taking the time to write out such a detailed answer! I will study it carefully and, if I still disagree with you, get back to you with a reasoned argument... only I know that in Lubosland, this means "quantitative" and I will have to write down numbers (shiver) and possibly even an equation or two (oooh, shaking).

12. Thanks, Eugene.

I've finally interacted about this topic with the person who often deals out the large number of wild cards to one player - one sentence beyond pure mindless humiliation of myself. ;-)

I was told that it has to be an accident because he or she would otherwise have to do it deliberately and he or she knows that it's not the case.

However, this implication is logically invalid. People are doing things and creating regularities even in many cases when they don't realize what's going on. The logical fallacy in the previous paragraph is actually the same one that helps the same person ;-) to sustain his or her belief in supernatural phenomena. Whenever I - or anyone else - offers a rational explanation of these things, it's viewed as an assault on the morality of everyone who is around the supernatural phenomena, from the "performers" to the "believers".

But this ain't the case. When one explains why certain seemingly very unlikely things occur by a theory that makes them reasonably likely, it is something else than a moral judgment. It's just a set of rational ideas. Many people may be misled into seeing "shocking things" where there are none without their having any bad intent whatsoever. In this way, rational explanations of unlikely events are taboo from the beginning - because they're blasphemous accusations against the integrity of everyone even though the true reason is often about ignorance and not integrity.

13. Lubos - Nowadays a very large number of people have heard that a(n apparently) meaningless event has the same probability as a(n apparently) meaningful one; e.g., every permutation of the 52 cards is just as unlikely as the permutation in which they are in order. That's false, of course, since the number of permutations that are known to the observer as meaningful is many orders of magnitude smaller than the number of permutations. But many decades ago, not so many people had heard the erroneous claim - or any non-erroneous one. Meaningful, highly improbable events were more often taken at "face value" and exclaimed over. I remember this from as far back as the middle 1950s. So I would suspect your canasta friends of basing their idea in part on the new, "sophisticated" misconception.

14. Unlikely events occur all the time, not just sometimes. Every single day, we consciously experience many thousands of separable events, and nearly all of them go down the memory hole soon afterward. It is only the unusual events that attract more than our usual attention, and only a very small percentage get remembered long-term.

Eugene - Meaningful unlikely events do not occur "all the time."

15. According to Persi Diaconis, seven riffle shuffles are both necessary and sufficient to approximately randomize a deck of 52 cards.

http://www-stat.stanford.edu/~cgates/PERSI/papers/Riffle.pdf

16. Yes, Hypergeometric Distribution.
http://en.wikipedia.org/wiki/Hypergeometric_distribution
N=108, n=14, K=12, k=7, 8

A joke:
http://www.smbc-comics.com/index.php?db=comics&id=2861#comic

17. Wow, that's quite hard work. ;-)

18. Yes, hypergeometric distribution:
http://en.wikipedia.org/wiki/Hypergeometric_distribution
with N=108, n=14, K=12, k=7,8

Speaking of jokers, a joke:
http://www.smbc-comics.com/index.php?db=comics&id=2861#comic
Note the unlikely student black girl, Lubos would say:)

19. LOL, a funny joke, and so true.

20. I think you have your answer. Most people see it from the point of view of sociology/psychology, where someone losing a game often complains about extraordinarily bad luck. Usually this is because they fail to understand the look-elsewhere effect. In online poker, such people are called "rigtards" because they often claim the games are rigged to their disadvantage.

A claim that the bad luck suffered by a player is too unlikely to happen by chance is generally a veiled accusation of cheating. In your example, the players have shuffled cards in the same way for years and have never thought that the shuffle would be insufficiently random. The idea that something could happen not by chance, yet not on purpose, didn't cross their minds.

21. Yes, hypergeometric.

I just realized that I screwed up when typing in the expression above. The numbers are correct, but the H's should be J's and vice versa. I guess it didn't help that H and J are next to each other on the keyboard.

You haven't lived until you've had Mathematica do your programming for you. I once used code generated by Mathematica that was so convoluted that turning on any optimization in the compiler would break the program (i.e., give the wrong answer).
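
On the H/J swap mentioned in this comment: the hypergeometric probability is symmetric under exchanging the hand size and the number of wild cards, which would explain why the numbers come out the same either way. The Python sketch below checks this exactly for D = 108, H = 14, J = 12. It is a side check with hypothetical function names, not something taken from the thread.

```python
from fractions import Fraction
from math import comb

D, H, J = 108, 14, 12  # deck size, hand size, wild cards in the deck


def pmf_choose_wild(k):
    """P(exactly k wild cards), counting which k of the J wild cards
    are in the hand and which H - k other cards complete it."""
    return Fraction(comb(J, k) * comb(D - J, H - k), comb(D, H))


def pmf_choose_hand_slots(k):
    """Same probability with H and J exchanged: count which k of the
    H hand positions hold wild cards among all placements of J wild cards."""
    return Fraction(comb(H, k) * comb(D - H, J - k), comb(D, J))


# k can run from 0 to J = 12, the largest possible number of wild cards.
all_equal = all(
    pmf_choose_wild(k) == pmf_choose_hand_slots(k) for k in range(J + 1)
)
total = sum(pmf_choose_wild(k) for k in range(J + 1))

print("both forms agree for every k:", all_equal)
print("probabilities sum to:", total)
```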

22. Absent the ability to estimate probabilities by statistical methods, your fellow players are bound to judge likelihood solely based on past experience. Hence, they may have already gotten used to statistical flukes due to systematic errors (i.e. improper shuffling). No conspiracy is needed.

23. You are right, of course. I guess people who cannot estimate probabilities are unlikely to detect tautological reasoning, too.
Re conspiracy: I was referring to Fred's comment that if "you are the only smart guy then they will gang up on you". Should have made that clear.

24. I am not surprised by the reaction of the other players. Once, when I was an undergraduate student at UC Berkeley, I had a similar experience. In a four-handed game of bridge, one player left for a bathroom break. While he was absent, we not only did not shuffle the deck; we stacked it in a precise order so that he would get a perfect hand. When he returned he got extremely excited and wouldn’t even believe it when we told him that we had stacked the deck.

25. LOL.

I said I could arrange it in a similar way so that someone gets at least 8 wild cards. Of course I couldn't guarantee who because the pack of cards is cut before the cards are dealt out.

The assumption was that the cut is done normally between 1/3 and 2/3. Of course, HM deliberately cut the pack at the third card from the top or so - not sure whether it's even legit - so there were no wild cards in the part of the pack I was distributing.

HM thought this proved that my theory, namely that the increased chance of wild cards was due to shuffling with patterns, was ruled out. It's similar to ruling out the claim "nuclear power plants work" by hitting one with an airplane.
