## Monday, March 27, 2006

This is a good example of how people who deal with statistics professionally can disagree about the results if the problem is not accurately defined.

The Monty Hall problem is the following. There are three doors, 1, 2, and 3, with goats behind two of them and a car behind the third. You want to win the car. The procedure has several steps.
1. First, you choose one of the doors. Without loss of generality, because you don't know anything yet, you choose door number 1 (they don't know that this is your convention).
2. Second, Monty Hall chooses and opens another door, say door number 3 (the subtlety occurs at this point!), and you see that there is a goat behind door number 3.
3. Third, you can either keep door number 1 or switch to door number 2, trying to maximize the probability that you win the car.

In fact, you can play it here. This seems like a trivial task, but the rules above are actually not yet well-defined, as we will see. What is the probability that you get a car if you keep door number 1?

Symmetric argument

At the beginning, there are three equally likely possibilities with likelihoods of 1/3:

• 1/3: car behind (1)
• 1/3: car behind (2)
• 1/3: car behind (3)

Once you see a goat behind (3), it is obvious that the third possibility could not have been realized. So only the first two possibilities survive. Once the third possibility is eliminated, you should work with conditional probabilities, i.e. you should replace P(Answer) by

$$P(\text{Answer} \mid \text{not (3)}) = \frac{P(\text{Answer and not (3)})}{P(\text{not (3)})}$$

Given the condition that the car is not behind (3), which is what we have just learned, we know that the truth must be one of the first two possibilities. Because their total a priori probability was 2/3 (the denominator of the formula above) and each of them had 1/3 a priori (the numerator), the new predicted probabilities of victory are 1/2 for each of the doors 1 and 2, and it therefore does not matter whether you change your door or not. Because no one has seen anything behind door 1, it is completely irrelevant whether you were standing in front of 1 or not.

The winning probability is 50 percent whether you decide to switch or not. It is important to notice that in this scenario, we were assuming that Monty Hall literally had a rule to open door number 3: he always opens door number 3 if we choose door number 1, which means that sometimes he will open door number 3 with a car behind it. That never happened on TV.

Asymmetric situation

However, the script where you can play the game above leads to a different outcome than 50:50. Those who switch win with probability 2/3, while those who keep door number 1 win with probability only 1/3. How is that possible? It is because in this game on the internet, Monty Hall chooses the door by a different mechanism. He does not open door number 3 as a rule. Instead, he opens a door that has a goat behind it. At least one of the two remaining doors has a goat behind it, so he can always do so.

This changes the numbers completely. The three equally likely initial situations, after you choose door number 1, lead to the following events:

• 1/3: car behind (1) – they will randomly open 2 or 3
• 1/3: car behind (2) – they will open 3
• 1/3: car behind (3) – they will open 2

What are the probabilities that you win a car if you keep door number 1? There are no conditional probabilities here because all three situations allow the game to continue in the standard way. If you keep door number 1, you will only win in the first case, in which the car is behind door number 1, i.e. the probability is 1/3. If you choose to switch to the last remaining door (the door we called "2", although it can also be "3" if Monty Hall opens 2), you will win a car with probability 2/3, because in situations 2 and 3 (two thirds of the situations), when the car is not behind 1, switching is guaranteed to give you a car. So you had better switch.
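The 1/3 versus 2/3 split can also be checked numerically. What follows is only a minimal Monte Carlo sketch, not the script from the internet game: the names `play_round` and `win_rate`, the number of rounds, and the seed are all illustrative assumptions. The host rule it encodes is the one described above: Monty always opens a door with a goat and picks randomly when both of his doors hide goats.

```python
import random

DOORS = [1, 2, 3]


def play_round(rng, switch):
    """Play one round against a host who always reveals a goat.

    Hypothetical helper: returns True if the player ends up with the car.
    """
    car = rng.choice(DOORS)
    pick = 1  # the player's convention from the text
    # The host may only open a door that is neither the pick nor the car.
    host_options = [d for d in DOORS if d != pick and d != car]
    opened = rng.choice(host_options)
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in DOORS if d != pick and d != opened)
    return pick == car


def win_rate(switch, rounds=100_000, seed=0):
    """Fraction of rounds won; rounds and seed are arbitrary choices."""
    rng = random.Random(seed)
    wins = sum(play_round(rng, switch) for _ in range(rounds))
    return wins / rounds


if __name__ == "__main__":
    print("keep:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

With enough rounds, the two printed frequencies should land close to 1/3 and 2/3. Because both calls use the same seed, they replay the same sequence of games, so the two frequencies add up to 1.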

Inverse asymmetric situation

We can also define a mechanism in which it is strictly better to keep the door. Imagine that Monty Hall tries to open the door with the car whenever he can. Obviously, if the car is revealed behind door number 2 or 3, you know that it can't be behind either of the doors still closed, and you have lost anyway (unless they allow you to take the car revealed by the host). On the other hand, if Monty Hall opens a door with a goat, you know for sure that it is because there was no car behind doors 2 and 3, which is why you keep door number 1. It gives you a 100% probability of winning the car. If you switched, you would have a 0% probability of winning a car.
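The three host mechanisms can be compared side by side by exact enumeration. The sketch below is an illustration under stated assumptions: the function names are made up for this example, and the car-seeking host is assumed to pick randomly between doors 2 and 3 when the car is behind door 1 (the text does not specify this, and it does not affect the result). For each rule it computes the probability that keeping door 1 wins, given that the host revealed a goat.

```python
from fractions import Fraction

DOORS = (1, 2, 3)
PICK = 1  # the player's convention from the text


# Each host rule maps the car's position to {door opened: probability}.
def always_door_3(car):
    return {3: Fraction(1)}


def reveal_goat(car):
    options = [d for d in DOORS if d != PICK and d != car]
    return {d: Fraction(1, len(options)) for d in options}


def reveal_car_if_possible(car):
    if car != PICK:
        return {car: Fraction(1)}
    # Assumption: with no car to show, the host picks 2 or 3 at random.
    return {2: Fraction(1, 2), 3: Fraction(1, 2)}


def keep_win_probability(host_rule):
    """P(car behind door 1 | host revealed a goat), computed exactly."""
    goat_seen = Fraction(0)     # P(host reveals a goat)
    goat_and_win = Fraction(0)  # P(host reveals a goat and car is behind 1)
    for car in DOORS:
        prior = Fraction(1, 3)
        for opened, p in host_rule(car).items():
            if opened != car:
                goat_seen += prior * p
                if car == PICK:
                    goat_and_win += prior * p
    return goat_and_win / goat_seen


for rule in (always_door_3, reveal_goat, reveal_car_if_possible):
    keep = keep_win_probability(rule)
    # Once a goat is shown, the car is behind door 1 or the other closed door.
    print(f"{rule.__name__}: keep {keep}, switch {1 - keep}")
```

The same observation, a goat behind the opened door, yields a keep probability of 1/2, 1/3, or 1 depending only on which rule the host follows.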

Conclusions

As you see, you need to know the rules by which Monty Hall works. You need to know the internal mechanisms that he uses to decide which door he opens. If you don't know the internal mechanisms of the things you study, not even approximately, you can't predict any probabilities. There is no God-given algorithm that the host follows. In fact, there is not even a God-given probability distribution on the space of algorithms that Monty Hall follows. This simple fact is what the fans of Bayesian inference misunderstand.

The internal mechanisms of Monty can be probabilistic; in fact, if the car is behind door number 1 in the asymmetric (internet) version of the problem, Monty has to choose 2 or 3 randomly. The mechanisms don't have to have any deterministic justification - which is what the advocates of Bohmian mechanics misunderstand. Probabilistic mechanisms are equally fine, and quantum mechanics works in this way. But still, we must know the mechanisms in order to make any scientifically reliable predictions.

If we don't know whether the mechanisms inside Monty Hall prevent him from opening a door with a car (or encourage him to open one), we can't say whether switching gives us an advantage. If we don't know what the natural noise and the structure of autocorrelations of the global temperatures are, we can't tell whether the current temperature data show human influence or not. And until we know what the natural percentages of men and women in various professions are, we can't determine whether these percentages are influenced by a particular social effect.