## Friday, June 14, 2019

### Bell's inequality is straightforward, the quantum world is local

"Quantum Strangeness: Wrestling..." by George Greenstein is another deeply flawed book on the foundations of quantum mechanics that was recently published. It talks about Bell's experiment all the time – which is lame because that experiment is neither very interesting nor very special or pedagogically useful.

The text of the book seems very unprofessional. For example, the sentence "It was many years ago that I first encountered the Great Predictor" appears at the beginning of three chapters: 1, 9, and 16. At least, the author suggests that he decided to be happy about the Copenhagen Interpretation but he doesn't make it very clear why it's the right answer, and I wouldn't say that his description of what the Copenhagen Interpretation really says is quite accurate. So much of the book is communicating some subjective religious emotions of the writer.

I find it totally bizarre how many people are so confused about such an elementary exercise which is included in both graduate and undergraduate courses of quantum mechanics – and the lecture about Bell's inequality is in no way among the hardest ones.

OK, Bell's inequality is just one example showing that quantum mechanics differs from classical physics. This statement should be totally unsurprising – just like the statement that relativistic physics differs from non-relativistic physics – and indeed, almost every physical situation is predicted to end up differently by classical and quantum physics. Already in the mid 1920s, people understood very well that quantum mechanics fundamentally differed from classical physics.

By classical physics, I mean a description which assumes that the physical system is in one state at each moment (even before or without any observation) and the states are taken from a fixed set, "the phase space", and the points of the phase space are mutually exclusive with each other. When the probabilistic description is taken, there is a probability distribution on the phase space. Every Yes/No question about the state of the system may be defined by a subset of the phase space where the answer is Yes. The probability that the answer is Yes is simply the integral of the probability distribution over the Yes subset (or it is simply the measure of the subset, if you wish).

Quantum mechanics is a framework – or a class of theories – where possible states are normalized vectors in the Hilbert space, a complex vector space. Two (pure) states are mutually exclusive with one another if they're orthogonal to one another. The general probabilistic knowledge is given by mixed states i.e. density matrices – Hermitian matrices on the Hilbert space with the trace equal to one and without negative eigenvalues. Evolution is given by unitary transformations, every real-valued observable is represented by a Hermitian linear operator, and the probability of an outcome – which has to be one of the eigenvalues – is given by Born's rule, as the squared absolute value of the complex probability amplitude. The observer uses quantum mechanics to predict the probability of such an outcome by Born's rule. When he actually makes one of the measurements, he updates the state of his knowledge by replacing $\ket\psi$ with $P\ket\psi$ (renormalized to unit length) where $P$ is the projection operator on the subspace with the eigenvalue that was just measured. This update – also known as "the collapse/reduction of the wave function" in pop science texts – plays a role analogous to the update of probabilities in Bayesian inference.
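The rules in the paragraph above fit in a few lines of linear algebra. The following is only a minimal sketch, assuming NumPy; the particular state (angle `t`) and the helper names `born_probability` and `update` are illustrative choices, not anything taken from the text.

```python
import numpy as np

# Hypothetical example state: cos(t/2)|up> + sin(t/2)|down> for one spin.
t = np.pi / 3
psi = np.array([np.cos(t / 2), np.sin(t / 2)], dtype=complex)

# Eigenvectors of sigma_z, the observable being measured (eigenvalues +1, -1).
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

def born_probability(state, eigenvector):
    """Born's rule: squared absolute value of the amplitude <eigenvector|state>."""
    return abs(np.vdot(eigenvector, state)) ** 2

def update(state, eigenvector):
    """Replace |psi> by P|psi>, P projecting on the eigenvector, and renormalize."""
    projector = np.outer(eigenvector, eigenvector.conj())
    projected = projector @ state
    return projected / np.linalg.norm(projected)

p_up = born_probability(psi, up)      # cos^2(t/2)
p_down = born_probability(psi, down)  # sin^2(t/2)
psi_after_up = update(psi, up)        # observer's state after seeing "up"

print(p_up, p_down, psi_after_up)
```

The update step is nothing more than conditioning on the observed outcome, which is why the two probabilities sum to one before the measurement and the updated state assigns probability one to the result that was just seen.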

That's it. The previous two paragraphs make it very clear that the two frameworks are completely different so there is indeed no reason to think that they should give the same prediction for... pretty much anything. And indeed, you may see that they strictly contradict each other, even if you allow the classical theory to be "anything" or at least "anything local". This contradiction is no more surprising than the fact that in non-relativistic physics, you may accelerate a rocket to any speed while in relativity, it's impossible to surpass the speed of light. Disagreement. Trivial. End of story.

#### Two spins: Bell's setup

In quantum mechanics and in the real world as seen through experiments, it is possible to prepare two perfectly anticorrelated spins. A fermion carries the internal angular momentum, the spin, and you may only measure whether it's "up" or "down" (corresponding to clockwise and counter-clockwise rotation, at least as far as the conservation of the angular momentum goes) relative to an axis you may choose. If you have two fermions and you measure both along the same axis, the binary outcomes may be guaranteed to be perfectly anticorrelated, regardless of which axis you choose.

This perfect anticorrelation is guaranteed by the initial state

$$\ket\psi = \frac{\ket\uparrow\ket\downarrow - \ket\downarrow\ket\uparrow}{\sqrt{2}},$$

the so-called Bell's state. "Bell's state" is just a stupid name preferred by the people who are confused by quantum mechanics. Actual physicists call it "the singlet state" because in the rules about the addition of angular momenta, this state corresponds to the $J=0$ state obtained by adding $J=1/2$ with another $J=1/2$ (when we add two of these "doublets", we obtain one $J=0$ "singlet" and one $J=1$ "triplet"). The state is completely determined, up to an overall complex normalization (especially including a phase), by its being a singlet. All the physically meaningful properties of spin and these states were settled by Wolfgang Pauli and perhaps Eugene Wigner around 1926-1927, with some earlier help from Clebsch and Gordan and others. Pauli should be credited, it's just ludicrous to praise Bell for the singlet state.

Now, why are we talking about two fermions with spins? Because we want some kind of a minimum system where the correlations may be calculated and where classical and quantum mechanics may be seen to differ: the spin of a spin-1/2 fermion carries two possible values, a quantum bit, so it's the minimal choice of a system whose measurements are non-trivial (one state would be bad because the result is unique and therefore guaranteed from the beginning). And we choose two such fermions because two is indeed the minimum number where correlations may be nontrivially studied. Great. There is absolutely nothing special or clever about these choices. If you chose a greater number of subsystems, subsystems carrying more information than a single qubit each, and/or a more general initial state, it would be almost certain that you could still derive a Bell-like inequality proving that a local classical theory cannot produce the same predictions as quantum mechanics.

The singlet state has the same form in any coordinate system in the regular 3D space. Rotate the $z$-axis, reparameterize the states into superpositions of "up" and "down" states with respect to a new axis, and it will still have the form "up down minus down up" (times a possible phase).
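The invariance is easy to check numerically. Here is a small sketch, assuming NumPy; it applies the same random $2\times 2$ unitary to both spins (the helper `random_unitary` and the seed are arbitrary illustrative choices) and compares the result with the original state. For a general unitary $U$ the phase is $\det U$, which equals one for a genuine $SU(2)$ rotation.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def random_unitary(rng):
    """A random 2x2 unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    q, _ = np.linalg.qr(z)
    return q

rng = np.random.default_rng(0)      # fixed seed, an arbitrary choice
u = random_unitary(rng)

# Change the basis of both spins in the same way.
rotated = np.kron(u, u) @ singlet

# The antisymmetric combination only picks up the determinant of u, a pure phase.
phase = np.linalg.det(u)
print(np.allclose(rotated, phase * singlet))
```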

Note that if you calculate the reduced density matrix for the first fermion only, you will get

$$\rho_1 = \frac{\ket\uparrow\bra\uparrow + \ket\downarrow\bra\downarrow}{2}$$

which is the identity matrix (divided by two). The unit matrix represents the "maximally ignorant" density matrix about the fermion.
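The partial trace behind this statement can be sketched as follows, assuming NumPy. The index convention (first fermion is the slower index of `np.kron`) is a choice made for this example.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

# Density matrix of the pair, then reshaped so that the indices are
# (first, second, first', second').
rho = np.outer(singlet, singlet.conj())
rho4 = rho.reshape(2, 2, 2, 2)

# Trace over the second fermion: set second = second' and sum.
rho_1 = np.einsum("ibjb->ij", rho4)

print(rho_1.real)
```

Both diagonal entries come out as one half and the off-diagonal entries vanish, so the first fermion alone looks completely random along every axis.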

#### Quantum calculation of a correlation

OK, the quantity we want to calculate and measure is simply

$$P(\hat a,\hat b ) = \langle \sigma^{(a)}_{\hat a}\sigma^{(b)}_{\hat b} \rangle.$$

We choose the axis $\hat a$ for the first fermion, $\hat b$ for the second fermion, and compute or measure the correlation between the angular momenta. Note that the internal angular momentum, the spin, is $\hbar/2$ times the Pauli sigma matrix. The normalization factors would just be stupidly repeated everywhere so we will calculate with the Pauli matrices instead. The eigenvalues of Pauli matrices are $\pm 1$ – the possible results of any measurement of a Pauli matrix.

Also note that $\sigma_{\hat a} = \hat a \cdot \vec \sigma$ and so on.

Now, the quantum mechanical calculation of that correlation $P$ is really straightforward and elegant – and as you will see, it shows that quantum mechanics is highly unique, predictive, natural, and in some fundamental sense simpler than any classical theory that could be invented to "fake" quantum mechanics. Without loss of generality, we may pick $\hat a$ to be along the $z$-axis and $\hat b$ to be another vector in the $xz$-plane:

$$\sigma_{\hat a} = \sigma_z, \qquad \sigma_{\hat b} = \cos\vartheta\,\, \sigma_z + \sin\vartheta\,\, \sigma_x$$

The correlation $P$ is simply

$$\begin{aligned} \langle \sigma_{\hat a}\sigma_{\hat b} \rangle &= \frac{\bra\uparrow\bra\downarrow - \bra\downarrow\bra\uparrow}{\sqrt{2}}\, \sigma_{z}^{(a)} \left(\cos\vartheta\,\, \sigma_z^{(b)} + \sin\vartheta\,\, \sigma_x^{(b)}\right) \frac{\ket\uparrow\ket\downarrow - \ket\downarrow\ket\uparrow}{\sqrt{2}} \\ &= -\frac{\cos\vartheta}{2} + 0 + 0 - \frac{\cos\vartheta}{2} = -\cos\vartheta = -\hat a \cdot \hat b \end{aligned}$$

We have evaluated the expectation value of the operator in each of the four bra-ket combinations that follow from the distributive law. And we have also used the well-known matrix elements of the Pauli matrices with respect to $\ket\uparrow$ and $\ket\downarrow$: $\bra\uparrow \sigma_z\ket\uparrow = 1 = - \bra\downarrow \sigma_z \ket\downarrow$. The term proportional to $\sin\vartheta$ never contributed to the expectation value because it involves one $\sigma_x$ which maps $\ket\uparrow\ket\downarrow$ and $\ket\downarrow\ket\uparrow$ to $\ket\uparrow\ket\uparrow$ or $\ket\downarrow\ket\downarrow$, and the last two have a vanishing inner product with the bra-versions of the first two.

Great. Quantum mechanics just guarantees that correlation of the two spins to be "minus cosine of the angle" between the two axes. It's the simplest, most natural result that interpolates between the predetermined values $\pm 1$ for $\hat b = \pm \hat a$. Quantum mechanics makes it unavoidable and the calculation is a really straightforward piece of linear algebra.
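As a cross-check of the result $-\hat a\cdot\hat b$, here is a short numerical sketch, assuming NumPy. The function names `spin_along` and `correlation` and the sample angle are illustrative; the computation is just the expectation value of $\sigma_{\hat a}\otimes\sigma_{\hat b}$ in the singlet state.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def spin_along(axis):
    """Pauli operator for the spin projection on a (normalized) axis."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    return n[0] * sigma_x + n[1] * sigma_y + n[2] * sigma_z

def correlation(a, b):
    """Expectation value of sigma_a (x) sigma_b in the singlet state."""
    op = np.kron(spin_along(a), spin_along(b))
    return np.vdot(singlet, op @ singlet).real

theta = 0.7                                   # arbitrary sample angle
a = [0.0, 0.0, 1.0]                           # along the z-axis
b = [np.sin(theta), 0.0, np.cos(theta)]       # in the xz-plane

print(correlation(a, b), -np.cos(theta))
```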

#### Local hidden variables

OK, quantum mechanics gives us $P(\hat a,\hat b ) = - \cos\vartheta$. This anticorrelation between the two spins' projections is calculable (because it was just calculated) from the initial state and nothing else. In other words, it is determined from the beginning when the subsystems were in contact. That's an easy way to see that no alteration of the two fermions is needed later (in the epoch of the measurements of individual spins) to achieve the anticorrelation. In particular, no action at a distance takes place. The degree of anticorrelation is there from the beginning – and it shows up at the end, too.

What about the local theories of hidden variables? If we measure the spin of the particle $A$, we obtain a result $\pm 1$ that depends on the vector $\hat a$ defining the axis and on some extra hidden variables that we call $\lambda$. Those hidden variables are some extra hypothetical quantities that aren't measured, perhaps are hard to measure, and for different values of $\lambda$, the spin of the particle $A$ may be $+1$ while for others it's $-1$ – which is why they're not just some irrelevant decoupled variables. The inclusion of the hidden variables is a way to "fake" the uncertain result of the spin measurement. Because $\lambda$ may be assumed to be "random", the measurement of the spin of $A$ is only determined probabilistically by some averaging over $\lambda$.

This value of the spin is called $A(\hat a,\lambda)=\pm 1$ and it does not depend on $\hat b$ because the choice of the axis in the other measurement (choice made by "Bob") can't influence the result of the measurement $A$ in a local theory (it cannot influence the result obtained by "Alice"). In a similar way, the spin $B(\hat b,\lambda)$ only depends on the axis $\hat b$ and the hidden variables $\lambda$ but not on $\hat a$.

Now we know from experiments that if we choose identical axes, $\hat a = \hat b$, we must obtain opposite spins by the conservation of angular momentum, which means that

$$A(\hat a,\lambda) = -B(\hat a,\lambda).$$

Also note that $A(-\hat a,\lambda)=-A(\hat a,\lambda)$ and similarly for $B$ but we won't really need these simple statements about the "opposite axes". The average product of the spins is then given by averaging the product over the hidden variables,

$$P(\hat a, \hat b) = \int d\mu(\lambda)\, A(\hat a,\lambda) B(\hat b,\lambda) = -\! \int \! A(\hat a,\lambda) A(\hat b,\lambda),$$

where $d\mu(\lambda) = \rho(\lambda) d\lambda$ is an integration measure that determines the probability distribution for the hidden variables – and I switched to writing simply $\int$ instead of $\int d\mu(\lambda)$. The measure is normalized to $\int 1 = 1$.

To find some clearly visible difference between quantum and classical physics, we actually need three axes and not just two to measure the projections of the spins. We may replace $\hat b$ by another axis $\hat c$ to write the same formula as above for $P(\hat a,\hat c)$ as well as the following difference:

$$\begin{aligned} P(\hat a, \hat b) - P(\hat a, \hat c) &= \int \left[ - A(\hat a,\lambda) A(\hat b,\lambda) + A(\hat a,\lambda) A(\hat c,\lambda) \right] \\ &= - \int A(\hat a,\lambda) A(\hat b,\lambda) \left[ 1 - A(\hat b,\lambda) A(\hat c,\lambda) \right] \end{aligned}$$

In the last step, we used $A^2(\hat b,\lambda)=+1$. Next you should notice that

$$\left| A(\hat b,\lambda) A(\hat c,\lambda)\right| \leq 1\,\Rightarrow\, \left[ 1 - A(\hat b,\lambda) A(\hat c,\lambda) \right] \geq 0$$

which, together with $\left| A(\hat a,\lambda) A(\hat b,\lambda)\right| = 1$, implies that

$$\left| P(\hat a, \hat b) - P(\hat a, \hat c)\right| \leq \int \left[ 1 - A(\hat b,\lambda) A(\hat c,\lambda) \right]$$

By identifying the integral of the last term as the definition of the original $-P$ but for the axes $\hat b$ and $\hat c$, we can finally write down "the" Bell's inequality, a triangle-like inequality for three axes:

$$\left| P(\hat a, \hat b) - P(\hat a, \hat c)\right| \leq 1 + P(\hat b,\hat c)$$

Is it satisfied by the result from quantum mechanics? That would mean that

$$|\cos\vartheta_{ab} - \cos\vartheta_{ac}| \leq 1 - \cos\vartheta_{bc}$$

However, this inequality (Bell's inequality) is easily violated in quantum mechanics (as well as by the experiments). For example, choose $\hat a, \hat b, \hat c$ in the same plane, with $\hat a,\hat b$ orthogonal so that $\cos\vartheta_{ab}=0$, and with $\hat c$ in between them so that $\vartheta_{bc} = \pi/2 - \vartheta_{ac}$ and $\cos\vartheta_{bc} = \sin\vartheta_{ac}$. Then the inequality says that

$$|\cos\vartheta_{ac}| \leq 1 - \sin \vartheta_{ac}$$

which is actually violated for any $0 \lt \vartheta_{ac} \lt \pi/2$; for example, for $\vartheta_{ac}=45^\circ$ we obtain a wrong inequality $.707 \leq .293$.

This means that the experimentally verified quantum mechanical prediction cannot be computed from a "classical" theory with local hidden variables – and most likely, other hidden variable theories fail, too.
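The two sides of the argument can be compared in a few lines. The sketch below, assuming NumPy, enumerates the eight deterministic assignments $A(\hat a), A(\hat b), A(\hat c)=\pm 1$ with $B=-A$; because the left-hand side of the inequality is convex and the right-hand side is linear in the probability distribution over $\lambda$, it is enough to check these extreme cases. The helper name `bell_gap` is a choice made for this example.

```python
import itertools
import numpy as np

def bell_gap(p_ab, p_ac, p_bc):
    """1 + P(b,c) - |P(a,b) - P(a,c)|; Bell's inequality says this is >= 0."""
    return 1 + p_bc - abs(p_ab - p_ac)

# Local hidden variables: every value of lambda fixes A(a), A(b), A(c) = +1 or -1,
# and the correlations are P(x, y) = -A(x) A(y) for that lambda.
deterministic_gaps = []
for A_a, A_b, A_c in itertools.product([+1, -1], repeat=3):
    deterministic_gaps.append(bell_gap(-A_a * A_b, -A_a * A_c, -A_b * A_c))

# Quantum mechanics: P = -cos(angle), coplanar axes with a at 0 degrees,
# c at 45 degrees and b at 90 degrees.
t_ac = np.pi / 4
quantum_gap = bell_gap(-np.cos(np.pi / 2), -np.cos(t_ac), -np.cos(np.pi / 2 - t_ac))

print(min(deterministic_gaps), quantum_gap)
```

Every deterministic assignment saturates the inequality (the gap is exactly zero), so any mixture of them has a non-negative gap, while the quantum prediction gives $1-\sqrt{2}\approx -0.414$.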

#### Discussion of the difference between quantum and classical physics

Note that the previous section was dedicated to the calculation in classical physics. We didn't really know what the values $A(\hat a,\lambda)=\pm 1$ were. Classical physics didn't even tell us what the hidden variables $\lambda$ should be. Whatever $\lambda$ you chose, there would still be no natural and rotationally covariant choice of $A(\hat a,\lambda)$. Classical physics is an absolute mess if you try to use it to emulate the "binary" character of the spin in quantum mechanics.

But quantum mechanics makes all such things elegant.

Why could we show that no local classical theory was capable of producing the quantum result? Because classical physics is different from quantum mechanics, stupid. In classical physics, as I said, all the probabilistic knowledge is given by probability distributions on the phase space. Because the phase space contains mutually exclusive states, its elements are analogous to a basis of the Hilbert space in quantum mechanics. So the classical probability distribution is a diagonal matrix on a would-be basis.

On the other hand, the corresponding object – the density matrix in quantum mechanics – is a general matrix with off-diagonal matrix elements. All the continuous unitary rotations of states into each other are allowed in quantum mechanics. Generic operators and matrices – observables as well as density matrices – use these continuous rotations and continuous superpositions all the time. All these things are completely forbidden in classical physics. It's simply not surprising that classical physics whose rules are constrained – the off-diagonal elements in between the mutually exclusive states are forbidden everywhere – isn't capable of getting pure enough results (like strong enough correlations) that you may get in the more general quantum mechanics!
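To see what the off-diagonal elements do, one may compare the singlet with the "classical-like" diagonal mixture of $\ket\uparrow\ket\downarrow$ and $\ket\downarrow\ket\uparrow$. This is only an illustrative sketch, assuming NumPy; the diagonal mixture is a comparison state introduced here, not one used in the text.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
ud = np.kron(up, down)
du = np.kron(down, up)

# Singlet: a pure state whose density matrix has off-diagonal entries.
singlet = (ud - du) / np.sqrt(2)
rho_singlet = np.outer(singlet, singlet.conj())

# Diagonal mixture: 50% up-down, 50% down-up, no off-diagonal entries.
rho_mixture = 0.5 * np.outer(ud, ud.conj()) + 0.5 * np.outer(du, du.conj())

def expectation(rho, op):
    """Tr(rho op) for a Hermitian observable op."""
    return np.trace(rho @ op).real

zz = np.kron(sigma_z, sigma_z)
xx = np.kron(sigma_x, sigma_x)

print(expectation(rho_singlet, zz), expectation(rho_mixture, zz))  # both -1
print(expectation(rho_singlet, xx), expectation(rho_mixture, xx))  # -1 versus 0
```

Along the $z$-axis the two states are indistinguishable, but the perfect anticorrelation along the $x$-axis survives only when the off-diagonal elements are present.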

Note that classical physics really sucks. It failed to be predictive and didn't tell you what the hidden variables $\lambda$ were. Even if you pick some, the classical theory didn't tell you how much $A(\hat a,\lambda)=\pm 1$ is for various values of the direction $\hat a$ and the hidden variables $\lambda$. So you had lots of "freedom" – meaning the lack of predictivity of the theory. But despite this freedom, whatever choice of $\lambda$ and $A(\hat a,\lambda)$ you make, you will still be guaranteed that the correct quantum mechanical result is not reproduced.

So "some freedom" just isn't enough. If you start with too wrong or constrained a system, like the framework of classical physics, some freedom – even if it looks like "a lot of freedom" – is often insufficient to obtain the right result (to "fit the elephant"). There isn't any way to force local classical theories to produce the right result!

#### Locality: in classical and quantum physics

You may generalize classical physics so that it seemingly fakes the correct quantum mechanical theory. For example, you may invent a theory in which the wave function isn't a probabilistic variable analogous to the probability distribution on the phase space – and instead, it is a classical wave. If you allow this theory to be non-local i.e. for these objects to collapse in a specific way after Bob makes his measurement, the result of Alice may agree with the observations – and with quantum mechanics.

In other words, you may fake some results of quantum mechanics by "non-local classical theories".

However, if you have a fundamentally non-local classical theory, it fundamentally contradicts the special theory of relativity and it is virtually certain that these violations of locality and relativity will be visible in some experiments.

On the other hand, quantum mechanics is perfectly local while it makes the correct prediction $-\cos\vartheta$ for the correlation between the spins. What do I mean by "local"? The word "local" means and has always meant the following: if you have the most general description of a subsystem, in this case the fermion $A$ and Alice who measures it – and in quantum mechanics, we learned that the description must be probabilistic – then these probabilities for $A$ evolve independently of all events that occur at distant places, i.e. independently of the evolution of the other fermion, $B$, as well as of Bob's decisions, especially his decisions about what he wants to measure.

Why is it so? It's clearer in the Heisenberg picture. All the probabilities of conditions that may be measured to be Yes/No in Alice's lab are obtained as the traces ${\rm Tr} (\rho_A P_{L,A})$ where $\rho_A$ is the reduced density matrix for Alice's system and $P_{L,A}$ is the projection operator on the Yes subspace of the particular question we want to decide in Alice's lab. Now, in the Heisenberg picture, all these operators $\rho_A$ and $P_{L,A}$ are defined as matrices whose rows and columns are labeled by a basis of the Hilbert space of Alice's subsystem only, and the time evolution of $P_{L,A}$ is given by the local equations of motion, too.

The Yes/No operator $P_{L,A}$ is a functional of some local field operators in Alice's lab and those field operators evolve as locally in (relativistic) quantum field theory as they do in (relativistic) classical field theory. So all these probabilities will be independent of the evolution of some faraway field operators, and of any projection operators associated with Bob's measurements, too.

The mathematical property that guarantees the locality – the independence of Alice's and Bob's labs – is the vanishing of the commutators between the observables in Alice's lab and the observables in Bob's lab. This vanishing of the commutators is guaranteed in non-relativistic physics for a simple reason: the quantum mechanical Hilbert space for the Alice+Bob composite system is constructed as the tensor product of Alice's and Bob's separate Hilbert spaces. And in tensor product Hilbert spaces, the corresponding operators $L_A\otimes 1$ and $1\otimes L_B$ always commute with each other. The position operator $x$ may refuse to commute with the differential operator $\partial / \partial x$. But those in $L_A\otimes 1$ and $1\otimes L_B$ act on different variables $\vec x_A$ and $\vec x_B$ so they commute with each other, after all!
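The statement about tensor products may be verified for arbitrary observables. Here is a minimal sketch, assuming NumPy, with random Hermitian matrices standing in for $L_A$ and $L_B$ (the helper `random_hermitian` and the seed are illustrative assumptions); two non-commuting observables of the same particle are included for contrast.

```python
import numpy as np

rng = np.random.default_rng(1)      # fixed seed, an arbitrary choice

def random_hermitian(dim, rng):
    """A random Hermitian matrix, standing in for a generic observable."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (m + m.conj().T) / 2

L_A = random_hermitian(2, rng)      # some observable in Alice's lab
L_B = random_hermitian(2, rng)      # some observable in Bob's lab
identity = np.eye(2)

alice_op = np.kron(L_A, identity)   # L_A (x) 1
bob_op = np.kron(identity, L_B)     # 1 (x) L_B
commutator = alice_op @ bob_op - bob_op @ alice_op

# Contrast: sigma_x and sigma_z of the same particle do not commute.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
same_particle = sigma_x @ sigma_z - sigma_z @ sigma_x

print(np.allclose(commutator, 0), np.allclose(same_particle, 0))
```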

In quantum field theory, the description of the Alice+Bob composite system isn't constructed as a tensor product. Instead, both Alice and Bob live in the same space that also has some volume in between them. But it's still true that the (graded) commutators of operators at spacelike-separated points vanish in quantum field theory. This is what guarantees the independent evolution of the probabilities in Alice's lab – independent of the smooth and unitary evolution in Bob's lab; and independent of the would-be unsmooth projections associated with Bob's decisions to measure something.

It's important to realize that locality doesn't mean that there are no correlations. Even in local classical physics, two subsystems that were created at the same place in a correlated way – e.g. Bertlmann's socks (a crazy physicist in Vienna always has one red sock and one green sock but in the morning, it's not yet clear whether the green is the left one or vice versa, note that both greens and red Bolsheviks are left-wing) – are correlated with each other. This doesn't prove any non-locality. The exact same comment applies to quantum mechanics. There is nothing non-local about the perfect anticorrelations in quantum mechanics. There is no non-locality because the effect of evolution in Bob's lab and Bob's choice of the axis on the probabilities of purely Alice-lab-based questions is strictly equal to zero. Correlations between subsystems generally exist but all of them are always explained by the joint birth or co-existence or interactions between the subsystems in the past.
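The vanishing influence of Bob's choice on Alice's probabilities can be checked directly for the singlet state. This is a sketch, assuming NumPy; the function names and the sample axes are illustrative. Alice's probability of "up" is summed over both of Bob's outcomes for several choices of Bob's axis.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
rho = np.outer(singlet, singlet.conj())

def projectors(axis):
    """Projectors on spin 'up' and spin 'down' along the given axis."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    s = n[0] * sigma_x + n[1] * sigma_y + n[2] * sigma_z
    return (np.eye(2) + s) / 2, (np.eye(2) - s) / 2

def alice_prob_up(alice_axis, bob_axis):
    """Probability that Alice sees 'up', summed over both of Bob's outcomes."""
    p_up, _ = projectors(alice_axis)
    total = 0.0
    for q in projectors(bob_axis):
        total += np.trace(rho @ np.kron(p_up, q)).real
    return total

alice_axis = [0.0, 0.0, 1.0]
bob_axes = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.3, -0.5, 0.8]]
marginals = [alice_prob_up(alice_axis, b) for b in bob_axes]

print(marginals)
```

Each entry is one half: whatever axis Bob picks, the statistics Alice sees on her own are unchanged, even though the joint outcomes are anticorrelated.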

Once Alice and Bob etc. choose what they want to measure, they pick a basis of the Hilbert space. The density matrix may be rewritten in that basis and the off-diagonal elements of the density matrix in that basis become physically inconsequential. Meanwhile, when the relevant basis is chosen, the diagonal entries of the density matrix determine the probabilities just like the probability distributions did in classical physics. So the anticorrelation between quantum spins, "up-down" or "down-up", is an anticorrelation of exactly the same type as "red-green" or "green-red" between two Bertlmann's socks. Neither of these two anticorrelations depends on any action at a distance.

You would need non-locality in a classical theory to fake the quantum results. But that doesn't mean that our Universe is non-local because your classical theory – and any classical theory – is just wrong. In particular, the correct theory – quantum mechanics – says that the wave function is not an actual observable. It means that its collapse isn't an "objective phenomenon" that could cause some non-local influences. Instead, it may always be interpreted as the observer's learning of some new information.

Is this statement – the wave function is not observable – true? It's postulated to be true in quantum mechanics. You may call it "the Copenhagen Interpretation" but that's just a demagogic language attempting to delegitimize or obfuscate this fundamental rule of quantum mechanics. If the wave function in a "theory" is something else than a package of probabilistic objects describing the knowledge of an observer, then the "theory" isn't quantum mechanics! Also, this postulate of quantum mechanics may be verified experimentally. Whatever you do, you will not be capable of measuring the wave function (in a single copy of the experimental situation). So the wave function is just someone's probabilistic knowledge about the world and when it "collapses", it's just a change occurring in this observer's mind. It doesn't automatically imply any non-local action. Only if you could change the probabilities of Yes/No outcomes for a faraway lab, you would be the owner of non-local or voodoo powers. But it's calculable in quantum mechanics – and experimentally verifiable – that this influence at a distance is zero.

You may redefine the word "non-local" so that it means something else than the "effects in principle capable of willfully changing the probabilities for a faraway region". But if you redefine the adjective, e.g. so that "non-local" means "proving that 1+1=3" or "non-classical", it won't prove that the world is non-local. Instead, it will just prove that you are a moron because basic words such as "non-local" cannot really be redefined. Although he was a bit shy, John Bell was the first one among these morons when he pushed the totally wrong conclusions and terminology – but his contemporary disciples are way more moronic, of course.

#### Classical means non-quantum; the interpreters' synonym of "classical" is "realist"

I have completely avoided the word "realist" so far. Instead, I have used the word "classical". Again, a classical theory is a theory assuming that the system may always be found in one state – taken from a set known as the "phase space" – and the different states (elements of the phase space) are mutually exclusive with each other. Again, this basic assumption is totally violated in quantum mechanics where one may build complex superpositions interpolating between all the basis vectors – and generic superpositions are not mutually exclusive at all because they're not mutually orthogonal. States are only mutually exclusive if they're orthogonal.
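A quick way to see that non-orthogonal states are not mutually exclusive is to compute the probability of finding "up along a tilted axis" when the spin was prepared "up along $z$". The sketch below assumes NumPy; the parameterization of the tilted state and the sample angles are illustrative choices.

```python
import numpy as np

def spin_up_along(theta):
    """Spin-up state along an axis tilted by theta from the z-axis in the xz-plane."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)

def overlap_probability(state_1, state_2):
    """Probability |<state_1|state_2>|^2 that one state is found to be the other."""
    return abs(np.vdot(state_1, state_2)) ** 2

up_z = spin_up_along(0.0)
angles = [0.0, np.pi / 3, np.pi / 2, np.pi]
overlaps = [overlap_probability(up_z, spin_up_along(t)) for t in angles]

print(overlaps)
```

The probability equals $\cos^2(\vartheta/2)$ and only drops to zero for the opposite axis, $\vartheta=\pi$, which is the only case in which the two states are orthogonal and therefore mutually exclusive.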

The "interpreters" i.e. anti-quantum zealots are avoiding the word "classical" and use a more cryptic term "realist" – or no adjective at all – because they aren't even capable or willing to admit that classical physics could be wrong. So the assumption of classical physics, as I sketched it in the previous paragraph again, is a dogma for them. It's taboo. They're not even allowed to talk about it, to admit that they're making an assumption about the existence of the "only good set of mutually excluding states". This assumption is demonstrably wrong but they're cultists who are just unable to give the assumption up. Many of them are fanatical about this wrong assumption of classical physics because they're dirty Marxists and Marxism forces its brain-dead followers to believe in a kind of "objective reality" that really implies the general framework of classical physics. Both Bohm and Bell were early "Marxist activists" within physics and the amount of similar weeds that contaminate the discussions around physics has increased dramatically in recent years. Quantum mechanics is one of the millions of proofs that Karl Marx was full of šit but his fans don't want to allow any such proof.

Quantum mechanics says that the observations always affect the system and the observers and their choice of the questions therefore always matter for the results. This is hard to swallow for the Marxists – and Marxists Lite. But it is true. One can formulate the same novelty of quantum mechanics in various ways that sound more or less philosophical, more or less provocative, more or less spiritual, but the content is always exactly the same.

I need to point out that measurements done by Alice lead to a change of her wave function or density matrix and that changes her predictions for Bob's lab, too. But it's not an action at a distance because she can't willfully affect the result she will get – so she can't affect Bob's results, either. The measurement outcomes will be correlated in various ways but that correlation can be attributed neither to Alice's or Bob's free will nor to recent events in Alice's and Bob's labs. All these correlations boil down to the joint co-existence of the subsystems (two fermions, in our case) in the past. Alice's mere choice of what she wants to measure may be proven to have no effect on the probabilities in Bob's lab. That fact mathematically boils down to the vanishing commutators and/or the completeness relations that allow Alice's unit operator to be written as $\sum_i \ket i \bra i$ over any orthonormal basis.
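The completeness argument may be written out in one line. In the sketch below, $Q_j$ denotes a projection operator for one of Bob's outcomes and $\rho_B$ denotes Bob's reduced density matrix; both symbols are notation introduced for this illustration, while $\ket i$ is any orthonormal basis that Alice may choose to measure in:

$$p(\text{Bob gets } j) = \sum_i {\rm Tr}\left[\rho\, \left(\ket i \bra i \otimes Q_j\right)\right] = {\rm Tr}\left[\rho\, \Big(\sum_i \ket i \bra i \otimes Q_j\Big)\right] = {\rm Tr}\left[\rho\, \left(1\otimes Q_j\right)\right] = {\rm Tr}\left(\rho_B Q_j\right)$$

The basis $\ket i$ has dropped out of the final expression, so Bob's probabilities cannot depend on which observable Alice decided to measure.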

As I said, there is one "really non-ideologically sounding" way to describe the difference between classical physics and quantum mechanics. Probability distributions are "diagonal" in the mutually exclusive states' basis in classical physics (probability distributions on phase space); but in quantum mechanics, the general probability distributions are matrices (density matrices) that also include all the off-diagonal entries. Quantum mechanics enables the off-diagonal complex matrix entries of probability distributions; classical physics forbade them. That's another way to see why classical physics is a "special case or limit" of quantum mechanics.

If you're used to thinking within classical physics, off-diagonal elements in probability distributions (in between mutually exclusive states) may sound counterintuitive. But that is just your psychological problem, not a problem of quantum mechanics. What matters is that the density matrices – with off-diagonal entries which are also included in all the generic observables – are logically consistent and compatible with all the observations. Indeed, all the off-diagonal matrix elements of the density matrix are ultimately "forgotten" during the measurement. But the funny new trick is that the observer has control over the choice of the basis – the observable that is measured – so he or she has control over which part of the information in the density matrix matters and which part is forgotten! The "freedom to have the complex superpositions" is what makes the observers relevant. Someone has to choose the basis in which the off-diagonal entries will be forgotten!

This is how "the non-zero commutators, uncertainty principle, off-diagonal matrix entries, and all that" are equivalent to the "fundamental role played by observers in quantum mechanics". Do you get it? The former sounds mathematical in nature, clearly needed for the right mathematical description etc.; the latter sounds "strange, controversial, ideological, spiritual". But they are really the same thing. Because the things are matrix-like and non-commuting, the observer must matter when the laws of quantum mechanics are being applied! The observer is the one who chooses the relevant basis, the relevant perspective from Bohr's comments about the complementarity etc., and this choice is absolutely critical if the laws of quantum mechanics are physically applied at all. No doubter or interpreter of quantum mechanics understands this elementary point so all of them are wrong about virtually everything they say. They typically want to preserve the mathematics of quantum mechanics but eradicate all the observers etc. But that's a logically inconsistent position because the observer or the observation is nothing else than "the entity that may be said to be responsible for choosing the relevant basis of the Hilbert space during the application of the apparatus" which is clearly needed given the existence of infinitely many, continuously connected choices.

In quantum mechanics, we are not restricted to testing whether the system is found at one of the points of the phase space, the mutually exclusive predetermined states. Instead, we may measure whether the system is in any state of the Hilbert space – and those can continuously change from one state to another because they're complex superpositions of each other. The probabilities for any such states (eigenstates of any observables) are calculable from the well-defined formalism. Because the probabilities are the only aspects of the laws of physics that may be measured (by repeating the experiments many times), a theory that allows you to calculate them is a complete theory.

Quantum mechanics is a complete theory and the successful non-gravitational quantum mechanical theories also respect the independence of the faraway subsystems. In particular, relativistic quantum mechanical theories without gravity are quantum field theories that are local. All probabilities of outcomes obtained in one region only are completely independent of the events and decisions made in any spacelike-separated region. That boils down to the vanishing (graded) commutators between spacelike-separated field operators.

I've spent an hour writing this text again but tons of people write whole books – with hundreds of pages – and they are still incapable of understanding the totally straightforward conclusions. At least 50% of their statements keep on being completely wrong. What is so difficult about these insights?