Friday, March 11, 2016

Measurement isn't a violation of unitarity

In the mid 1920s, thanks to the conservative yet revolutionary work of giants such as Heisenberg, the foundations of physics switched from the old framework of classical physics to the postulates of quantum mechanics.

The new general rules have been completely understood. Since that time, only the search for the "right Hamiltonians" and "their implications" has remained open. The new philosophical underpinnings were shown to be consistent and complete, nothing has changed about them since 1925 (or we might put the threshold at 1927 so that the papers clarifying the uncertainty principle etc. are included), and all the evidence suggests that there's no reason to expect any change to these basic philosophical foundations of physics in the future.

Florin Moldoveanu doesn't like these facts because much of his work (and maybe most of it) is based on the denial of the fact that quantum mechanics works and it works remarkably well. So he wrote, among other things:

Luboš sees no value in the quantum foundations community because the proper interpretation was settled in his opinion long time ago and all quantum foundations practitioners must be crackpots (obviously there is no love lost between the quantum foundation community and Luboš).
Apparently to show that something hasn't been clear about the basic rules of the game since the 1920s, he wrote a blog post dominated by the basic introduction to the Leibniz identity, the Jacobi identity, and tensor products. Are you joking, Florin? While the universal postulates of quantum mechanics have been known since the 1920s, the "fancy new hi-tech topics" that you discussed now have been known at least since the 19th century!

Moldoveanu wants to impress you with the (freshman undergraduate algebra) insight that Hermiticity is related to unitarity and the Leibniz identity, and so on. The precise "equivalences" he describes are confused – it is the Hermiticity (of the Lie algebra generators), and not the Leibniz identity, that is equivalent to the unitarity (of Lie group elements). The Leibniz identity works even for non-unitary operations and it is how the differential generators always act on product representations (e.g. composite systems in quantum physics), according to the dictionary between the Lie groups and Lie algebras.

But I don't want to analyze his technical confusion, which is intense. It would be easier to make him forget everything he knows and teach him from scratch than to try to fix all his mistakes.

I want to focus on the big picture – and the historical picture. To argue that something hasn't been settled since the 1920s, he talks about the Leibniz identity, the Jacobi identity, the rules of the Lie groups and the Lie algebras. Are you listening to yourself, Florin? Leibniz lived between 1646 and 1716 so one should be able to figure out that the Leibniz identity probably wasn't born after 1925.

Even more relevant is the history of Lie groups and Lie algebras. Sophus Lie did most of this work between 1869 and 1874. The Lie algebra commutators were understood to obey the Jacobi identity, which had been known before Lie made his key contributions. Most of Jacobi's work was published in the 1860s but the publication was posthumous: Jacobi lived from 1804 to 1851. Killing and Cartan added their knowledge about the Cartan subalgebra and maximal tori etc. in the 1880s. All this mathematical apparatus was ready decades before physicists made their new insights about the foundations of physics in the 1920s.

In the same way, mathematicians understood representation theory. For example, if there are two independent objects, their properties are described by two sets of operators. The two sets commute with one another – this is clearly needed for the two objects to exist independently. The Hilbert space is a representation, which means that the minimum Hilbert space transforming under both sets of observables has to be a tensor product. The relevance of the tensor product ${\mathcal H}_A\otimes {\mathcal H}_B$ for the quantum description of a composite system was immediately obvious when quantum mechanics was presented. The mathematical underpinnings had been known for decades – and, in fact, Heisenberg had no trouble rediscovering the matrix calculus when he needed it. The tensor product Hilbert space appears because it's a representation of the group $G_A\times G_B$, a direct product that is needed to describe the observables of two parts of the composite system.
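If you want to see this in five lines of code, here is a minimal numerical sketch (assuming numpy; the choice of Pauli matrices as the sample observables is mine, not anything special): the observables of the two subsystems act as $A\otimes{\bf 1}$ and ${\bf 1}\otimes B$ on the tensor-product Hilbert space, and the two sets automatically commute.

```python
import numpy as np

# Pauli matrices as sample observables for two independent spin-1/2 systems
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Observables of subsystem A act as (operator ⊗ identity) on the
# tensor-product Hilbert space; those of subsystem B as (identity ⊗ operator)
A_op = np.kron(sx, I2)   # an observable of system A
B_op = np.kron(I2, sz)   # an observable of system B

# The two sets commute, as required for the two objects to exist independently
commutator = A_op @ B_op - B_op @ A_op
print(np.allclose(commutator, 0))  # True
```

The commutativity is exact and holds for any pair of single-system operators you substitute, which is the whole point of the direct-product group structure.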

Florin, are you really serious when you present these basic things as justifications of your claim that something fundamental about the general rules of quantum mechanics hasn't been clear since the 1920s?

Even though most of his blog post is dedicated to these basic, mostly 19th century, mathematical insights, the title reads
Why are unitarity violations fatal for quantum mechanics?
Indeed, unitarity violations would be fatal for a quantum mechanical theory – they would prevent the sum of all probabilities of mutually exclusive outcomes from being equal to 100 percent. However,
there is just no violation of unitarity in the theories we actually use to describe Nature.
Unitarity is indeed a universal rule – it is the quantum counterpart of some axioms in the usual probability calculus (where the sum of probabilities of different options is always 100 percent). Why does Moldoveanu think otherwise?

He thinks otherwise because he believes that the measurement introduces non-unitarity to quantum mechanics. The word "non-unitarity" only appears in the following sentence of his text:
Because the measurement problem must explain the non-unitary collapse, and since non-unitarity makes the mathematical framework of quantum mechanics inconsistent, the mathematical solution ultimately points out the right interpretation.
Sadly, this critical sentence is completely wrong – and the wrongness has implications. For example, it invalidates almost all papers by Moldoveanu that use the word "unitarity" because he just doesn't know what this condition is, when it holds, and whether it holds.

Unitarity is the condition in quantum mechanics that imposes the rule that "probabilities add up to 100 percent" within the quantum formalism. But what is unitarity, more accurately? By unitarity, quantum physicists mean exactly the same thing as the 19th century mathematicians. For a matrix $U$ with the matrix entries $U_{ij}$ – and similarly for an operator that may be defined without a choice of basis i.e. without indices – it is the condition $U U^\dagger={\bf 1}\quad \text{i.e.} \quad \sum_j U_{ij}U^\dagger_{jk} = \delta_{ik}.$ This is unitarity. In quantum mechanics, it holds whenever $U$ is an evolution operator (by a finite or infinite time i.e. the S-matrix is included) – or the operator of any finite transformation, for that matter (e.g. the rotation of a system by an angle).

The evolution operators in non-relativistic quantum mechanics, Quantum Electrodynamics, the Standard Model, and string theory (among many other theories) perfectly obey this condition. That's why we say that all these quantum mechanical theories are unitary – they pass this particular health test.
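This health test can be run numerically in a few lines (a sketch assuming numpy and scipy; the random $4\times 4$ Hamiltonian and the time $t=0.7$ are arbitrary choices of mine): any Hermitian Hamiltonian produces an evolution operator $U=\exp(-iHt)$ obeying $UU^\dagger={\bf 1}$.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Any Hermitian Hamiltonian gives a unitary evolution operator U = exp(-iHt)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = (M + M.conj().T) / 2          # Hermitian by construction
U = expm(-1j * H * 0.7)           # evolution by a finite time t = 0.7 (hbar = 1)

# Unitarity: U U^dagger = 1, i.e. probabilities add up to 100 percent
print(np.allclose(U @ U.conj().T, np.eye(4)))  # True
```

The check passes for every Hermitian $H$ you substitute – which is the Lie-algebra/Lie-group dictionary (Hermitian generator, unitary group element) mentioned above.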

Moldoveanu and tons of other anti-quantum zealots want to contradict this statement by pretending that the measurement of a quantum system is a modification of Schrödinger's equation that deviates from the action of the unitary evolution operators above, and is therefore non-unitary.

But that's the result of completely wrong and sloppy thinking about all these concepts.

The collapse doesn't mean that there is a violation of unitarity. To understand this simple sentence, we must be careful and ask "what are the objects that are unitary?". The answer is that the unitary matrices such as $U$ above are the
matrices whose entries are the probability amplitudes.
The general postulate in quantum mechanics that we have referred to is that the matrices of the evolution operators' probability amplitudes – between the basis of possible initial states and the basis of possible final states – are unitary. And rest assured that they are, and the measurements don't change anything about it.

Why don't they change anything about it? Because the "sudden collapse of the wave function" that the measurement induces isn't a modification of the evolution operator or a deformation of Schrödinger's equation. Instead, the "sudden collapse" is an interpretation of the wave function.

Quantum mechanics says that after the measurement, one of the possible outcomes becomes true. It "even" allows us to calculate the probabilities of the individual outcomes. But the very fact that quantum mechanics says something about the probabilities of the outcomes implicitly means that one of the outcomes will become the truth after the measurement. This simple claim is implicitly included in all the rules of quantum mechanics. We may obviously add it explicitly, too.

When we measure whether a cat is dead or alive, and quantum mechanics predicts the probabilities to be 36% and 64%, there can't be any "vague mixed semi-dead semi-alive" state of the cat after the measurement. This claim logically follows from the statement that the "probabilities of dead and alive are 36% and 64%" and it doesn't need any additional explanation.

If it were possible for the measurement of the cat to yield some vague "semi-dead semi-alive" outcome, the probabilistic statement would have to allow this option. To do so, quantum mechanics would have to predict that the "probability is 30% for dead, 60% for alive, and 10% for some semi-dead semi-alive fuzzy mixture". But when the laws of quantum mechanics omit the third option, it means that this option's probability is 0%, which means that it is impossible for the post-measurement state to be semi-dead, semi-alive. If you need some extra explanations or repetitions of this fact – that ill-defined post-measurement outcomes are banned by quantum mechanics – then it is because you refuse to think, Florin, not because the foundations of quantum mechanics need some extra work.
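The point that only the listed outcomes can ever occur may be simulated directly (a sketch assuming numpy; the amplitudes $0.6$ and $0.8$ are chosen by me to reproduce the 36%/64% example above): sample measurement results according to the Born rule and observe that nothing outside the spectrum of outcomes ever appears.

```python
import numpy as np

rng = np.random.default_rng(1)

# Probability amplitudes for the two possible outcomes ("dead", "alive"),
# chosen so the Born-rule probabilities are 36% and 64%
psi = np.array([0.6, 0.8])           # |0.6|^2 = 0.36, |0.8|^2 = 0.64
probs = np.abs(psi) ** 2
assert np.isclose(probs.sum(), 1.0)  # probabilities add up to 100 percent

# Each measurement yields exactly one of the listed outcomes -- there is
# no third "semi-dead semi-alive" option because its probability is 0%
outcomes = rng.choice(["dead", "alive"], size=10_000, p=probs)
print(set(outcomes))  # only the two listed outcomes ever occur
```

Every run produces a sharp outcome; the "fuzzy mixture" simply isn't in the sample space, exactly as the postulates dictate.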

The ultimate reason why Moldoveanu and others refuse to understand simple points like that – e.g. the point that there is no non-unitarity added by the measurement – is that they are refusing to think quantum mechanically. When we say that the matrix entries of an evolution operator are probability amplitudes, we understand it but the likes of Moldoveanu don't. They may hear the words but they ignore their content.

They totally overlook the fact that the matrix entries that decide about the unitarity are probability amplitudes. They just think that they are some classical degrees of freedom (that objectively exist and don't require observers), that Schrödinger's equation is a classical evolution equation, and that the measurement must be "modelled" as an exception to Schrödinger's equation or some deformation of it.

But all these assumptions are completely wrong. The wave function is not a classical wave. It is not a set of classical degrees of freedom. Schrödinger's equation isn't an example of a classical evolution equation. And the measurement isn't described by anything that looks like an equation for the evolution at all. The measurement yields sharp outcomes because quantum mechanics postulates that there are sharp outcomes – the spectrum of an operator lists all the a priori possible outcomes – and it tells you how to calculate their probabilities from the complex probability amplitudes.

It's only the probability amplitudes that may be meaningfully organized into linear operators and therefore matrices. If you want to engineer some "action on a wave function that also visualizes the collapse", then you are trying to construct a classical model describing the reality. You are not doing a proper quantum analysis of the problem. And if you created such a model where the wave function is a classical field that "collapses" according to some equations, the "operation" wouldn't even be linear, so it wouldn't make sense to ask whether it's unitary.

In fact, the operation on the initial wave function wouldn't even be a map, because even if the initial state is exactly the same twice, the final outcomes may differ – because of the quantum indeterminacy or randomness. Because this assignment of a final state isn't even a map (the final outcomes of the measurements aren't uniquely determined by the initial state), it makes absolutely no sense to talk about its being unitary. Only for maps – and yes, you need linear maps – can you meaningfully talk about unitarity. For other "processes", the adjective is ill-defined (like asking "whether the number five is green"). The "operation of the collapse" on the wave function isn't unitary, but it isn't non-unitary, either: it isn't a map, so the question is meaningless.

And if you managed to "redefine" the transformations in some way so that the act of the measurement would count as "non-unitary evolution", despite its randomness (failure to be a map) and nonlinearity, then it wouldn't be a problem, anyway. What's needed for consistency of the theory is the unitarity of the pre-measurement probability amplitudes (because the unitarity plays the same role as the conditions for probabilities that should add to 100 percent etc.), not some probability amplitudes modified by random-generator-dependent "collapses". So even if the collapse were redefined as a "non-unitary evolution of a sort", it just wouldn't mean that there is a problem to worry about or to solve.

Again, in the normal approach, the object whose unitarity is a meaningful question is the matrix/operator of the probability amplitudes (defining an evolution or a transformation). Those don't contain any "collapses" because the very meaning of the word "probability" is that we substitute the "widespread" distributions "before" we know a particular outcome i.e. without any collapses. And the matrices of probability amplitudes for evolution operators must be unitary in all logically consistent quantum mechanical theories.

Even if you are a bit confused about the logic, you should be able to understand that there is almost certainly "nothing intelligent and deep" waiting to be found here. Moldoveanu's and similar people's "work on the foundations" is just an artifact of their inability to understand some very simple logical arguments fully described above – and at many other places. They're crackpots but like most crackpots, they work with the assumption that they can never be wrong. That's not a good starting point to understand modern physics.

Pedagogic bonus: from classical physics to quantum mechanics

I am afraid that I have written very similar things to this appendix in the past. But even if it is the case and the text below fails to be original, repetition may sometimes be helpful. Here's a way to see in what way quantum mechanics generalizes classical physics – and why it's foolish to try to look for some "problems" or "cure to problems" in the process of the measurement.

A theory in classical mechanics may be written in terms of the equations for the variables $x(t),p(t)$: $\frac{dx}{dt} = \frac{\partial H}{\partial p}, \quad \frac{dp}{dt} = -\frac{\partial H}{\partial x}$ for some Hamiltonian function $H(x,p)$, OK? Now, classical physics allows the objective state at every moment i.e. the functions $x(t),p(t)$ to be fully determined. But you may always switch to the probabilistic description, which is useful and relevant if you don't know the exact values of $x(t),p(t)$ – everything that may be known. Introduce the probability distribution $\rho(x,p)$ on the phase space that is real and normalized, $\int dx\,dp\, \rho(x,p)=1.$ It's trivial to have many copies of $x,p$: just add an index, rename some of the variables etc. Fine. What is the equation obeyed by the probability distribution $\rho(x,p;t)$? We are just uncertain about the initial state but we know the exact deterministic equations of motion. So we may unambiguously derive the equation obeyed by the probability distribution $\rho$. The result is the Liouville equation of statistical mechanics.

How do we derive it and what is it? The derivation will be addressed to adult readers who know the Dirac delta-function. If the initial microstate is perfectly known to be $(x,p)=(x_0,p_0)$, then the distribution at that initial moment is $\rho(x,p) = \delta (x-x_0) \delta(p-p_0).$ With this initial state, how does the system evolve? Well, the $x,p$ variables are known at the beginning and the evolution is deterministic, so they will be known at all times. In other words, the distribution will always be a delta-function located at the right location, $\rho(x,p;t) = \delta [x-x(t)] \delta[p-p(t)].$ What is the differential equation obeyed by $\rho$? Calculate the partial derivative with respect to time. You will get, by the Leibniz rule and the rule for the derivative of a composite function (the chain rule produces a factor of $-\dot x(t)$ or $-\dot p(t)$),$\eq{ \frac{\partial \rho (x,p;t)}{\partial t} &= -\delta'[x-x(t)] \dot x(t) \delta[p-p(t)]-\\ &- \delta[x-x(t)] \delta'[p-p(t)] \dot p(t) }$ or, equivalently (if we realize that $\rho$ is the delta-function and substitute it back),$\frac{\partial\rho}{\partial t} = -\frac{\partial \rho}{\partial x}\dot x(t)-\frac{\partial \rho}{\partial p}\dot p(t).$ This is the Liouville equation for the probabilistic distribution on the phase space, $\rho$. The funny thing is that this equation is linear in $\rho$. And because every initial distribution may be written as a continuous combination of such delta-functions and because the final probability should be a linear function of the initial probabilities, we may just combine all the delta-function-based basis vectors $\rho(x,p;t)$ corresponding to the classical trajectories $x(t),p(t)$, and we will get a general probability distribution that behaves properly.

In other words, because of the linearity in $\rho$ and because of the validity of the equation for a basis of functions $\rho(x,p;t)$, the last displayed equation, the Liouville equation, holds for all distributions $\rho(x,p;t)$.

Excellent. I emphasize that this Liouville equation is completely determined by the deterministic equations for $x(t),p(t)$. Aside from the totally universal, mathematical rules of the probability calculus, we didn't need anything to derive the Liouville equation. Nothing is missing in it. But when we measure an atom's location to be $x_1$, then the distribution $\rho(x,p;t)$ "collapses" because of Bayesian inference. We have learned some detailed information so our uncertainty has decreased. But this collapse doesn't need any "modifications" of the Liouville equation or further explanations because you may still assume that the underlying physics is a deterministic equation for $x(t),p(t)$ and all the $\rho$ stuff was only added to deal with our uncertainty and ignorance. The form of the Liouville equation is exact because it was the probabilistic counterpart directly derived from the deterministic equations for $x(t),p(t)$ which were exact, too.
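Both halves of this paragraph – "the Liouville equation is just deterministic dynamics plus ignorance" and "the collapse is just Bayesian inference" – can be sketched with a Monte Carlo ensemble (assuming numpy; the harmonic oscillator $H=(x^2+p^2)/2$ and the Gaussian initial uncertainty are my illustrative choices). Each sample evolves by the exact deterministic solution; the ensemble represents $\rho(x,p;t)$; a measurement just discards incompatible samples.

```python
import numpy as np

rng = np.random.default_rng(2)

# Ensemble description of a classical harmonic oscillator, H = (x^2 + p^2)/2:
# each sample evolves deterministically (a rotation of the phase plane),
# and the ensemble as a whole represents the distribution rho(x, p; t)
x0 = rng.normal(1.0, 0.3, 50_000)
p0 = rng.normal(0.0, 0.3, 50_000)

t = 0.9
x_t = x0 * np.cos(t) + p0 * np.sin(t)    # exact solution of Hamilton's
p_t = -x0 * np.sin(t) + p0 * np.cos(t)   # equations dx/dt = p, dp/dt = -x

# A measurement finding x > 1 "collapses" rho by Bayesian inference:
# we simply discard the samples incompatible with what we learned
x_post = x_t[x_t > 1]
print(x_post.min() > 1)  # True: the distribution has suddenly shrunk
```

No modification of the dynamics was needed for the "collapse" – conditioning the ensemble is pure probability calculus, exactly the point being made.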

What changes in quantum mechanics? The only thing that changes is that $xp-px=i\hbar$ rather than zero. This has the important consequence that the deterministic picture beneath everything in which $x(t),p(t)$ are well-defined $c$-number functions of time is no longer allowed. But the equation for $\rho$ is still OK.

Before we switch to quantum mechanics, we may substitute the Hamilton equations to get$\frac{\partial\rho}{\partial t} = -\frac{\partial \rho}{\partial x}\frac{\partial H}{\partial p}+\frac{\partial \rho}{\partial p}\frac{\partial H}{\partial x}$ and realize that this form of the Liouville equation may be written in terms of the Poisson bracket$\frac{\partial \rho(x,p;t)}{\partial t} = \{H(t),\rho(x,p;t)\}_{\rm Poisson}.$ That's great (up to a conventional sign of the Poisson bracket that may differ). This equation may be trusted even in quantum mechanics where you may imagine that $\rho$ is written as a function (imagine some Taylor expansion, if you have a psychological problem that this is too formal) of $x,p$. However, $x,p$ no longer commute, a technical novelty. But the density matrix $\rho$ in quantum mechanics plays the same role as the probability distribution on the classical phase space in classical physics. You may imagine that the latter is obtained from the former as the Wigner quasiprobability distribution.

Because of the usual, purely mathematically provable relationship between the Poisson brackets and the commutator, we may rewrite the last form of the Liouville equation as the von Neumann equation of quantum mechanics$i\hbar\,\frac{d\rho(t)}{dt} = [H,\rho(t)]$ that dictates the evolution of the density matrix or operator $\rho$. (Thankfully, people agree about the sign conventions of the commutator.) It can no longer be derived from a deterministic starting point where $x(t),p(t)$ are well-defined $c$-numbers – they cannot be sharply well-defined because of the uncertainty principle (i.e. nonzero commutator) – but the probabilities still exist and no modifications (let alone "non-unitary terms" etc.) are needed for the measurement. The measurement is just a version of the Bayesian inference. It's still basically the same thing but this inference must be carefully described in the new quantum formalism.

If you like Schrödinger's equation, it is not difficult to derive it from the von Neumann equation above. Any Hermitian matrix $\rho$ may be diagonalized and therefore written as a statistical mixture$\rho = \sum_j p_j \ket{\psi_j}\bra{\psi_j}.$ Because the von Neumann equation was linear in $\rho$, each term in the sum above will evolve "separately from others". So it is enough to know how $\rho=\ket\psi \bra\psi$ evolves. For this special form of the density matrix, the commutator is$[H,\rho] = H\rho - \rho H = H\ket\psi \bra \psi - \ket\psi \bra \psi H$ and these two terms may be nicely interpreted as the two terms in the Leibniz rule, assuming Schrödinger's equation$i\hbar \frac{d\ket\psi}{dt} = H\ket\psi$ and its Hermitian conjugate$-i\hbar \frac{d\bra\psi}{dt} = \bra\psi H.$ So if the wave function $\ket\psi$ obeys this equation (and its conjugate), then the von Neumann equation for $\rho=\ket\psi\bra\psi$ follows from that. The implication works in the opposite direction as well (Schrödinger's equation follows from the von Neumann equation if we assume the density matrix to describe a "pure state") – except that the overall phase of $\ket\psi$ may be changed in a general time-dependent way.
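The equivalence of the two pictures can be verified numerically for a finite time (a sketch assuming numpy and scipy; the random $3\times 3$ Hamiltonian and $t=0.5$ are arbitrary): evolving the pure state by Schrödinger's equation, $\ket{\psi(t)}=U\ket{\psi(0)}$, gives the same density matrix as evolving $\rho$ by the von Neumann equation, $\rho(t)=U\rho(0)U^\dagger$.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)

# A random Hermitian Hamiltonian and a normalized pure state (hbar = 1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = (M + M.conj().T) / 2
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)
psi /= np.linalg.norm(psi)

t = 0.5
U = expm(-1j * H * t)

# Evolve the pure state by Schroedinger's equation: |psi(t)> = U |psi(0)>
psi_t = U @ psi
rho_t_from_psi = np.outer(psi_t, psi_t.conj())

# Evolve the density matrix by the von Neumann equation: rho(t) = U rho(0) U^dagger
rho0 = np.outer(psi, psi.conj())
rho_t = U @ rho0 @ U.conj().T

print(np.allclose(rho_t, rho_t_from_psi))  # True
```

The agreement is exact (up to floating-point error), and the unobservable overall phase of $\ket\psi$ drops out of $\rho=\ket\psi\bra\psi$ automatically.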

The pure state $\ket\psi$ corresponds to the "maximum knowledge" in the density matrix $\rho=\ket\psi\bra\psi$. In quantum mechanics, it still leads to probabilistic predictions for most questions, because of the uncertainty principle. Mixed states are mixtures of terms of the form $\ket{\psi_i}\bra{\psi_i}$. The coefficients or weights are probabilities and this way of taking mixtures is completely analogous (and, in the $\hbar\to 0$ limit, reduces) to classical probability distributions that are also "weighted mixtures".
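The distinction between "maximum knowledge" and a weighted mixture has a standard numerical fingerprint, the purity ${\rm tr}(\rho^2)$ (a sketch assuming numpy; the particular states and the 36%/64% weights are my illustrative choices): it equals $1$ for a pure state and is strictly below $1$ for any nontrivial mixture.

```python
import numpy as np

# A pure state has tr(rho^2) = 1 -- "maximum knowledge"
psi = np.array([1.0, 1.0]) / np.sqrt(2)
rho_pure = np.outer(psi, psi.conj())

# A weighted mixture of two pure states with probabilities 0.36 and 0.64
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
rho_mixed = 0.36 * np.outer(up, up) + 0.64 * np.outer(down, down)

print(np.isclose(np.trace(rho_pure @ rho_pure).real, 1.0))  # True
print(np.trace(rho_mixed @ rho_mixed).real < 1.0)           # True (0.5392)
```

For the mixture, ${\rm tr}(\rho^2)=0.36^2+0.64^2=0.5392$, quantifying the extra classical-style ignorance carried by the weights.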

Because we have deduced the quantum equations from the classical ones, it's as silly as it was in classical physics to demand some "further explanations" of the measurement, some "extra mechanisms" that allow the unambiguous result to be produced. In classical physics, it's manifestly silly to do so because we may always imagine that the exact positions $x(t),p(t)$ have always existed – we just didn't know what they were and that's why we have used $\rho$. When we learn, the probability distribution encoding our knowledge suddenly shrinks. End of the story.

In quantum mechanics, we don't know the exact values $x(t),p(t)$ at a given time. In fact, we know that no one can know them because they can't simultaneously exist, thanks to the uncertainty principle. But the probabilistic statements about $x,p$ do exist and do work, just like they did in classical statistical physics. But the Schrödinger or von Neumann equation is "as complete" and "as perfectly beautiful" as their counterpart in classical physics, the Liouville equation of statistical physics. The latter was ultimately derived (and no adjustments or approximations were needed at all) from the deterministic equations for $x(t),p(t)$ that the critics of quantum mechanics approve. We just allowed some ignorance on top of the equations for $x(t),p(t)$ and the Liouville equation followed via the rules of the probability calculus.

So the Liouville equation just can't be "less satisfactory" than the classical deterministic laws for $x(t),p(t)$. Nothing is missing. And the von Neumann and Schrödinger equations are exactly analogous equations to the Liouville equation – but in systems where $xp-px=i\hbar$ is no longer zero. So the von Neumann or Schrödinger equations must unavoidably be complete and perfectly satisfactory, too. They still describe the evolution of some probabilities – and, because of the nonzero imaginary commutator, of complex probability amplitudes. Because of the uncertainty principle, some ignorance and uncertainty – and probabilities strictly between 0 and 100 percent – are unavoidable in quantum mechanics. But the system of laws is exactly as complete as it was in classical statistical physics. No special explanation or mechanism is needed for the measurement because the measurement is still nothing else than a process of the reduction of our ignorance. In this process, $\rho$ suddenly "shrinks" because it's one step in Bayesian inference. It has always been.

In classical physics, this Bayesian inference may be thought of as our effort to learn about some "objectively existing truth". In quantum mechanics, no objective truth about the observables may exist because of the uncertainty principle. But the measurement is still a process analogous to Bayesian inference. It improves our subjective knowledge – shrinks the probability distribution – as a function of the measured quantity. But because of the nonzero commutator, the measurement reduces (well, eliminates) our uncertainty about the thing we measure while it increases our uncertainty about the observables that "maximally" fail to commute with the measured one.

In quantum mechanics, our measurements are not informing us about some "God's and everyone's objective truth" (as in classical physics) because none exists. But they're steps in learning about "our subjective truth" that is damn real for us because all of our lives will depend on the events we perceive. In most practical situations, the truth is "approximately objective" (or "some approximate truth is objective"). Fundamentally, the truth is subjective but equally important for each observer as the objective truth was in classical physics.

But just try to think about someone who says that a "special modification of the Liouville equations of motion" is needed for the event when we look at a die that was tossed and see a number. The probability distribution $\rho$ collapses. Well, there is nothing magic about this collapse. We are just learning about a property of the die we didn't know about – but we do know it after the measurement. The sudden collapse represents our learning, the Bayesian inference. In classical physics, we may imagine that what we're learning is some "objective truth about the observables" that existed independently of all observers and was the "ultimate beacon" for all observers who want to learn about the world. In quantum mechanics, no such "shared objective truth" is possible but it's still true that the measurement is an event when we're learning about something and the collapse of the wave function (or density matrix) is no more mysterious than the change of the probabilities after the Bayesian inference that existed even in classical physics.
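The die example is short enough to write out as code (assuming numpy; the "we learn the result is even" observation is my illustrative choice): Bayesian inference zeroes out the incompatible outcomes and renormalizes, and that renormalization is the entire content of the "collapse".

```python
import numpy as np

# Prior distribution for a fair die: nothing is known yet
prior = np.full(6, 1 / 6)

# We look and learn only that the result is even -- Bayesian inference:
# zero out the incompatible outcomes and renormalize
likelihood = np.array([0, 1, 0, 1, 0, 1], dtype=float)  # faces 2, 4, 6
posterior = prior * likelihood
posterior /= posterior.sum()

print(posterior)  # probability 1/3 on each even face, 0 elsewhere
```

Nobody demands a "special modification of the Liouville equation" to accompany this renormalization; the quantum collapse is the same kind of step written in the language of amplitudes and density matrices.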

I am confident – and I saw evidence – that many of you have understood these rather crystal clear facts about the relationship between classical physics, quantum mechanics, measurements, and probabilities. But maybe people like Florin Moldoveanu don't want to understand. Maybe it's natural to expect them not to understand these simple things because their jobs often depend on their continued ignorance, confusion, and stupidity.