Saturday, December 17, 2011

EPR correlations in Heisenberg picture

Niels Bohr, Werner Heisenberg, and others realized that physics isn't a framework describing how the world is; physics is a human activity studying the true statements we may make about our observations.

Bohr, Heisenberg, Pauli

Classical physics may of course be phrased in this way as well. In classical physics, we may make a statement about the initial state, for example
\[ \vec x = (x,y,z), \qquad \vec v = (v_x,v_y,v_z) \] where the components are particular numbers. The dynamical equations of classical physics (imagine a planet orbiting the Sun) allow us to calculate \(\vec x,\vec v\) in the final state, after some period of time, as some calculable functions of the values of positions and velocities in the initial state. So we may decide which statements about the final quantities of the type \(\vec v^{\rm final}=(u_x,u_y,u_z)\) are true and which of them are false.
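This classical determinism can be made concrete with a small numerical sketch (my own illustration in Python with numpy, in units where \(GM=1\); nothing here comes from the text itself): integrating Newton's equations for a planet orbiting the Sun turns definite initial values of \(\vec x,\vec v\) into definite final values.

```python
import numpy as np

# Toy classical evolution: a planet orbiting the Sun (units with GM = 1)
def evolve(x, v, t, dt=1e-4):
    """Leapfrog (velocity Verlet) integration of dx/dt = v, dv/dt = -x/|x|^3."""
    for _ in range(int(t / dt)):
        a = -x / np.linalg.norm(x)**3
        v_half = v + 0.5 * dt * a
        x = x + dt * v_half
        a = -x / np.linalg.norm(x)**3
        v = v_half + 0.5 * dt * a
    return x, v

# Circular orbit of radius 1: the period is 2*pi, so we return to the start
x0, v0 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
x, v = evolve(x0, v0, 2 * np.pi)
assert np.allclose(x, x0, atol=1e-3) and np.allclose(v, v0, atol=1e-3)
```

The final \(\vec x,\vec v\) are unique functions of the initial data, so every statement such as \(\vec v^{\rm final}=(u_x,u_y,u_z)\) is sharply true or false.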

From this viewpoint, quantum mechanics only changes one thing: the observables such as \(x,p\) no longer commute with each other. All the other new features of quantum mechanics really follow from this fact.

Quantum mechanics is often considered "completely different" from classical physics, and it is. But if we use the Heisenberg picture, it's equally obvious that the nonzero commutators are the source of all the differences. Imagine that you consider the Kepler problem in quantum mechanics and you only change one thing: all the commutators are zero.

If that's so, the observables \(x,p\) still obey the same dynamical (=evolution in time) equations they always did – both in classical physics and in the Heisenberg picture of quantum mechanics.
\[ \frac{\rm d}{{\rm d}t} \vec x = \frac{\vec p}{m},\qquad \frac{\rm d}{{\rm d}t} \vec p = -\nabla V(\vec x) \] Because the commutator \([x,p]\) vanishes in our gedanken experiment, we may construct a basis of possible states such that each basis vector is a simultaneous eigenstate of all the \(x,p\) observables. Consequently, all the observables are well-defined and they evolve according to the classical equations. The relationships between the statements about \(x,p\) in the initial state and those in the final state will be identical to those in classical physics. If you manage to measure something, you're inevitably in one of the universal common eigenstates of \(x,p\) and because nothing aside from functions of \(x,p\) may be measured in this crippled world, by assumption, you will never need another basis. You will never be able to "know" that your initial state is a nontrivial linear superposition. You will never be able to observe predictable interference patterns, either. To all the inhabitants, your world ("quantum world with vanishing commutators") will be indistinguishable from a classical one.

However, in proper realistic quantum mechanics, observables generically don't commute with each other. In particular, as simple quantities as a position (or the value of a field) and its time derivative typically have a nonzero commutator. That means that according to the Heisenberg equations, the observables \(\hat x\) etc. in the final state may still be written as various functions of \(\hat x,\hat p\) in the initial state. However, because of the nonzero commutator, you must be careful about the ordering and, what is even more important, you can no longer assume that the initial state is a simultaneous eigenstate of \(x,p\). There is no simultaneous eigenstate of operators whose commutator is \(i\hbar\). You must be much more careful how you describe the assumptions about the initial state.

Totally off-topic but funny: since the beginning of this month, I've sold 432 light bulbs that will probably be banned in 2 weeks :-)

For example, you may say that in the initial state, a component of the intrinsic angular momentum obeys \(\hat J_x=+1/2\), which we know, probably because we actually measured it. The spin is rotating and undergoing precession according to some laws. For the sake of simplicity, assume that the spin is conserved, i.e. we may make another measurement before it can change substantially. Can we predict the final value of \(\hat J_z\)?

We won't be able to give a clear answer because (the final value of) \(\hat J_z\) isn't a function of (the initial value of) \(J_x\). They're independent observables and they don't commute with each other. What we can do is to predict probabilities. How do we predict the probability that \(J_z=+1/2\)? It's simple. For every Yes/No question, we must construct an operator whose eigenvalues are equal to 1/0 for Yes/No. For the question "is the spin \(J_z\) equal to \(+1/2\)?", the projection operator is simply
\[ P = J_z + \frac 12. \] Note that if \(J_z=-1/2\), we have \(P=0\) which means No. If \(J_z=+1/2\), we get \(P=1\) which means Yes. But if the initial state has \(J_x=+1/2\), we can't predict the clear value of \(P\), either. \(P\) fails (a better word is "refuses") to commute with \(J_x\) much like \(J_z\) does.

However, the expectation value of \(P\) is nothing else than the probability that the answer to the question hiding in \(P\) is Yes. This is true for any such projection operator. In this case, we find
\[ \langle P \rangle = \frac 12. \] Let me mention that the interpretation of the brackets is
\[ \langle P \rangle = \langle \psi | P | \psi\rangle \qquad {\rm or} \qquad {\rm Tr}\,(\rho P) \] for pure and mixed states, respectively. However, I don't really need to include any machinery for pure or mixed states at all. I may just use the \(\langle\cdots\rangle\) brackets which have the same "phenomenological" interpretation as they do in classical physics. For a set of mutually commuting operators, the probability distribution for possible values and their combinations obeys exactly the same logic as it does in the classical world, so by using the brackets, I don't necessarily introduce any new objects such as "state vectors".

At any rate, the probability that we find \(J_z=+1/2\) is 50%. For axes separated by angle \(\theta\) instead of orthogonal \(x,z\) axes, this probability would generalize to \(\cos^2(\theta/2)\). I needed to talk about expectation values because in this case, much like in the generic case, the Heisenberg equations didn't allow me to calculate a unique value of the observables in the final state out of the assumed values of some observables in the initial state. In some special cases, I will be able to derive such things: in that case, the probabilities of the outcomes are 0% or 100%, respectively. This is a special subclass of cases, of course.
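Both numbers can be checked with a few lines of numpy (a sketch of mine, not anything from the text): the projector \(P=J_z+\frac12\) has expectation value \(1/2\) in the \(J_x=+1/2\) eigenstate, and tilting the measured axis by an angle \(\theta\) away from \(x\) reproduces \(\cos^2(\theta/2)\).

```python
import numpy as np

# Spin-1/2 operators in the J_z eigenbasis (hbar = 1)
Jz = np.array([[0.5, 0.0], [0.0, -0.5]])
Jx = np.array([[0.0, 0.5], [0.5, 0.0]])
psi = np.array([1.0, 1.0]) / np.sqrt(2)        # the J_x = +1/2 eigenstate

# "Is J_z = +1/2?" -- the projector from the text; check it's idempotent
P = Jz + 0.5 * np.eye(2)
assert np.allclose(P @ P, P)
assert np.isclose(psi @ P @ psi, 0.5)          # <P> = probability = 50%

# Measurement axis at angle theta from x (in the x-z plane): P(+1/2) = cos^2(theta/2)
for theta in (0.0, np.pi / 3, np.pi / 2, np.pi):
    Jn = np.cos(theta) * Jx + np.sin(theta) * Jz
    Pn = Jn + 0.5 * np.eye(2)
    assert np.isclose(psi @ Pn @ psi, np.cos(theta / 2)**2)
```

At \(\theta=\pi/2\), i.e. for the orthogonal \(z\) axis, the formula reduces to the 50% result above.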

Finally, the EPR experiment begins

This discussion was meant to train the dear reader in the Heisenberg-picture reasoning. Every observable quantity, including measurable Yes/No questions, is associated with a Hermitian operator, i.e. one obeying \(L=L^\dagger\). Yes/No questions are associated with Hermitian operators obeying \(P^2=P\) as well, the so-called projection operators. If \(L=\dots\) may be derived from the assumptions about the initial state, the outcome for \(L\) is unique and unequivocal. In the generic situation, the final quantities don't commute with the initially determined ones, so only probabilities, expectation values etc. may be determined.

Now, imagine that we want to discuss an EPR entangled state of two electrons whose spins \(J_z\) are described in the ket-states and bra-states in a self-explanatory manner:
\[ \ket \psi =\frac{ \ket{{+-}} - \ket{{-+}} }{\sqrt{2}}. \] I chose a relative minus sign because this is what the singlet state of two electrons looks like. But the minus sign (or phases) won't change the probabilities of the different values of the spin. The probability is 50% that the first electron has \(J_z=+1/2\) and the second electron has \(J_z=-1/2\), and 50% that it is the other way around. At any rate, the results of the \(J_z\) measurements will be perfectly anticorrelated for the two electrons. This conclusion holds for any axis because the singlet state is rotationally invariant.

How will the description of the entangled state and the measurements look in the Heisenberg picture? In particular, we would like to know whether the evolving operators in the Heisenberg picture behave as "hidden variables" and whether they're analogous to Bertlmann's socks, and to what extent they are. (The left and right socks of Reinhold Bertlmann have, by Bertlmann's definition, different colors, a priori black or white. So you may always predict the color of the other sock if you measure one of them. But there's nothing mysterious going on here because this may be explained by an "objective" assignment of the colors prior to the measurement.)

In the Heisenberg picture, the operators are evolving according to the Heisenberg equations which have the form
\[ i\hbar \frac{{\rm d}}{{\rm d}t} L = [L,H] \] assuming that the laws of Nature don't explicitly depend on time. The commutators \([L,H]\) may be explicitly "calculated" and rewritten in terms of other (or the same) ordinary basic operators in your theory. So you may really eliminate the Hamiltonian from the picture and get back to equations that coincide with "Newton's" equations (without Lagrangians and Hamiltonians) except that all the objects in them are non-commuting operators.
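As a sanity check of the Heisenberg equation (a numpy sketch under my own conventions, with a hypothetical Hamiltonian \(H=\frac12\hbar\omega\,\sigma_z\) describing a precessing spin, not anything taken from the text): the evolved operator \(\sigma_x(t)=U^\dagger\sigma_x U\) satisfies \(i\hbar\,{\rm d}L/{\rm d}t=[L,H]\) and rotates into \(\sigma_y\), with the Hamiltonian eliminated from the final "Newton-like" statement.

```python
import numpy as np

hbar, omega, t = 1.0, 2.0, 0.4
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

H = 0.5 * hbar * omega * sz                  # assumed toy Hamiltonian: precession about z

def U(time):
    """exp(-i H time / hbar); trivial to build here because H is diagonal."""
    return np.diag(np.exp(-1j * np.diag(H) * time / hbar))

sx_t = U(t).conj().T @ sx @ U(t)             # Heisenberg-picture sigma_x(t)

# The operator precesses: sigma_x(t) = cos(omega t) sigma_x - sin(omega t) sigma_y
assert np.allclose(sx_t, np.cos(omega * t) * sx - np.sin(omega * t) * sy)

# Finite-difference check of the Heisenberg equation i hbar dL/dt = [L, H]
dt = 1e-6
lhs = 1j * hbar * (U(t + dt).conj().T @ sx @ U(t + dt) - sx_t) / dt
assert np.allclose(lhs, sx_t @ H - H @ sx_t, atol=1e-4)
```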

We're used to discussing the EPR correlations in Schrödinger's picture because it's the favorite "misleadingly realistic" picture of folks such as Einstein himself, whose very purpose in constructing the EPR gedanken experiment was to sling mud at quantum mechanics. No surprise that almost no one discusses these matters in the Heisenberg picture. But we will do so now.

The operators including \(\sigma_1 = 2J_z^{(1)}\) and \(\sigma_2 = 2J_z^{(2)} \) have been evolving in time, according to the Heisenberg equations above. Both of them have eigenvalues \(+1,-1\); the labels \((1)\), \((2)\) identify the electrons, the left or right one, respectively (the left-or-right property is also respected by the way we write the basis kets and bras).

Our first big question is what the operators \(\sigma_1\) and \(\sigma_2\) look like before the measurement. Of course, there is some ambiguity in the answer because we may choose infinitely many different bases of the Hilbert space – it's a part of our conventions. However, one thing we must realize is that the Hilbert space for the two spins is effectively four-dimensional. So all the operators will be \((4\times 4)\) matrices.

How do we know which "state" is the real one? In the Heisenberg picture, states are not evolving in time. This picture differs from the Schrödinger picture by a time-dependent complex rotation of the Hilbert space that guarantees that the state vector is fixed. It's the operators that evolve in time. Consequently, we may assume that the state of the system is
\[ \ket\psi = (1,0,0,0)^T \] at all times. So the first column is related to the reality. We don't really need to talk about vectors in the Hilbert space at all. We may describe the state by identifying the projection operator onto it,
\[ P_{\rm reality} = \ket \psi \bra \psi = \left( \begin{array}{ccccc} 1&0&0&\cdots&0\\ 0&0&0&\cdots&0\\ 0&0&0&\cdots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 0&0&0&\cdots&0 \end{array} \right) \] It only has a nonzero entry in the upper left corner! You may also call this projection operator \(\rho\), the density matrix. Things are simple for pure states. Saying that "we want to talk about the reality" is equivalent to saying \(P_{\rm reality}=1\). This "reality" condition implies many things about quantities we know, for example
\[ T_{\rm death\,\,year\,\,of\,\,Hitler} = 1945. \] Because of the special role we have attributed to the first column (or row), the matrix entry \(L_{11}\) of any operator \(L\) has a special meaning: it is the expectation value of \(L\)! The rest of the first row and the first column, i.e. the entries \(L_{1i}\) and \(L_{i1}\), contain some "but" information. If they're zero (or much much smaller than the expected eigenvalues of \(L\)), then we can treat \(L\) as a \(c\)-number: its value is sharply predicted to be \(L_{11}\). However, when the other entries in the first column and row (they're Hermitian conjugates to each other) are nonzero, we may possibly get various results of the measurement. In that case, the rest of the matrix \(L\) is important as well, for the detailed probabilistic distribution of various values. To summarize, the operator \(L\) is decomposed as
\[ L = \left(\begin{array}{cc} \langle L\rangle & {\rm but}\\{\rm but}^\dagger&{\rm dib} \end{array}\right) \] where \({\rm but}\) is a shorter row and \({\rm dib}={\rm dib}^\dagger\) is a smaller matrix containing the "details if but". This matrix is only relevant for your prediction if there are some "buts", i.e. if \({\rm but}\) is nonzero. If \({\rm but}=0\), then
\[ P_{\rm reality}=1 \Rightarrow L = P_{\rm reality} L P_{\rm reality} = \langle L \rangle. \] Nonzero "buts" prevent us from making this simplification.
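The "no buts" claim is easy to verify numerically (a toy numpy example with a made-up observable of mine, not from the text): if the first row and column of \(L\) vanish outside the corner, then every moment \(\langle L^k\rangle\) equals \((L_{11})^k\), i.e. the value \(L_{11}\) is certain, no matter what the "dib" block contains.

```python
import numpy as np

# Hypothetical observable with no "buts": first row/column vanish off the corner
L = np.diag([3.0, 7.0, -2.0])
L[1, 2] = L[2, 1] = 5.0               # the "dib" block may be any Hermitian junk

psi = np.array([1.0, 0.0, 0.0])       # the fixed Heisenberg-picture state

# With but = 0, every moment <L^k> equals (L_11)^k: the value 3 is certain
for k in range(1, 5):
    assert np.isclose(psi @ np.linalg.matrix_power(L, k) @ psi, 3.0**k)
```

Because the block-diagonal structure survives taking powers, no probability can leak away from the eigenvalue \(L_{11}=3\).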

I hope you're still eagerly and patiently waiting for the specific form of the operators \(\sigma_1\) and \(\sigma_2\) before the measurement of the entangled electron spins.

I want to stress one thing. Up to a unitary transformation – by a matrix \(U\) that is shared by all the operators – all the operators have exactly the same form as they always do. So it means that \(\sigma_1\), much like \(\sigma_2\), has a doubly degenerate eigenvalue \(+1\) and a doubly degenerate eigenvalue \(-1\). Moreover, all 4 possibilities must still be allowed which means that \(\sigma_1\sigma_2\) must have the eigenvalue \(+1\) twice and the eigenvalue \(-1\) twice as well. In particular, the operator equation \(\sigma_1=\sigma_2\) or \(\sigma_1=-\sigma_2\) can never hold.

How do we find a good form of the operators \(\sigma_1\) and \(\sigma_2\)? We must first decide what is our basis of the 4-dimensional Hilbert space. (That's only needed because I want to write an explicit form of the matrices; this is usually not needed to derive the physically testable predictions because they don't depend on such conventions.) It has already been decided that the first column has to be the actual singlet state of our two-electron system. We complete it to a full basis of the four-dimensional Hilbert space, for example in this way:
\[ \begin{align} \ket{e_1} &=\frac{ \ket{{+-}} - \ket{{-+}} }{\sqrt{2}} \\ \ket{e_2} &= \frac{ \ket{{+-}} + \ket{{-+}} }{\sqrt{2}} \\ \ket{e_3} & =\frac{ \ket{{++}} - \ket{{--}} }{\sqrt{2}} \\ \ket{e_4} &= \frac{ \ket{{++}} + \ket{{--}} }{\sqrt{2}} \end{align} \] With these equations, the MathJax implementation of \(\LaTeX\) on this blog starts to be a little bit meaningful. The operators \(\sigma_1,\sigma_2\) have simply evolved to the self-evident matrices expressing the operators relative to this basis. We have
\[ \sigma_1 \left\{ \ket{e_1}, \ket{e_2}, \ket{e_3},\ket{e_4} \right\} = \left\{ \ket{e_2}, \ket{e_1}, \ket{e_4},\ket{e_3} \right\} \] I hope you understand that this is a shortcut for four equations. When acting on the basis vectors, the first term (in the definition of each basis vector) was always invariant while the other one has always switched the sign, resulting in the permutations of the 1st-and-2nd and 3rd-and-4th basis vectors. Similarly,
\[ \sigma_2 \left\{ \ket{e_1}, \ket{e_2}, \ket{e_3},\ket{e_4} \right\} = \left\{ -\ket{e_2}, -\ket{e_1}, \ket{e_4},\ket{e_3} \right\} \] That only differs by two sign flips from the action of \(\sigma_1\) on the basis vectors. The action of \(\sigma_1,\sigma_2\) on the basis vectors is easily translated to a specific form of these matrices. We have
\[ \sigma_1 = \left( \begin{array}{rrrr} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{array} \right), \quad \sigma_2 = \left( \begin{array}{rrrr} 0&-1&0&0\\ -1&0&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{array} \right) \] Note that these two matrices are not equal to each other; they're not equal to minus each other, either. As I have said, each of them has the doubly degenerate eigenvalue \(+1\) and doubly degenerate \(-1\) and the same thing holds for their product. They commute with each other; they have to commute because they're associated with properties of different (and, ideally, spatially separated) regions of the Universe.
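You may rederive these matrices mechanically (a numpy sketch of mine; the construction follows the basis \(\ket{e_1},\dots,\ket{e_4}\) defined above): write \(\sigma_1=\sigma_z\otimes 1\) and \(\sigma_2=1\otimes\sigma_z\) in the product basis and conjugate by the real orthogonal basis-change matrix.

```python
import numpy as np

sz = np.diag([1.0, -1.0])
I2 = np.eye(2)

# sigma_1 = 2 J_z of the left electron, sigma_2 of the right one,
# written first in the product basis |++>, |+->, |-+>, |-->
s1_prod = np.kron(sz, I2)
s2_prod = np.kron(I2, sz)

# The basis |e_1>..|e_4> from the text, as columns in the product basis
r = 1 / np.sqrt(2)
pp, pm, mp, mm = np.eye(4)            # |++>, |+->, |-+>, |-->
B = np.column_stack([r*(pm - mp), r*(pm + mp), r*(pp - mm), r*(pp + mm)])

sigma1 = B.T @ s1_prod @ B            # change of basis (B is real orthogonal)
sigma2 = B.T @ s2_prod @ B

assert np.allclose(sigma1, [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
assert np.allclose(sigma2, [[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
assert np.allclose(sigma1 @ sigma2, sigma2 @ sigma1)   # they commute, as they must
```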

You may literally imagine that you evolve the matrices in time, according to Heisenberg's equations, and this is what you end up with right before the measurement. Here, \(\sigma_1\) is purely associated with the left electron; \(\sigma_2\) with the right electron. How do we use these matrices to make predictions?

First of all, you may see that the left upper entry of both matrices is \(0\). That's the expectation value of these two observables. Because the allowed eigenvalues are \(\pm 1\), a vanishing expectation value implies that both possible outcomes are equally likely. Both \(+1\) and \(-1\) have probability 50% for the left electron; and the same thing holds for the right electron.

The rest of the first row (and, because of Hermiticity, also the rest of the first column) is nonzero for both matrices. That means that there are "buts". The value \(0\) isn't what you're guaranteed to get; in fact, it isn't even an a priori allowed eigenvalue. Instead, you get other numbers and \(0\) is just the expectation value. You should appreciate that what you get in individual cases is up to Nature's random generator. Because the probability distribution for different values of \(\sigma_1\) can be fully calculated from the expectation values of the powers \(\sigma_1^k\) for all integer \(k\) – that would be true even if the odds were asymmetric and if many eigenvalues existed a priori – it is very clear that the odds of various outcomes of the measurement of \(\sigma_1\) don't depend on anything that happens in the lab that measures \(\sigma_2\). There's clearly no signal propagating. You may consistently be interested in the properties of the first electron only – and then the values of the matrices such as \(\sigma_2\) associated with a distant region can't possibly make any difference.

However, that doesn't mean that correlations don't exist. Indeed, you may also study the correlation between the two measured spins. It is fully encoded in the observable \(\sigma_1\sigma_2=\sigma_2\sigma_1\); if this quantity is equal to \(+1\), the signs of \(\sigma_1,\sigma_2\) are the same; if this quantity is equal to \(-1\), the signs of \(\sigma_1,\sigma_2\) are opposite. I have already mentioned that these two matrices commute and have to commute; Heisenberg's equations generate just rotations of the operators induced from rotations of the Hilbert space, so they can't rotate a vanishing operator to a non-vanishing one.

It's useful to write down what the product actually is:
\[ \sigma_1\sigma_2=\sigma_2\sigma_1 = \left( \begin{array}{rrrr} -1&0&0&0\\ 0&-1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{array} \right) \] It's neither the unit matrix nor minus the unit matrix. But you see that the left upper corner says \(-1\), so the expectation value is \(-1\). There is a perfect anticorrelation between the measured quantities. If things commuted, then we could neglect the rest of the matrices and the upper left entry of \(\sigma_1\sigma_2\) would be the product \(0\times 0 = 0\) which would imply that the correlation is gone (as Einstein thought). However, quantum mechanics actually does force you to work with the rest of the matrices and because the rest of the first row/column is nonzero both for \(\sigma_1\) and \(\sigma_2\), i.e. they have "nonzero matrix elements between the reality and (three) fiction(s)", the left upper matrix entry of the product is more complicated and in this case, it's actually \(-1\), "perfect anticorrelation".

At the same time, the rest of the first column (and the first row) vanishes. That means that there are no "buts". The left upper entry \(-1\) is the guaranteed result of the combined measurement. The two spins are perfectly anticorrelated. The same outcome would hold for measurements with respect to any axis, as long as we chose the same axis for both electrons. Because there are no "buts" in the first column/row, we don't really need the rest of the matrix away from the left upper entry.
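Putting it together (again a numpy sketch, with the matrices copied from above): the individual spins have expectation value \(0\) with nonzero "buts", while their product is \({\rm diag}(-1,-1,1,1)\), so the \(-1\) in the corner comes with no "buts" at all and the perfect anticorrelation is certain.

```python
import numpy as np

sigma1 = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
sigma2 = np.array([[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)

prod = sigma1 @ sigma2
assert np.allclose(prod, sigma2 @ sigma1)              # spacelike separated: they commute
assert np.allclose(prod, np.diag([-1.0, -1, 1, 1]))

# Each spin alone: expectation 0, but nonzero "buts" in the first row/column
assert sigma1[0, 0] == 0 and np.any(sigma1[0, 1:] != 0)

# The product: expectation -1 with *no* "buts", so the anticorrelation is guaranteed
assert prod[0, 0] == -1 and np.all(prod[0, 1:] == 0) and np.all(prod[1:, 0] == 0)
```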


As you can see, the Heisenberg picture may be viewed as a method to actually predict the measured value of any observable from the left upper entry of the corresponding matrix, the "reality-reality matrix element". However, things are only unequivocal if the rest of the first column/row vanishes, i.e. if the "reality-fiction matrix elements" are zero. If that's so, the expectation value \(\langle f(L)\rangle\) of any function of the observable will still be given by \(f(L_{11})\); the rest of the matrix is separated off in a block-diagonal form and will never contaminate the upper left entry.

In a generic situation, however, the rest of the first column/row will be nonzero. In that case, the left upper entry is just the expectation value of the observable and the full matrices may influence the reconstruction of the probabilities of all conceivable outcomes. Whenever a set of operators mutually commutes, you may apply classical logic to all allowed configurations of their eigenvalues. Each of the configurations either is reality or isn't reality, you may assume. However, those outcomes can't be predicted out of the known values of another set of operators that doesn't commute with the first set. In those cases, probabilities are the only thing that is calculable.

And that's the memo.

Google reminded its Czech users that painter and illustrator Josef Lada was born on December 17th, 1887. He has produced an impressive number of pictures, well beyond the illustrations for the good soldier Švejk and tons of fairy tales. He's connected with the typical spirit of the Czech countryside. See Google images.
