Monday, January 28, 2019

A fun simple problem in quantum computing

Before lunch, Hallyu Website provoked me to solve a neat, straightforward optimization problem in quantum mechanics that could serve as a nice exercise in QM courses.

After very many factor-of-two errors were fixed (I like Feynman's methodology to do science which starts with "guessing the right results" and it sometimes doesn't work immediately LOL), some of which made a huge impact on the result LOL, I hopefully got the correct solution at the Quantum Computing Stackexchange.

Jjbid asked:

Consider the following game:

I flip a fair coin, and depending on the outcome (either heads/tails), I'll give you one of the following states:\[

|0\rangle \text{ or } \cos(x)|0\rangle + \sin(x)|1\rangle.

\] Here, \(x\) is a known constant angle. But, I don't tell you which state I give you.

How can I describe a measurement procedure (i.e. an orthonormal qubit basis) to guess which state I'm given, while maximizing the chance of being right? Is there an optimal solution?

I've been self-studying quantum computing, and I came across this exercise. I don't really know how to even start, and I would really appreciate some help.

I think that a good strategy would be to perform an orthogonal transformation with \[

\begin{bmatrix}
\cos(x) & -\sin(\theta)\\
\sin(x) & \cos(\theta)
\end{bmatrix}.

\] Can't make much progress...



Your humble correspondent answered:

We simply translate the binary result of a qubit measurement to our guess whether it's the first state or the second, calculate the probability of success for every possible measurement of the qubit, and then find the maximum of a function of two variables (on the two-sphere).



First, something that we won't really need, the precise description of the state. The full state of the system, which depends both on superpositions as well as on a classical fair coin, may be encoded in the density matrix\[

\rho = \frac 12 \pmatrix{1&0\\0&0} + \frac 12 \pmatrix{\cos^2x &\sin x \cos x\\ \sin x\cos x & \sin^2 x}

\] where the left column and upper row correspond to the basis state "zero" and the remaining ones to "one". It's helpful to rewrite the density matrix in terms of the 4-element basis of the \(2\times 2\) matrices,\[

\rho = \frac 12+ \frac{\sin x \cos x}{2} \sigma_x + \left(\frac{\cos^2 x - \sin^2 x}{4}+\frac 14\right) \sigma_z

\] That may be written in terms of the angle \(2x\)\[

\rho = \frac 12 + \frac {\sin 2x}{4} \sigma_x + \frac{\cos 2x +1}{4} \sigma_z

\] Now, regardless of the mixed state, this is still a two-level system and all measurements on the two-dimensional Hilbert space are either trivial (measurements of a \(c\)-number) or equivalent to the measurement of the spin along an axis, i.e. measurements of \[

V = \vec n \cdot \vec \sigma

\] which is a unit 3D vector multiplied by the vector of Pauli matrices. OK, what happens if we measure \(V\)? The eigenvalues of \(V\) are plus one or minus one. The probability of each may be obtained from the expectation value of \(V\) which is\[

\langle V \rangle = {\rm Tr} (V \rho)

\] The traces of products only contribute if \(1\) meets \(1\) (but we assume there was no term proportional to \(1\) in \(V\)) or \(\sigma_x\) meets \(\sigma_x\) etc., in which cases the trace of the matrix gives an extra factor of two. So we have\[

\langle V \rangle = \frac{\sin 2x}{2}n_x + \frac{\cos 2x +1}{2} n_z

\] We get the eigenvalue \(\pm 1\) with the probabilities \((1\pm\langle V \rangle) / 2\), respectively. Exactly when \(\cos x = 0\), the two initial "head and tail" states are orthogonal to one another (basically \(|0\rangle\) and \(|1\rangle\)) and we may fully discriminate them. In that simple case, to make the probabilities equal to \(0,1\), we must simply choose the measurement along the \(z\)-axis i.e. \(\vec n=(0,0,\pm 1)\); note that the overall sign of \(\vec n\) doesn't matter for the procedure, the two results just get interchanged.
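The expectation value above is easy to double-check numerically, which is a cheap way to catch factor-of-two errors. The following is only a minimal sketch in plain Python; the helper names (`rho`, `spin_operator`, `trace_of_product`, `expectation_formula`) and the sample values of \(x\) and \(\vec n\) are my own illustrative choices, not part of the original answer.

```python
import math

def rho(x):
    """Density matrix: fair mixture of |0> and cos(x)|0> + sin(x)|1>."""
    c, s = math.cos(x), math.sin(x)
    return [[0.5 + 0.5 * c * c, 0.5 * s * c],
            [0.5 * s * c,       0.5 * s * s]]

def spin_operator(n):
    """V = n . sigma for a real unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    return [[nz, complex(nx, -ny)],
            [complex(nx, ny), -nz]]

def trace_of_product(a, b):
    """Tr(a b) for two 2x2 matrices given as nested lists."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def expectation_formula(x, n):
    """Closed form from the text: (sin 2x / 2) n_x + ((cos 2x + 1) / 2) n_z."""
    nx, _, nz = n
    return 0.5 * math.sin(2 * x) * nx + 0.5 * (math.cos(2 * x) + 1) * nz

x = 0.7                 # arbitrary sample angle
n = (0.6, 0.0, 0.8)     # arbitrary unit vector in the xz-plane
direct = trace_of_product(spin_operator(n), rho(x)).real
closed = expectation_formula(x, n)
print(direct, closed)   # the two numbers should agree
```

The component \(n_y\) drops out because the density matrix is real and symmetric, which is why the closed form only contains \(n_x\) and \(n_z\).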

Now, for \(\cos x \neq 0\), the states are non-orthogonal i.e. "not mutually exclusive" in the quantum sense and we can't measure directly whether the coin was tails or heads because those possibilities were mixed in the density matrix. In fact, the density matrix contains all probabilities of all measurements, so if we could get the same density matrix by a different mixture of possible states from coin tosses, the states of the qubit would be strictly indistinguishable.

Our probability of success will be below 100% if \(\cos x\neq 0\). But the only meaningful way to use the classical bit \(V=\pm 1\) from the measurement is to directly translate it to our guess about the initial state. Without loss of generality, our translation may be chosen to be\[

(V = +1) \to |i\rangle = |0\rangle

\] and \[

(V = -1) \to |i\rangle = \cos x |0\rangle + \sin x |1 \rangle.

\] If we wanted the opposite, cross-identification of the heads-tails and the signs of \(V\), we could simply achieve it by flipping the overall sign of \(\vec n \to -\vec n\).

Let's call the first simple initial state "heads" (the zero) and the second harder one "tails" (the cosine-sine superposition). The probability of success is, given our translation from \(+1\) to heads and \(-1\) to tails,\[

P_{\rm success} = P(H) P(+1|H) + P(T) P(-1|T).

\] Because it's a fair coin, the two prior probabilities above are \(P(H)=P(T)=1/2\). The most difficult calculation among the remaining probabilities is \(P(-1|T)\). But we have already made a harder calculation above: it was \((1-\langle V\rangle) / 2\). For the pure "tails" state, we just omit the \(x\)-independent term proportional to \(n_z\) in \(\langle V\rangle\) (it came from the "heads" half of the mixture) and multiply the rest by two:\[

P(-1|T ) = \frac 12 - \sin 2x \frac{ n_x}2 - \cos 2x \frac{ n_z}2

\] The result for "heads" is simply obtained by setting \(x=0\) because the "heads" state equals the "tails" state with \(x=0\) substituted. So \[

P(-1|H) = \frac{1-n_z}{2}

\] and the complementary \(1-P\) probability is\[

P(+1|H) = \frac{1+n_z}{2}

\] Substitute those results into our "success probability" to get\[

P_{\rm success} = \frac{1+n_z +1 - (\sin 2x)n_x - (\cos 2x)n_z}{4}

\] or\[

P_{\rm success} = \frac 12 - \frac{n_x}4 \sin 2x + \frac{n_z}{4} (1-\cos 2x )

\] If we define \((n_x,n_y,n_z)=(-\cos \alpha,0,-\sin\alpha)\), we may also write it as\[

\begin{aligned}
P_{\rm success} &= \frac 12 +\frac{\sin(2x+\alpha)-\sin \alpha}{4} \\
&=\frac 12+\frac{\sin x \cos(x+\alpha)}{2}
\end{aligned}

\] We want to maximize that over \(\alpha\). Clearly, the maximum is for \(\cos(x+\alpha)=\pm 1\) where the sign agrees with that of \(\sin x\) i.e. \(\alpha=-x\) or \(\alpha=\pi -x \) and the value at this maximum is\[

P_{\rm success} = \frac{1+|\sin x|}{2}

\] which lies between 50% and 100%. The same angle \(x\) picks the "optimal measurement" through \(\alpha\), which encodes the vector \(\vec n\); this indicates that Jjbid's "good strategy" was the perfectly correct intuition. As you may have noticed, I am using the trigonometric identities for sines or cosines of sums, differences, and double angles all the time.
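For those who don't trust the trigonometry, the maximization over \(\alpha\) can also be done by brute force. The sketch below scans \(\alpha\) on a grid and compares the best value found with \((1+|\sin x|)/2\). The function names `p_success` and `best_alpha`, the grid size, and the sample angles are assumptions made purely for illustration.

```python
import math

def p_success(x, alpha):
    """Success probability for n = (-cos(alpha), 0, -sin(alpha))."""
    nx, nz = -math.cos(alpha), -math.sin(alpha)
    return 0.5 - 0.25 * nx * math.sin(2 * x) + 0.25 * nz * (1 - math.cos(2 * x))

def best_alpha(x, steps=20000):
    """Brute-force scan of alpha over [0, 2*pi) on a uniform grid."""
    alphas = (2 * math.pi * k / steps for k in range(steps))
    return max(alphas, key=lambda a: p_success(x, a))

for x in (0.3, 1.0, -0.5):
    a = best_alpha(x)
    # Compare the scanned maximum with the closed form (1 + |sin x|) / 2.
    print(x, a, p_success(x, a), (1 + abs(math.sin(x))) / 2)
```

Up to the grid resolution, the best \(\alpha\) found this way should coincide with \(-x\) (mod \(2\pi\)) for positive \(\sin x\) and with \(\pi - x\) for negative \(\sin x\).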

That's a nice measurement which is really quantum mechanical. We use a different measurement than that of \(\sigma_z\), i.e. the classical measurement of the bit associated with the vector \(\vec n = (0,0,\pm 1)\). Instead, we measure the spin along the axis in the \(xz\)-plane that is defined by the same nonzero angle as the angle \(x\) at the beginning, with some correct signs and shifts by multiples of \(\pi/2\).

Jjbid and others could prefer if I said what is the basis of post-measurement eigenstates but that's a very bad habit. You should always describe the measurements in terms of the actual observables i.e. matrices or operators. The eigenstates may be calculated from these observables in a straightforward way. It's more natural to talk about observables and it's usually nicer to do the algebra with them, too! Physics is about observables, not "states".



The blue, quantum success rate is superior to the purple, classical one.

Note that if you measured simply \(\sigma_z\), the classical bit, the success rate would be just \[

P_{\rm success,classical} = \frac{3-\cos 2x}{4},

\] which I got by substituting \(\vec n = (0,0,+1)\) into our general formula. It is also between 50% and 100%, but strictly smaller than our result for every \(x\) (except when both results are 50% or both are 100%). In particular, for a small \(x=0+\epsilon\), our optimal result would be Taylor-expanded as \(1/2+|x|/2\) while the non-optimum result using the classical measurement would increase above \(1/2\) more slowly, as \(1/2+x^2/2\).
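The gap between the two success rates is easy to tabulate. Here is a small sketch that evaluates both closed-form expressions at a few sample angles; the function names `p_quantum` and `p_classical` and the chosen values of \(x\) are just illustrative.

```python
import math

def p_quantum(x):
    """Optimal success rate, (1 + |sin x|) / 2."""
    return (1 + abs(math.sin(x))) / 2

def p_classical(x):
    """Success rate when measuring sigma_z, (3 - cos 2x) / 4."""
    return (3 - math.cos(2 * x)) / 4

for x in (0.01, 0.1, 0.5, 1.0, math.pi / 2):
    print(f"x={x:.2f}  quantum={p_quantum(x):.6f}  classical={p_classical(x):.6f}")
```

For the smallest angles in the list, the excess over \(1/2\) should track \(|x|/2\) in the quantum column and \(x^2/2\) in the classical one.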

This "more rapid" quantum increase of the success rate (linear, not quadratic as in classical physics) is analogous to the faster speeds of particles on their typical quantum trajectories, to repulsion of eigenstates, and other things. Quantum mechanics is also naturally capable of producing the absolute value in the results, e.g. in our \((1+|\sin x|)/2\).

For many hours, a wrong answer of mine (including a mistake in the final portions) was posted on that server, despite the fact that I had previously fixed many wrong factors of two. I had previously claimed that the optimum measurement was the simple classical one, which is not the case. The solution to this problem does require a measurement that cannot be made in classical physics (for classical bits).



Appendix

It may be useful to explicitly write the operator we measure and eigenstates, to compare the correct result to the intuition of Jjbid and others (who want to think about "states"). I wrote that we defined \((n_x,n_y,n_z)=(-\cos \alpha,0,-\sin\alpha)\) and \(\alpha=-x\) or \(\alpha=\pi-x\) for positive and negative \(\sin x\), respectively. That means\[

\vec n = (-\cos x, 0, \sin x) \cdot \epsilon(\sin x)

\] where epsilon is the sign function. Up to an arbitrary normalization factor, the eigenstates with the eigenvalues \(\pm 1\) of \[

V=\vec n \cdot \vec \sigma=\pmatrix{\sin x&- \cos x\\ -\cos x& -\sin x} \epsilon(\sin x)

\] are real and \[

\vec e_+ = \pmatrix{ +\sin(x/2+\pi/4) \\ -\cos(x/2+\pi/4) },\\
\vec e_- = \pmatrix{ +\cos(x/2+\pi/4) \\ +\sin(x/2+\pi/4) },\\

\] where these two vectors have to be interchanged for \(\sin x \leq 0\).

The factor of \(1/2\) in the angles inside the eigenvectors comes from the fact that \(V\) is a spin-one operator while the 2-component wave function is spin-1/2, so it rotates twice as slowly. An alternative explanation is that the optimum eigenvectors are "in between" the "heads" state that is rotated by zero and "tails" that is rotated by \(x\), which is why the angle is halved. This is a factor of \(1/2\) that Jjbid's intuition probably overlooked when he tried to "guess" the result too naively. He may have overlooked the need to shift the argument by \(\pi/4\), too. That shift is there because our optimum vectors are in between the angles \(0\), i.e. "heads", and \(x+\pi/2\), the latter being the vector orthogonal to "tails". Note that the sines and cosines shifted by 45 degrees may also be written as \((\sin (x/2) \pm \cos (x/2)) / \sqrt 2\).
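People who prefer "states" may want to verify the eigenvectors directly. The sketch below, restricted to the case \(\sin x > 0\), checks that \(\vec e_\pm\) are eigenvectors of \(V\) with eigenvalues \(\pm 1\) and recomputes the success probability from the Born rule. The helper names (`eigenvectors`, `apply_v`, `overlap_squared`, `success_probability`) and the sample angle are my own assumptions for the illustration.

```python
import math

def eigenvectors(x):
    """e_plus and e_minus from the text, as written for sin(x) > 0."""
    a = x / 2 + math.pi / 4
    return (math.sin(a), -math.cos(a)), (math.cos(a), math.sin(a))

def apply_v(x, vec):
    """Apply V = [[sin x, -cos x], [-cos x, -sin x]] (the sin(x) > 0 case)."""
    s, c = math.sin(x), math.cos(x)
    return (s * vec[0] - c * vec[1], -c * vec[0] - s * vec[1])

def overlap_squared(u, v):
    """Born-rule probability |<u|v>|^2 for real 2-component vectors."""
    return (u[0] * v[0] + u[1] * v[1]) ** 2

def success_probability(x):
    """Guess heads on outcome e_plus and tails on outcome e_minus."""
    e_plus, e_minus = eigenvectors(x)
    heads = (1.0, 0.0)
    tails = (math.cos(x), math.sin(x))
    return 0.5 * overlap_squared(e_plus, heads) + 0.5 * overlap_squared(e_minus, tails)

x = 0.8  # arbitrary sample angle with sin(x) > 0
e_plus, e_minus = eigenvectors(x)
print(apply_v(x, e_plus), e_plus)      # V e_plus should equal +e_plus
print(apply_v(x, e_minus), e_minus)    # V e_minus should equal -e_minus
print(success_probability(x), (1 + math.sin(x)) / 2)
```

The last line compares the Born-rule result with \((1+\sin x)/2\), so the two descriptions of the measurement, by operator and by eigenvectors, can be seen to agree.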

Note that it's more concise to describe the optimum measurement by the operator in the Pauli matrix basis than by its eigenvectors.
