Monday, January 20, 2014

Eta function and the sum of positive integers

In the previous articles, I described various calculations of the sum of positive integers and discussed the context in which it is right to say that the sum is equal to \(-1/12\). Here, I want to sketch a justification of the value that is perhaps more morally justifiable than (even) the zeta-function regularization and other regularizations – a derivation that is rooted in the symmetry transformations of some physical quantities.

The derivation will be based on the properties of \(\vartheta\)-functions (vartheta functions), especially the Dedekind \(\eta\)-function (eta function). I recommend some mathematics pages about these functions and the identities they obey, or some string theory textbook, e.g. Joe Polchinski's "String Theory", especially pages 209-216 of Volume I.

It's a good idea for the reader to reread the blog post on the shapes of the torus and the modular group. We will be interested in the partition sum of a massless free field – e.g. the bosonic Klein-Gordon field – computed on a torus whose shape is given by the complex number \(\tau\). It means that it is a torus obtained by the identifications of the points in the complex plane\[

0 \equiv 2\pi \equiv 2\pi\tau \equiv 2\pi m + 2\pi \tau n.

\] The simplest square-shaped torus has \(\tau=i\). We will see that the summation of the zero-point energies – which is proportional to the sum of positive integers – has to be set equal to \(-1/12\) for the most self-evident symmetries of our definition of the torus to be preserved (or at least maximally preserved).

Fine. So the Euclidean action of the Klein-Gordon field \(X\) on the two-dimensional Euclidean manifold is something like\[

S = \int \dd^2 \sigma \zzav{ \frac 12 (\partial_1 X)^2 + \frac 12 (\partial_2 X)^2 }

\] I will be sloppy about some normalization factors, especially those that are not important for the key result we want to understand and that could be distracting, but if you want to seriously recheck everything, you should try to focus on them, too. You may also imagine that the action contains some counterterms that are "needed" for the finiteness, so the full action may be thought of as \[

S_{\rm full} = S + S_0^{\text{counterterms needed to make the theory nice}}

\] In that case, we are computing the sum \[

1+2+3+4+5+\dots + C_\text{counterterms to make the sum nice}

\] and this sum is equal to \(-1/12\). You may imagine that the counterterms are "infinite" i.e. "nonzero" but their being "morally zero" is reflected by the fact that even if you assume that they're infinite, they're still independent of all the independent variables yet dimensionful – and among the finite numbers, only zero has this property.
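The role of the counterterm can be made concrete with a smooth cutoff. The sketch below is only an illustration under stated assumptions: the exponential damping factor \(e^{-\epsilon n}\), the helper names `damped_sum` and `finite_part`, and the truncation at `terms` are choices made here, not something the argument above depends on. With this regulator the sum behaves as \(1/\epsilon^2 - 1/12 + O(\epsilon^2)\), so removing the \(1/\epsilon^2\) piece, which plays the part of the counterterm, should leave a number close to \(-1/12\):

```python
import math

def damped_sum(eps, terms=200000):
    """Sum of n*exp(-eps*n) for n = 1..terms (terms is an illustrative cutoff)."""
    return math.fsum(n * math.exp(-eps * n) for n in range(1, terms + 1))

def finite_part(eps):
    """Regulated sum minus the divergent 1/eps**2 piece (the 'counterterm')."""
    return damped_sum(eps) - 1.0 / eps**2

# The finite part should approach -1/12 = -0.08333... as eps shrinks.
for eps in (0.1, 0.05, 0.01):
    print(eps, finite_part(eps))
```

The subtracted piece depends on the cutoff \(\epsilon\) only, which is the sense in which it carries no information about the physical variables.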

Fine. Place the simple theory above on the torus whose shape is given by \(\tau\) as explained above and try to calculate the partition sum via the Euclidean Feynman path integral\[

Z (\tau) = \int {\mathcal D}X(\sigma^1,\sigma^2)\,\exp(-S).

\] You may calculate this partition sum in the operator formalism. Well, this Euclidean path integral with the periodic time coordinate \(\sigma^2\) is nothing else than the "thermal partition function" which may be computed as\[

Z(\tau) ={\rm Tr}\zzav{ \exp(2\pi i \tau_1 P -2\pi \tau_2 H) }

\] We're computing the trace (because of the periodic identification of the Euclidean time) of the time-evolution operator by the correct time \(\tau_2\) which is achieved by exponentiating the Hamiltonian \(H\) – but we must also insert the spatial translation by \(\tau_1\) which is achieved by the exponential of the multiple of \(P\). The insertions of \(2\pi\) are conventions that are widespread in string theory and they're not important for the main result.

The Hamiltonian is one for the spatial section of the physical system, e.g. \(\sigma^2={\rm const}\), and this slice is nothing else than the closed string (closed because of the spatial periodicity in \(\sigma^1\)). And if you expand \(X(\sigma^1)\) into Fourier modes, you will see that the Hamiltonian derived from the action or Lagrangian above which is just the usual Hamiltonian for a string, \[

\int \dd \sigma^1 \zav{ \frac 12 P(\sigma^1)^2+ \frac 12 X'(\sigma^1)^2 }

\] is equivalent to the sum of infinitely many harmonic oscillators labeled by the positive integer \(n\):\[

H = \sum_{n=1}^\infty \, n (a^\dagger_n a_n+1/2)

\] More precisely, you will get two oscillators for each positive \(n\), one describing the left-moving modes and one for right-moving modes, and the partition sum also involves the integral over the zero mode of \(X\) that "couples" the left-movers and right-movers in some way. However, the dependence on \(\tau\) due to the zero modes must be "simple" (at most a power law in \(\tau\), I don't want to go into it) and the partition sum over the remaining degrees of freedom is inevitably factorized into the left-moving and right-moving part and both must obey the symmetry pretty much separately.

I have normalized the Hamiltonian for the oscillator labeled by the positive integer \(n\) so that the spacing of the energy eigenvalues is \(n\). It is surely proportional to \(n\), as you can see from the Fourier expansion, and each "totally overall" normalization of the Hamiltonian is equally good for calculating the sum of positive integers.

What is the partition sum of all the left-moving degrees of freedom (harmonic oscillators) labeled by positive integers \(n\)? For each value of \(n\), we get a harmonic oscillator whose partition sum is the geometric series\[

1+\exp(-\beta E_{\rm spacing}) + \exp(-2\beta E_{\rm spacing})+\dots =\\
= \frac{1}{ 1-\exp( -\beta E_{\rm spacing} ) }=\dots

\] and we see that \(\beta\sim \tau_2\) and \(E_{\rm spacing}=n\) with some normalizations here. Well, the separation into the left-movers and right-movers actually disentangles factors that (holomorphically) depend on \(\tau\) and those that depend on the complex conjugate \(\bar \tau\). The relevant partition sum of one of the left-moving harmonic oscillators – the geometric series above – is\[

\dots = \frac{1}{1 - q^n} \rightarrow \dots

\] where \(q=\exp(2\pi i \tau)\). This \(q\) is still the same thing as the \(\exp(-\beta E_{\rm spacing})\) that you know from more general, low-brow conventions and notations. It's important that this \(q\) exponentially depends on the periodicity (Euclidean time) \(\tau_2\).
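As a quick sanity check of the geometric series for a single oscillator, here is a minimal sketch. The function names, the number of levels kept, and the test value of \(\tau\) (borrowed from the P.S. of this post) are illustrative assumptions:

```python
import cmath

def q_of(tau):
    """q = exp(2*pi*i*tau) for a point tau in the upper half-plane."""
    return cmath.exp(2j * cmath.pi * tau)

def oscillator_sum(tau, n, levels=60):
    """Truncated sum over occupation numbers: 1 + q^n + q^(2n) + ..."""
    x = q_of(tau) ** n
    total, term = 0j, 1 + 0j
    for _ in range(levels):
        total += term
        term *= x  # next power of q^n
    return total

def oscillator_closed_form(tau, n):
    """Closed form 1/(1 - q^n) of the geometric series."""
    return 1 / (1 - q_of(tau) ** n)

tau = -0.27 + 1.13j
for n in (1, 2, 3):
    print(n, oscillator_sum(tau, n), oscillator_closed_form(tau, n))
```

Because \(|q|=\exp(-2\pi\tau_2)<1\) whenever \(\tau_2>0\), a few dozen levels already reproduce the closed form to machine precision at this value of \(\tau\).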

We should also add the zero-point (ground state) energy of each harmonic oscillator, which is \(n/2\); this means the following fix of the previous result\[

\dots \rightarrow \frac{q^{n/2}}{1-q^n}

\] Now, the partition sum of all the harmonic oscillators is simply the product of the partition sums of each oscillator (because they're independent physical systems and the partition sum is multiplicative because the action is additive):\[

Z(\tau) = \prod_{n=1}^\infty \frac{q^{n/2}}{1-q^n} = q^{(1+2+3+\dots)/2} \prod_{n=1}^\infty \frac{1}{1-q^n}

\] That's great because the sum of integers (one-half of it) appeared in the exponent of \(q\). The remaining infinite product is clearly well-defined e.g. if \(\tau\to i\infty\), i.e. if the imaginary part of \(\tau\) is very large and positive, because \(q=\exp(2\pi i \tau)\) is tiny in that case and we essentially deal with the product of numbers that are mostly "very close" to one and such a product is well-defined.
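The convergence of the infinite product is easy to see numerically. The following sketch (the name `partial_product` and the choice \(\tau=i\) are assumptions made for illustration) truncates the product after a given number of factors:

```python
import cmath

def partial_product(tau, terms):
    """prod_{n=1}^{terms} 1/(1 - q^n) with q = exp(2*pi*i*tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    result, qn = 1 + 0j, 1 + 0j
    for _ in range(terms):
        qn *= q           # qn is now q^n
        result /= 1 - qn  # multiply by the factor 1/(1 - q^n)
    return result

tau = 1j  # square torus, so q = exp(-2*pi), roughly 0.00187
for terms in (1, 2, 5, 50):
    print(terms, partial_product(tau, terms))
```

For the square torus the factors approach one so quickly that the product stops changing, within double precision, after a handful of factors. All the trouble therefore sits in the prefactor \(q^{(1+2+3+\dots)/2}\).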

I want to assure you that if you were more careful about some normalization factors and/or if you used other normalization conventions, there would be various factors in the equation defining \(Z(\tau)\) above but it would still be true that a multiple of \((1+2+3+4+\dots)\) would appear in an exponent. OK? It's important that we haven't screwed the relative normalization of the exponents. If the denominator in the infinite product contains \((1-q^n)\), then it is right that we have the prefactor \(q^{(1+2+3+\dots)/2}\) because the zero-point energy of the harmonic oscillator is \(\hbar\omega/2 = E_{\rm spacing}/2\).

Again, let me remind you that we have only dealt with the holomorphic dependence on \(\tau\) because we were calculating the partition sum from the left-moving quantized waves on the string only. The whole partition function is a product of left-movers and right-movers with some "simple adjusting factors" from the zero modes that "couple" the left-movers with the right-movers.

Great. Note that the infinite product we have here is known as the Euler function:\[

\phi(q)=\prod_{k=1}^\infty (1-q^k).

\] It is no coincidence that Euler managed to play with similar functions centuries ago – the same Euler who has found another, simple method to prove that the sum of positive integers equals \(-1/12\). This early string theorist has understood very many things, indeed. How do we determine the sum of positive integers?

The key observation is that the partition sum for the "inverse" shape of the torus, \[

\tau \to -\frac{1}{\tau},

\] i.e. the torus obtained from the original one by exchanging the two sides of the defining rhomboid or rotating the rhomboid by roughly 90 degrees, if you wish, should be "pretty much" the same thing, i.e.\[

Z(\tau)\sim Z(-1/ \tau)

\] I didn't quite write "equals" because there may be some need for "simple" normalization factors because the zero modes of \(X\) have to be treated differently in the original and rotated torus. However, the ratio \(Z(\tau)/Z(-1/\tau)\) shouldn't be "complicated" – it shouldn't contain any functions that exponentially depend on \(\tau\) or that depend on \(\tau\) even in more complex ways. At most, the ratio may be a power law in \(\tau\).

A two-dimensional torus may be obtained from a rectangle (or a square) if we identify the upper edge with the lower one; and the left edge with the right edge. So much like in PacMan-like games, you reappear on the opposite side of the screen if you try to escape from it ("periodic boundary conditions"). The first identification turns the rectangle into a cylinder; the second one bends it into a doughnut. In the usual \(Z(\tau)\), we are interpreting the direction B as "time"; but we may also calculate \(Z(-1/\tau)\) from the same rectangle/torus in which case the cycle A is the "time" and the roles of the cycles A,B of the "otherwise identical torus" are interchanged. Naively, you would think that \(Z(\tau)=Z(-1/\tau)\) because both partition sums are path integrals ("sums") of \(\exp(-S)\) over the "same" set of configurations. However, this relationship only holds if the zero-point energies are correctly summed.

OK, I am going to tell you what is the right way to obey the condition above. If we agree that the right value is\[

1+2+3+4+5+\dots = -\frac{1}{12},

\] then we deduce\[

Z(\tau) =\eta(\tau)^{-1} := q^{-1/24} \prod_{n=1}^{\infty} \frac{1}{1-q^n}

\] and this "Dedekind eta function" obeys\[

\eta(-1/\tau ) = (-i\tau)^{1/2} \eta(\tau)

\] which expresses the sort of "a simple enough relationship" we were trying to get – a very similar relationship will hold for \(Z(\tau)\), too. The exponent of \(q\) in the definition of \(\eta(\tau)\) – we wrote the inverse of this function above – is exactly equal to \(1/24\) which is \(-1/2\) times the sum of positive integers.
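For completeness, the corresponding statement for the partition sum follows in one line from \(Z(\tau)=\eta(\tau)^{-1}\) and the identity for \(\eta\):\[

Z(-1/\tau) = \frac{1}{\eta(-1/\tau)} = \frac{1}{(-i\tau)^{1/2}\,\eta(\tau)} = (-i\tau)^{-1/2}\, Z(\tau).

\] For the square torus, \(\tau=i\), the prefactor is \((-i\cdot i)^{-1/2}=1\) and \(-1/\tau=i\) as well, so both sides trivially agree, as they must for a torus that is mapped to itself.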

The Dedekind eta function is very pretty. The real part of \(\eta(\tau)^{24}\) is depicted by colors in the \(q=\exp(2\pi i \tau)\) plane in the picture above.

If we decided that the sum of positive integers is something other than \(-1/12\), our partition function would be \(z(\tau)\) with \[

Z(\tau) = z(\tau) q^K

\] However, with this substitution, the formula for \(z(\tau)/z(-1/\tau)\) would get extra factors of the kind \(q^K\) as well as \(\exp(-2\pi i K/ \tau)\) – very complicated functions, especially the second one – and such complicated functions in the ratio couldn't be explained by the different treatment of the zero modes. The naively unquestionable "switching the two sides" symmetry of the torus would be entirely broken. Only if you define the sum of positive integers (or the sum of positive integers plus the only meaningful "curing" counterterm in physics, if you wish) as \(-1/12\), the partition sum has the symmetry it should have.
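This can be probed numerically. The following is a minimal sketch under stated assumptions: `z_shifted` is a hypothetical partition sum whose exponent of \(q\) is moved away from \(-1/24\) by a shift `K` (the sign convention of `K` is a choice made here), and the product is truncated at `terms` factors. The ratio \(z(\tau)/z(-1/\tau)\) is divided by the allowed power law \((-i\tau)^{1/2}\), so a result of 1 means the symmetry holds up to that power law:

```python
import cmath

def euler_product(tau, terms=200):
    """Truncated Euler function prod_{n>=1} (1 - q^n), q = exp(2*pi*i*tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    result, qn = 1 + 0j, 1 + 0j
    for _ in range(terms):
        qn *= q
        result *= 1 - qn
    return result

def z_shifted(tau, K=0.0):
    """Hypothetical partition sum q^(-1/24 + K) / prod(1 - q^n); K = 0 is 1/eta."""
    prefactor = cmath.exp(2j * cmath.pi * tau * (-1.0 / 24.0 + K))
    return prefactor / euler_product(tau)

def normalized_ratio(tau, K=0.0):
    """z(tau)/z(-1/tau) divided by the allowed power law (-i*tau)^(1/2)."""
    return z_shifted(tau, K) / z_shifted(-1 / tau, K) / cmath.sqrt(-1j * tau)

tau = -0.27 + 1.13j
for K in (0.0, 0.1):
    print(K, normalized_ratio(tau, K))
```

With `K = 0` the normalized ratio should come out as 1 up to rounding. With any other `K` it equals \(\exp[2\pi i K(\tau+1/\tau)]\), which is exactly the kind of exponential dependence on \(\tau\) and \(1/\tau\) that the zero modes cannot account for.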

I didn't make the point of the symmetry under the modular group sufficiently clear above – the text focused on the functions and mathematical identities they obey. But let me emphasize the obvious point that the classical dynamics – and classical action – is completely identical for the Klein-Gordon field on the two tori because they're really the same torus (up to an overall scaling but the theory is scale-invariant, anyway). The very fact that we may get \[

Z(\tau)\neq Z(-1/\tau)

\] implies that there are "anomalies" – that the naive symmetry of the partition sum that would hold in classical physics is "broken" as soon as we switch to the quantum version of the dynamics. Sometimes, the anomalies are there whatever you do. For example, if you omitted the quarks but not leptons (or vice versa) from the Standard Model, the theory would be anomalous and the gauge symmetries that seem to be symmetries of the classical action would be inevitably broken.

However, the symmetries may often be preserved by the quantum theory – there exists a theory obeying the postulates of quantum mechanics that has the desired classical limit and that also preserves many/all symmetries of the classical theory. Whenever some symmetries may be preserved, we should try to preserve them and consider the quantum theory that maximizes the symmetries (or their maximum subset) to be "the" quantum counterpart of the classical theory. After all, the unbroken gauge symmetries are vital for any quantum field theory with gauge fields to remain consistent (the gauge symmetry is needed to segregate the ill negative-norm time-like polarizations of the gauge fields). The modular group in string theory (the symmetry involving the rotations of the torus) is equally essential for the consistency of string theory as it prevents us from an infinite multiple counting and UV divergences in string theory.

So this symmetry under \(\tau\to -1/\tau\) must hold in the consistent quantum theory as well and the assignment of the value \[

1+2+3+4+5+\dots = -\frac{1}{12}

\] is an essential part of the choices we must make for the theory to have a chance to be consistent. When other things are arranged correctly so that other anomalies (e.g. conformal anomaly) cancel as well, we may prove that we get a fully consistent theory, and that's what we mean by string theory. In the fully perturbatively consistent theory, the total spacetime dimension is fully dictated, e.g. \(D=26\) for bosonic string theory, and the spacetime equations of motion (e.g. Einstein-Maxwell-Klein-Gordon-Dirac coupled field equations) have to be obeyed as well.

We may calculate many more sums of this kind. The sums of fixed powers of integers (or integers shifted by a fixed shift) with or without alternating signs make up a group of expressions that have unique, finite values. I must mention the only exception. The sum\[

1+ \frac 12 + \frac 13 + \frac 14 + \dots = \infty

\] is really divergent, in the same sense as the ratio \(1/0\). This sum is divergent even though the partial sums diverge "just" logarithmically, i.e. very slowly. This sum is equal to \(\zeta(1)\) and the Riemann zeta function \(\zeta(s)\) only has a single singularity in the whole complex plane which happens to sit at \(s=1\). The power-law divergences look worse than the logarithmic ones but if you look at the situation with some wisdom, the logarithmic divergences actually are more divergent than the power-law ones (a fact that has a clear meaning e.g. in the dimensional regularization).
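The slow, logarithmic growth of the partial sums is easy to exhibit. In this small sketch the function name `harmonic` and the sample sizes are illustrative choices; the only outside input is the known value of the Euler-Mascheroni constant, which is what the difference \(H_N-\ln N\) tends to:

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def harmonic(n_terms):
    """Partial sum 1 + 1/2 + ... + 1/n_terms."""
    return math.fsum(1.0 / k for k in range(1, n_terms + 1))

# The partial sums keep growing, but only like ln(N) + EULER_GAMMA.
for n_terms in (10, 1000, 100000):
    h = harmonic(n_terms)
    print(n_terms, h, h - math.log(n_terms))
```

No finite constant can be split off from \(\ln N\) in a cutoff-independent way, which is one way to see why this particular sum has no natural finite value.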

The right value of the sum of integers may be derived from the partition function for free bosons, free fermions, various periodicities, and so on. It's always \(-1/12\). This value implies that the critical dimension of bosonic string theory is equal to\[

D = 2 - \frac{2}{1+2+3+\dots} = 26

\] and similar calculations combining several terms of this sort also imply that \(D=10\) for the superstring. We may say that aside from the critical dimensions, all the mathematical patterns of perturbative stringy expressions that seem to have factors \(\pm 1/12\) or \(\pm 1/24\) or \(\pm 1/48\) (and there are very many such factors in the stringy expressions!) pretty much arise from the same profound fact – the sum of positive integers should be assigned the value \(-1/12\). However, this profound truth reigns not only in string theory but in any theory where some free fields periodically depend on two dimensions. That's why one may verify that the sum equals \(-1/12\) even in QED, by measuring the Casimir force between two plates. It's really an important insight in all of physics and all approaches to the mathematics of functions that want to respect the same kind of "deep mathematical wisdom and elegance" that is exhibited by Nature through quantum field theory and string theory.
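The arithmetic behind the critical dimension quoted above, and behind the exponent \(-1/24\) of \(q\) in \(Z(\tau)\), can be spelled out with exact fractions. This is only a bookkeeping sketch; the variable names are made up for the illustration:

```python
from fractions import Fraction

# Value assigned to 1 + 2 + 3 + ... in the text.
SUM_OF_INTEGERS = Fraction(-1, 12)

# Exponent of q in Z(tau): one-half of the sum (zero-point energies n/2).
zero_point_exponent = SUM_OF_INTEGERS / 2

# Critical dimension of the bosonic string, D = 2 - 2/(1+2+3+...).
critical_dimension = 2 - 2 / SUM_OF_INTEGERS

print(zero_point_exponent)  # -1/24
print(critical_dimension)   # 26
```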

With some mundane definitions of infinite sums etc., you may end up with the conclusion that the sum diverges and isn't terribly interesting or structured. You may pick your axioms in any way you want. But it's important to point out that if you don't understand why the value \(-1/12\) is the most natural one to be assigned to the sum, you're overlooking lots of fundamental wisdom that is used in physics as well as in many mathematical identities involving some very important functions.

It's at least morally correct to say that the sum of positive integers is \(-1/12\).

And that's the memo.

P.S.: I really encourage you to use your favorite symbolic calculator and verify the identity which is obeyed by \(\eta(\tau)\). The Mathematica commands and output doing the job look like this:
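q[tau_] := Exp[2*Pi*I*tau];
(* Exp[2*Pi*I*tau/24] avoids the principal-branch ambiguity of q[tau]^(1/24) *)
eta[tau_] := Exp[2*Pi*I*tau/24]*Product[1 - q[tau]^m, {m, 1, 99}]

tautest = -0.27 + 1.13*I
-1/tautest
eta[-1/tautest]
Sqrt[-I*tautest]*eta[tautest]

-0.27 + 1.13 I
0.20003 + 0.837161 I
0.801025 + 0.0379993 I
0.801025 + 0.0379993 I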
They match perfectly. It works for any value of "tautest" in the upper half-plane. I didn't rely on any predefined advanced functions. We may also employ a "friendly cooperation between physics and mathematics" to discover and prove the identity. Because the partition function on the two related tori should be "formally equal", we may guess that a similar identity could hold "with some minor corrections" and find the corrections by numerical guesses. When we have the successfully tested conjecture, we may prove it rigorously as well, of course. The values like \(-1/12\) or \(+1/24\) are essential parts of the identities – the "corrections" of the classical intuition that quantum mechanics made unavoidable.


  1. So, let me see if I understand what's going on. You want a consistent string theory, but at various places this requires the infinite sum of all the positive integers, and the only way to get consistency is to define the sum to be -1/12.

    You then introduce various "arguments" to suggest this definition is acceptable mathematically.

    Is that it?

    I have no objection to this, but it suggests that -1/12 is an empirical constant like the acceleration due to gravity.

  2. I am no expert. After reading some passages in Zee's QFT book I think I have understood it. Basically we get the sum of integers, insert exponential damping factors such that in the limit of parameter 'a' vanishing, we recover the original series of integers. Working in this limit, we obtain the force on the 2 plates which is a finite quantity plus some higher order terms of 'a'. Then setting 'a' to 0 we get a finite answer that matches exactly with experiments. Am I right so far? The only thing that disturbs me is the fact that we had to insert the damping factors, why were they not present originally? Isn't it a defect of the theory that does not provide for the damping factor and the vanishing parameter?

  3. Dear Bob, you surely realize that the last paragraph of yours is self-evident nonsense, don't you?

    It can't be an "empirical constant" if it can be derived without any empirical input - from consistency considerations themselves.

  4. I downvoted it because I just hate feedback like yours. It's so annoying, obnoxious, idiotic. You literally boast that you haven't even tried to read my blog post.

    Instead, you reproduce one of the simplest methods - the first one fully described in this 2011 text

    and completely miss the whole depth of all of this, the diversity of methods with which the result may be derived, as well - of course - the particular method that was described in this blog entry.

    Otherwise your comment about the "defect" is totally idiotic, too. The theory - like the CFT behind perturbative string theory - *is* the collection of all the correlation functions and/or scattering amplitudes. They're completely fixed, rock-solid, determined, and no damping parameter is needed for anything like that. The damping regularization of the sum is just one method by which humans may approach the sum.

    But the sum 1+2+3+4+5+.... is equal to -1/12 even without the damping - and the correlators computed by a legit regularization technique (any, one of many choices) obey the axioms of CFT even though these correlators don't depend on any cutoffs. It's clear that you're still not getting it.

  5. kashyap vasavada, Jan 20, 2014, 7:34:00 PM

    Hi Lubos:

    These blogs on apparently “divergent series” are very educational. Now I also understand how this is useful in Casimir effect and QED. How about QCD? What I have heard so far is that in QCD at low energies, perturbation series diverges; people give up and resort to numerical lattice calculations. My question is: is there application of these ideas to QCD?

  6. Dear Kashyap, this sum of integers is relevant for QED because it's essentially a "free theory" in the relevant context - free fields, therefore simple terms like "n".

    The whole Casimir energy appears as a one-loop or semiclassical effect, kind of, so you're not summing multiloop diagrams at all.

    QCD is highly interacting so the terms are not as simple as "n" in any expansion which is why the sum of integers isn't relevant there.

    You say that QCD's perturbative expansions diverge. Right, but that's the case for any perturbative quantum field theory (including QED) as well as perturbative string theory. The divergence of the series doesn't mean that the exact function doesn't exist. It does - and the expansion is an asymptotic series. The minimal error you get by resumming the asymptotic series up to the smallest term (when it starts to diverge) is exp(-C/g) or exp(-C/g^2) or so, and of the same order as the leading nonperturbative corrections.

    This (asymptotic series) is a completely different class of questions than the resumming of a particular series and it has been discussed e.g. in

  7. Those who trust in Mathematica more than in Lubos (:-O ) or myself ( :-( ) can evaluate
    Sum[n, {n, 1, Infinity}, Regularization -> "Dirichlet"]

    in Mathematica 9 and see what happens ;-) .

  8. Oh, that. K phthalimide in cold DMF on an alpha-bromo ketone. Instead of displacing bromide it added to the carbonyl. The resulting hemiaminal alkoxide displaced bromide, closing to the epoxide. Product NMR ruined my lunch break. Ammonia/ammonium buffer added to the slightly protonated epoxide at the unhindered end. Aqueous acid workup freed the carbonyl while protecting the amine. Sometimes ya gotta dance.

    Perform a geometric Eötvös experiment testing spacetime geometry: left-handed vs. right-handed single crystal alpha-quartz test masses . Physics describes an aspirin, chemistry makes one. String theory is all lomcovák in application. You should take two, the other in gamma-glycine enantiomorphs. Newton was a blast after c and h ceased being approximations.

  9. With all due respect, I sometimes wonder if you are cooking something more than just plain epoxide kind of stuff in your lab ;-)

    Reading your stuff is somewhat frustrating activity because it contains too much jargon, conclusions that I won't get without further explanations, weird trail of thoughts etc. I bet that you have some kind of message in your comments but I'm not getting it :-( Can you express yourself in more clear fashion?

  10. Fair enough. I should probably have taken more time to understand what you wrote in this article. Indeed I will do so now. Anyway thanks for the feedback. And I do apologize for offending you. That was not the intention. I never did claim that the theory was 'defective' only that it 'seemed' defective from my limited point of view.

  11. Wow, this nice derivation of the 1+2+3+...=-1/12 is now my favorite one ;-)

    Not sure if I remember this correctly, but the partition sum business seems also to contain the nice physics justification of some mathematically obvious manipulations Lenny Susskind applied to derive the critical dimension of bosonic ST.

  12. Thanks and nice to see you here, Dilaton, I was a bit worried!

  13. As is common, a reaction failed to give a textbook intermediate but upon workup gave the same final product, which is not so common, diagramed here: