Thursday, July 03, 2014

Higgs contest: the hard way to return to top 3

Now it's a good (but not stellar) moment for the Higgs ATLAS-Kaggle challenge. If you look at its leaderboard, only one minor permutation of the top 7 rankings (879 teams compete in total) has occurred in the last 7 days:

Due to a permutation of the top 2 places exactly 7 days ago, this screenshot became a bit obsolete minutes after I posted it.

And – the T.A.G. duo will surely agree – it is a small change in a good direction. ;-) And it was so hard to achieve this small change of the AMS score! What have I done?

I decided that one particular algorithm isn't good enough. It's better to write code that simulates many programmers who are programming machine-learning algorithms and who are killing the programmers who are not good enough.

So I downloaded a Windows desktop OS emulator for Nokia Lumia 520, installed VirtualBox under this Windows system, along with Ubuntu Linux. In that system, I programmed a virtual empire that I call "The Matrix String".

This string-like landscape is a very nice environment for the programmers who live there. The inhabitants have to enjoy something that looks like an exciting life to them. Otherwise, as I realized, they don't perform too well.

Of course, their ultimate job is to write down an algorithm to optimally classify the 550,000 events in the contest. But they don't really know it.

There are 220 copies of a city called Székesfehérvár – it's one of the Hungarian words I am proud to have mastered. If you have trouble with the name of the town, just call it "Stool Belgrade" which is the English translation. I am building five new copies of the town every day.

There are many T.A.G.'s hanging everywhere in the cities but I hope that they are not too important anymore! ;-) More importantly, there are numerous copies of two programmers over there. Their names are
Gábor "Neo" Melis

Morphine "Northern Lights Haze" Morpheus
They are designed to resemble the top two contestants in the contest as accurately as I could imagine them. Mr Morphine is trying to convince Gábor "Neo" Melis that he (Neo) is "the One". And make no mistake about it, I also think that Gábor Melis is "the One".

Today, in order to improve the top score by 0.006, after 10 days or so with no improvements, I had to fight against Gábor "Neo" Melis. It was tough. It seems to me that he has won again.

If I happened to win, to be eligible for the prize, I would have to reproduce the exact algorithms that generated the winning submission. So it's important to remember every motion of my hands in the fight against Melis, and so on. Weeks ago, it would have been impossible. Right now, however, it seems that I have gotten more disciplined in creating backups. So all the copies of Gábor "Neo" Melis that had to fight have code that is saved somewhere, much like the program that determined every motion of the hands in the fight above.

As usual in the morning, I have used up my limit of 5 submissions per day. But the 3.76704 submission is relatively new and opens uncharted territory, so it is remotely conceivable that there exists a very minor modification of this code that improves the score sufficiently to beat the real leaders, "Neo" and "Morphine".

"Morphine" is the current leader whose AMS score is 0.03951 above mine. It's just slightly above one percent of my score. To beat him or her (there is at least one woman in the contest, Tatiana Likhomanenko is 20th after 3 submissions only, scary!), one has to improve the score by more than one percent.

It means increasing the number (well, the total weight) of true positives \(s\) by one percent while not increasing \(b\), or decreasing the number of false positives (well, their total weight) \(b\) by two percent (because AMS is essentially \(s/\sqrt{b}\)) while not lowering \(s\), or some linear combination of these options.
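The one-percent and two-percent figures can be checked with a few lines of Python. This is only a sketch: `ams` follows the contest's published definition with the regularization term \(b_{\rm reg}=10\), `ams_simple` is the \(s/\sqrt{b}\) approximation used above, and the values of `s` and `b` are made-up illustrative weights, not numbers from any real submission.

```python
import math

def ams(s, b, b_reg=10.0):
    """Approximate median significance with the contest's regularization term."""
    return math.sqrt(2.0 * ((s + b + b_reg) * math.log(1.0 + s / (b + b_reg)) - s))

def ams_simple(s, b):
    """Leading-order approximation s / sqrt(b), valid when s is much smaller than b."""
    return s / math.sqrt(b)

# Hypothetical total weights of true and false positives (illustration only).
s, b = 700.0, 35000.0

# Relative gain from raising s by 1% at fixed b, and from lowering b by 2% at fixed s.
gain_from_s = ams_simple(1.01 * s, b) / ams_simple(s, b) - 1.0
gain_from_b = ams_simple(s, 0.98 * b) / ams_simple(s, b) - 1.0
gain_from_s_full = ams(1.01 * s, b) / ams(s, b) - 1.0
gain_from_b_full = ams(s, 0.98 * b) / ams(s, b) - 1.0

print(f"s/sqrt(b): +1% in s -> {gain_from_s:.4%}, -2% in b -> {gain_from_b:.4%}")
print(f"full AMS:  +1% in s -> {gain_from_s_full:.4%}, -2% in b -> {gain_from_b_full:.4%}")
```

In the \(s/\sqrt{b}\) approximation, the gain from \(s\) is exactly one percent and the gain from \(b\) is \(1/\sqrt{0.98}-1\), i.e. slightly more than one percent. The full formula gives similar numbers as long as \(s \ll b\).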

It may be done. Maybe.

Of course, the temporary leaderboard may be a misleading benchmark to estimate the final score, which will be calculated exactly from the 450,000 test.csv collisions that are not included among the 100,000 collisions used to calculate the preliminary leaderboard. It is plausible that the "differences between AMS of two contestants" will change by 0.1 on average (root mean square) relative to the preliminary leaderboard, so it's possible that everyone in the top 20 or 50 has a significant chance. I could do calculations and simulations that would clarify these matters but I think it's better to spend time on improving my (at least temporary) AMS score.
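For anyone who does want to play with such a simulation, here is a toy sketch of the kind of calculation I have in mind. Everything in it is an assumption made for illustration: the events, weights, and selection efficiencies are synthetic, the function `public_private_ams` is a hypothetical helper, and rescaling each subset's weights to the full sample is a guess rather than the contest's documented procedure. It also looks at a single classifier's public-minus-private AMS, not at the difference between two contestants, so it says nothing reliable about the real leaderboard.

```python
import numpy as np

def ams(s, b, b_reg=10.0):
    """AMS with the contest's regularization term b_reg = 10."""
    return np.sqrt(2.0 * ((s + b + b_reg) * np.log(1.0 + s / (b + b_reg)) - s))

def public_private_ams(weights, is_signal, selected, n_public, rng):
    """AMS on a random public subset and on the complementary private subset."""
    n = len(weights)
    public = np.zeros(n, dtype=bool)
    public[rng.choice(n, size=n_public, replace=False)] = True
    scores = []
    for mask in (public, ~public):
        scale = n / mask.sum()  # assumption: weights rescaled to the full sample
        keep = mask & selected
        w = weights[keep] * scale
        sig = is_signal[keep]
        scores.append(ams(w[sig].sum(), w[~sig].sum()))
    return scores[0], scores[1]

# Synthetic toy events (NOT the contest data): labels, weights, classifier decisions.
rng = np.random.default_rng(0)
n, n_public = 550_000, 100_000
is_signal = rng.random(n) < 0.3
weights = np.where(is_signal, 0.003, 1.0)
selected = np.where(is_signal, rng.random(n) < 0.5, rng.random(n) < 0.02)

# Repeat the random public/private split and record the AMS difference each time.
diffs = np.array([
    np.subtract(*public_private_ams(weights, is_signal, selected, n_public, rng))
    for _ in range(20)
])
print("RMS of (public AMS - private AMS):", np.sqrt(np.mean(diffs ** 2)))
```

The size of the spread depends entirely on the made-up inputs, so only the method, not the number, carries over to the real contest.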

But while "the Matrix String" technology to optimize the machine learning hasn't produced a truly remarkable improvement in the preliminary AMS score, one that could beat "Neo" and "Morphine", for example, I have some reasons to think that its underlying idea is so robust that it could achieve a higher final score than other algorithms (and perhaps other contestants' algorithms).


  1. Why are there 42 contestants with 3.0006?

  2. Dear Fer137, it's the standard XGBoost package that I would use at some point, too.


  3. Could the Matrix String technology be useful to look for some "SM" signals in THE landscape too ...?

    I know that such an "artificially intelligent" approach is not much less ugly (and uninteresting from a physics point of view) than throwing in the towel for anthropic reasons, but I am just curious :-P

    Cheers and congratulations on your return to position 3 :-)

  4. Motl Magnus! Motl Magnus!

    TRIZ: Do it the other way. An insoluble problem simply lacks a proper point of view. Copper phthalocyanine is soluble in boiling concentrated sulfuric acid. A 30 wt-% solution in solid Plexiglass is trivially achieved...using metallurgy (verified by its optical spectrum; single molecules are different even from pairs).

    "This is not the solution we are seeking." Tough noogies.

  5. Hi - just a - probably silly question- rather than "many worlds" - could there be "many times" - i.e. more time dimensions, say one photon and one double slit experiment but in time x it goes through one slit and in time y goes through the other and the interference is the overlapping border of these time-different events?

  6. Dear Dilaton, thanks for your idea. It would be great if it were useful for that.

    The ideas - that I obviously try to cover up a little bit, but when the details are released, people will recognize that the popular description has a lot to do with the actual algorithm ;-) - are of the kind that they could be useful for almost any type of machine learning, not necessarily just physics.

    I actually don't see how your proposed application may be converted to an example of "machine learning" at all but maybe it can! ;-)

  7. Decoherence is there from beginning to end, I can't give you page number (I don't remember, I read it more than 20 years ago) but the wording Everett uses is that the "branches become effectively non-interacting/separately-evolving once they are in the thermodynamic limit" or some such thing, and he justifies it rather simply, and this is what decoherence tries to make mathematically rigorous (for no good reason, because it is completely obvious).

    This non-interacting nature is what allows classical data (the classical data stored in computers, aka, plastic deformations of the environment, measuring devices, computers, and human brains) to get called a branch-label for a quantum state. The branch-labelling is what others would call a "decoherent history description", or an "ein-selected state". Everett called it a "branch". Others call it a macroscopic superposition.

    You need to read the thesis. Really. It's not that long, and it's got real results. It also shows a tremendous talent, it's one of the best theses in history. It also speaks well of Wheeler, to get two theses of this magnitude (one out of Feynman, one out of Everett, on obviously related topics).

    The concept of "decoherence" as stated and elaborated in the 1980s was ACKNOWLEDGED by the authors to be an extension of Everett's work. People don't acknowledge prior work for no reason. Gell-Mann has patiently explained this many times, it is a fact of literature dependence, the original source of Decoherence in Everett '57.

    It is true that Gell-Mann rederived Everett's ideas for himself in 1960 or so, so what. Everett is the original. Gell-Mann's "consistent histories" is a variation on Everett, as he says both in citations and in person, in recorded interviews.

    Wigner published his "Wigner's friend" in 1962. The trick with "Wigner's friend" is HAVING THE BALLS TO PUBLISH, of course people were thinking about this in 1935. What makes it publishable is that it creates a contrast with many-worlds--- the observer's information is treated as "conscious" and different from a computer's information. Everett treats all classical information the same, whether inside a computer or inside a human.

    Wigner didn't want to upset the Bohr, and saw an opportunity in 1962 to publish some old ideas that would never get into print. Everett was willing to upset the Bohr, and so got sacrificed in Denmark. Everett was the first to publish clearly the ideas involved.

    He was also the first to state and argue the information theoretic uncertainty principle (it's in his thesis, and it's a great result):

    I(x) + I(p) > C

    Where C is \(e\pi\) or something like this, it's what the inequality evaluates to on Gaussians. Everett showed that Gaussians are local minima of this inequality, and that there are no other local minima, strongly suggesting it is an exact inequality (it's hard to prove, it wasn't proved until 1975, by Beckner).

    where I(x) is the information in the x-distribution of psi-squared, and I(p) is the information in the p distribution of psi-squared. This is the first "correct" statement of the uncertainty principle.

  8. Here's a link to his thesis:

  9. Having access to the thesis, the discussion of decoherence is in the fifth section, after the presentation of the interpretation. It begins with the Heisenberg-like analysis of the H-atom on page 86, and the discussion is particularly relevant toward page 99.

    The relevant discussion of the information theoretic Everett uncertainty principle is on page 129.

  10. Dear Ron, the claim "the insight is there from the beginning to the end" is just like some new-age religion or global warming that also has "evidence of a looming catastrophe everywhere where you look". This is meant to convince a gullible, confused person to jump with his eyes from one place to another so that he doesn't look at anything carefully. If one actually looks carefully, there is no evidence whatsoever.

    The proposition that "parts of the system then evolve de facto independently" is equally valid for phase transitions in classical physics where the phase space for many molecules gets disconnected into different basins with different phases, and it's "valid" in dozens of other completely different settings to express completely different ideas. You surely don't want to say that such a claim clarifies any of the subtle issues in quantum mechanics, do you?

    Moreover, the word "non-interacting" for the branches is totally and fundamentally wrong, and Everett should be fired from school just for this blunder. Terms in the wave function are *always* non-interacting. Interactions are something completely different - they're the dependence of the Hamiltonian on two degrees of freedom in a nonlinear way, like psi.psi.A in QED. There is never any interaction between "branches" of the wave functions, before or during or after decoherence, never. The correct thing would be to talk about interference of the branches, right? You see that Everett himself was already confused about the totally basic things in quantum physics such as the difference between interference and interactions, and all the deluded Everettians have the same trash in their skulls.

    The actual rule is that quantum mechanics never leads to such a separation's being exact, so the sentence you said is really wrong because it completely distorts the fundamental status of the rules of quantum mechanics and how they behave in Nature. To be exact about the origin of such approximations, what is approximate and what is exact, is damn important exactly when one is trying to put the foundations on a firm ground, isn't it?

    As I said, many people talking about "decoherence" say lots of complete bullšit, too, so your connections between some decoherence people's citations of Everett don't help to argue that any of this makes any sense.

    The point is that quantum mechanics with the exact rules as formulated in Copenhagen is right and also correctly deals with the classical-quantum boundary and its (non)reality and (non)objectivity. Quantum mechanics never leads to an exact separation of "worlds", to an exact splitting into worlds that don't interfere with each other at all in the future, but it does allow one to derive classical observers approximately, and those are needed to make firm, classically sounding statements about measured observables.

  11. Blah blah blah. Read the paper, it's linked. Everett doesn't use the words you didn't like, and OBVIOUSLY non-interacting in this context means effectively non-interfering, and OBVIOUSLY it is only asymptotically valid in the classical limit.

    I gave more specific spot citations, but really, just read the fucking paper. It's a classic, it's worth it, and it's well written besides.

  12. I've posted this link many times on this blog and elsewhere, and read it.

  13. I've read the paper and told you about that about 3 times already. I won't read it again, it's rubbish, and please don't use excessive capitalization in the comments.

  14. Wow, I'm not a physicist, but I think I understand quantum mechanics well enough to know that Everett is pissing on the whole foundation of quantum mechanics in his thesis. He first claims there is a mess, a nonexistent problem, in quantum mechanics (the Copenhagen interpretation), then suggests a solution for it (the many-worlds interpretation) that is the polar opposite of quantum mechanics. I'm shocked that physicists even mention Everett's thesis as an interpretation of quantum mechanics.
    Everett's thesis is not an interpretation of quantum mechanics; it is anti-quantum-mechanics propaganda that any crackpot coward who doesn't like quantum mechanics, and doesn't want to pay the price for bashing it, hides behind.