Update 6/15: After several days, I returned to the top three of the 656 competitors (or teams). A score of 3.74428 would have been enough to lead a week ago but times are changing. We are dangerously approaching the 3.8 territory at which I am likely to lose a $100 bet that the final score won't surpass 3.8 – and I am contributing to this potential loss myself. ;-)
...and some relativistic kinematics and statistics...
In the ATLAS machine learning contest, somebody jumped above me yesterday, so I am in fourth place (out of nearly 600 athletes) right now. Mathieu Cliche made Dorigo's kind article about me (yes, some lying anti-Lumo human trash has instantly and inevitably joined the comments) a little bit less justifiable. The leader's advantage is 0.02 relative to my score. I actually believe that up to 0.1 or so may easily change by flukes, so the top ten if not the top hundred could be in a statistical tie – which means that the final score, computed from a different part of the dataset, may bring anyone from that group to the top.
(Correction in the evening: it's fifth place now; BlackMagic got an incredible 3.76 or so. I am close to giving up because the standard deviation of the final score is about 0.04, I was told.)
I have both "experimental" and theoretical reasons to think that a 0.1 score difference may be noise. Please skip this paragraph if it becomes too technical. Concerning the "experimental" case: I have run several modified versions of my code which were extremely similar to my near-record at AMS=3.709 but which seemed locally better, faster, and less overfitted. The expected improvement of the score was up to 0.05, but instead I got a 0.15 deterioration. Concerning the theoretical case: I believe that there may be around 5,000 false positives among the 80,000 or so events (out of 550,000) that the leaders like me are probably labeling as "signal". The root-mean-square deviation for 5,000 is \(\sqrt{5,000}\sim 70\), so statistically \(5,000\) really means \(5,000\pm 70\), a \(1.5\%\) relative error. That translates to almost \(1\%\) error in \(\sqrt{b}\), i.e. \(1\%\) error in \(s/\sqrt{b}\) (the quantity \(s\) probably has a much smaller relative statistical error because it's taken from the 75,000-event base), which is a 0.04 difference in the score.
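This back-of-envelope estimate can be reproduced in a few lines. The 5,000 misidentified events and the 3.7 score below are the ballpark figures from the paragraph above, not exact contest numbers:

```python
import math

# Poisson-noise sketch of the score fluctuation, using the ballpark
# figures from the text (5,000 misidentified events, a ~3.7 score).
n_fp = 5000                      # background events mislabeled as "signal"
fluct = math.sqrt(n_fp)          # ~70 events of Poisson noise
rel_err_b = fluct / n_fp         # ~1.4% relative error in b
rel_err_sqrt_b = rel_err_b / 2   # the square root halves the relative error
score = 3.7                      # a typical leading AMS score
print(f"relative error in b:       {rel_err_b:.2%}")
print(f"relative error in sqrt(b): {rel_err_sqrt_b:.2%}")
print(f"score fluctuation:         {score * rel_err_sqrt_b:.3f}")
```

The few-hundredths fluctuation this yields is the same order as the 0.04 standard deviation quoted above.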
It may be a good time to review some basics of the contest. Because the contest is extremely close to what the statisticians among experimental particle physicists are doing (it's likely that any programming breakthrough you made would be directly applicable), this review is also a review of basic particle physics and special relativity.
The basic purpose of the contest is simple to state. It combines particle physics with machine learning, something that computer programmers focusing on statistics and data mining know very well and that is arguably more important for winning than the particle physics. (A huge majority of the contestants are recidivists and mass Kagglers, often earning huge salaries in the data-mining departments of banks and corporations. Some of the "similar people" came from experimental particle physics, but it's less than 10%, so I estimate that 5% of the world's "best statistical programmers" of this kind are working in experimental particle physics.) You download the data, especially two large files, "training" and "test".
The training file contains 250,000 events with weights (they are really telling you "how many times each event should be copied" for them to cover all the possibilities with the right measure, sort of). Each event is labeled as "s" (signal) or "b" (background). You and your computer are being told 250,000 times: look at this event, baby, and remember and learn and learn and learn, events like this (whatever it means) are "s" or events like that are "b". Then you are asked 550,000 times whether another event is "b" or "s" and you must make as many correct guesses as possible. It's easy, isn't it?
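As a minimal sketch of what "learning from weighted, labeled events" means in code – the `Weight` and `Label` column names follow the contest's documented CSV layout, but treat the exact headers (and all the numbers) as assumptions here:

```python
import csv
import io

# A few fake training rows in the contest's CSV-like layout; the
# "Weight"/"Label" headers and all values are illustrative assumptions.
sample = io.StringIO(
    "EventId,Weight,Label\n"
    "100000,4.0,b\n"
    "100001,0.0015,s\n"
    "100002,2.7,b\n"
)

totals = {"s": 0.0, "b": 0.0}
for row in csv.DictReader(sample):
    totals[row["Label"]] += float(row["Weight"])  # weighted, not raw, counts
print(totals)
```

Even in this toy tally the asymmetry is visible: a single "b" weight dwarfs the "s" weights.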
Well, your score isn't really the absolute number of correct guesses or the absolute number of wrong guesses or the ratio. None of these "simple" quantities is behaving nicely. Instead, your score is essentially\[
{\rm AMS}_2 = \frac{s}{\sqrt{b}}
\] It's the "signal over the square root of noise".
(The exact formula is more complicated and involves logarithms, but I believe that the difference between the simplified formula above and the exact one doesn't impact any participant at all – and you couldn't even find out which one is being used "experimentally" – except that the exact formula with the logarithms is telling you to avoid tricks that try to guess just a few "s" events and make \(b=0\). That could give you an infinite \({\rm AMS}_2\) score, but the regulated formula with the logarithms would punish you by effectively adding some false negatives anyway. The exact formula reduces to my approximate one in the \(b\gg s\) limit, which you can check by a Taylor expansion up to the second order. Check this PDF file with the detailed technical paper-like documentation for the contest, which is an extended version of this blog post.)
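For concreteness, here is the exact regularized formula from the contest documentation, \({\rm AMS}=\sqrt{2\,[(s+b+b_{\rm reg})\ln(1+s/(b+b_{\rm reg}))-s]}\) with \(b_{\rm reg}=10\), next to the simplified \(s/\sqrt{b}\). The weighted counts in the demo are illustrative, not real contest numbers:

```python
import math

def ams_exact(s, b, b_reg=10.0):
    """Regularized AMS from the HiggsML contest documentation."""
    return math.sqrt(2.0 * ((s + b + b_reg) * math.log(1.0 + s / (b + b_reg)) - s))

def ams_approx(s, b):
    """The simplified s / sqrt(b) used in the text."""
    return s / math.sqrt(b)

# In the b >> s regime the two scores agree to well under a percent:
s, b = 600.0, 30000.0   # illustrative weighted counts
print(ams_exact(s, b), ams_approx(s, b))
```

Expanding \(\ln(1+x)\) to second order with \(x=s/(b+b_{\rm reg})\) indeed collapses the exact expression to \(s/\sqrt{b+b_{\rm reg}}\), which is the claimed limit.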
The signal \(s\) is the number of true positives – events that you label as "s" and that are indeed "s"; the background \(b\) is the number of false positives – events that you label as "s" but that are disappointingly "b". More precisely, we are not counting each event as "one". Each event contributes to \(s\) or to \(b\) according to its weight. The weights of the training b-events are about \(2-5\) or so while the weights of the training s-events are about \(0.001-0.01\) or so, i.e. two or three orders of magnitude smaller. These asymmetric weights are why the actual numbers \(s,b\) substituted into the score formula obey \(s\ll b\).
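The weighted bookkeeping just described can be sketched as follows; the event tuples and weights are invented to mimic the asymmetry, not taken from the data:

```python
# Each event is (weight, true_label); 'picks' are the events we submit as "s".
# s collects weighted true positives, b collects weighted false positives.
def weighted_s_b(events, picks):
    s = sum(events[i][0] for i in picks if events[i][1] == "s")
    b = sum(events[i][0] for i in picks if events[i][1] == "b")
    return s, b

# Tiny s-weights vs. large b-weights, mimicking the asymmetry in the text:
events = [(0.002, "s"), (4.1, "b"), (0.003, "s"), (2.7, "b")]
s, b = weighted_s_b(events, picks=[0, 1, 2])
print(s, b)   # even one false positive dwarfs all the signal weights
```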
@Lubos--wow, impressive--left you a Facebook chat.
Congratulations Lubos! Really awesome. Although I think this award should not be only cash, it should also imply an open door to CERN... many of us would be glad if our generic "humble correspondent" also became our "humble correspondent at CERN" :-D
"I am at the fourth place"
Motl Magnus! Motl Magnus!
Being correct carries no weight within managed research. The only trusted employee is one whose sole marketable asset is loyalty. Intelligence does not beg permission (and is rarely forgiven). Will there be a large enough fraction of minorities within the top ten finishers? Social advocacy lawsuits must be pro bono filed.
Impressive! I think I'll dedicate a few nights to this contest thing.
@Luboš Motl So now you've given away a little of your strategy, does this mean you're leaving the contest, having achieved third place?
Very helpful post. I had a go on Kaggle after reading about this previously on TRF. It is fun playing with the "Starter Kits" they reference, although I think I have learned more about getting Python to work with all these various add-ons than I have about data science. I am getting about 18% signal as the "sweet spot" (and am now only about 180 places below our humble correspondent). Interesting to read the explanation as to why this is much lower than 30%. I also naively thought that increasing numTrees was a surefire way of making progress, but it works for a while and then stops. Why is this? Overfitting on the training set?
If you like Python be sure to check out Cython, Numpy, and Pandas.
Python is wonderful for prototyping and flow control. For somewhat complex computations on large data sets, however, it becomes _way_ too slow in many cases. This is no reason to abandon Python, though.
Cython is an extension module for Python allowing you to compile Python code (with minor syntactic additions like static typing) into C code. Typical speed-ups range between 100 and 1000 times, usually with trivial effort.
Numpy is an extension module for Python adding multi-dimensional arrays with lots of functionality (like advanced slicing). It also interfaces perfectly with Cython, allowing you to access the arrays at C speed.
Pandas is an extension module for Python adding functionality to process large data sets, serialize and manipulate them. It allows you to read and write (large) text files easily and offers functionality like (un)stacking (analogous to MS Excel's Pivot tables), filtering, aggregating, and so on. Pandas also builds on Numpy and integrates with Cython, so everything fits together.
The described software is incredibly powerful and useful. Check it out!
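As a tiny, hedged illustration of the vectorized style this comment recommends (the column names and numbers are invented, not from the contest files):

```python
import numpy as np
import pandas as pd

# A toy event table: weighted label totals via groupby, no Python-level loop.
df = pd.DataFrame({
    "weight": [4.1, 0.002, 2.7, 0.003],
    "label":  ["b", "s", "b", "s"],
})
totals = df.groupby("label")["weight"].sum()
print(totals)

# NumPy: elementwise math over whole arrays at C speed.
momenta = np.array([1.0, 4.0, 9.0])
print(np.sqrt(momenta))
```

The same aggregation written as a Python `for` loop over rows would be orders of magnitude slower on the contest's 250,000-event file, which is exactly the point of the comment.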
Hi John, I have given away only 5% of my know-how about this stuff, nothing really about my code and algorithm, but yes, I am inclined to give up after I saw BlackMagic at 3.76 at the top. He's threatening my $100 bet about 3.8 not reached, too. ;-)
I will probably stop my daily 5 attempts unless I see some improvements in a couple of days; then I will switch to waiting for a revolutionary idea.
Uncle Al, your comments are so goddamn useless!
On behalf of the BICEP2 collaboration, please do not compare our critics to Nazis or joke about suicide. We of course believe in our results but are open to constructive criticism. We would in no way wish to equate professional criticism to this sad episode of history. If you have questions on the matter, please contact us directly.
Dear Mr Teplý, on behalf of the citizens of the free Western world, please be aware that in 1989, I won the freedom of speech and I won't allow some politically correct random graduate students or their bosses to trample on it. If you have some questions about what the freedom of speech means, please f*ck off.
Yes, I am well aware, and we support free speech, hence the "please." Just let it be known to readers that this blog and the remarks therein are not to be associated with the views and opinions of the BICEP2 collaboration.
Dear Mr Teplý,
there's no rational reason why you should be associated with – i.e. take credit for – my texts that are demonstrably and obviously my personal blog posts.
But if there are some people who like demagogy and similar nonsensical "associations", they will "associate" you, anyway. You can't really do anything against them.
Were you instructed to write this intimidating comment or was it your personal initiative? I am pretty disgusted by this way of directing the discourse.
No, the BICEP2 collaboration does not associate itself with, or take credit for, your posts. As long as this is clear, you can post what you will.
Clear or not, Lubos and anyone else can post what they will.
Dipshit.
"As long as this is clear, you can post what you will."
That's very kind of you, adolf, but just who the hell do you think you are to tell him what he can and cannot do on his OWN blog?
We've just been commemorating the 70th anniversary of D-Day and celebrating the subsequent defeat of the Nazis, but obviously prematurely — there's still a lot of mopping up to do. You jumped-up little prick.
Shove your advice and your fuckwit holier-than-thou poncy romper-stomper cheap totalitarian 'moral' creed right up your bleeding arsehole.
To think that men died so that 'priests' like you could parade around breathing air. What a fucking waste of good lives. What a crime.
You've no idea what freedom is. Go to Hell.
I guess all readers here are aware of the fact that TRF is NOT associated with the BICEP2 collaboration ... ;-)
Of course I know that you appreciate justified professional criticism, as it should be ...
But I, and probably others, think that what happens even in public popular discussions goes beyond (or, concerning the level, more accurately below) rational professional criticism...
And professionally criticizing the criticism (if it contains errors in the understanding of the scientific method, for example) is also an important thing to do.
All we do here is make use of our freedom to express our disagreement, in more or less strong words, when the good work of the BICEP2 collaboration is outright bad-mouthed in some places of the online and real world.
Cheers
A society that confiscates achievement to reward the smartless is not only insane, it is evil.
Uncle Al was good to Europe, Patent EP0438043B1. Koenig was the master machinist. Engineer Healy pissed off Uncle Al to obtain some Swedlow dumpster fill (the "invention"). Yang and Krug were managers responsible for getting their names on the patent.
Lumo and I disagree about exact vacuum isotropy toward hadronic matter. Perform a geometric Eötvös experiment. A good idea need only be testable.
Perhaps you should direct your attention to the efforts of Dr. Steinhardt, who portrays your collaboration as not merely wrong but also sloppy, inappropriate in method, and premature in announcing. Or you should be offended by the press, especially Nature, which first announced the result without reporting your caveats and made it sound as though you were simply over-eager to claim credit and were unprofessional, etc.
That is what I got out of it. Everyone who has enough intelligence to follow this realizes what Lubos said is correct. If you think kissing ass will help, fine – but wait until it turns out, somewhere in the future, that the possibility of dust has disappeared.
Should the 30 parameters not be called primaries and descendants .... ;-)?
"you must give up the idea that for each of the 550,000 contest events, you calculate the distance from each of the 250,000 training events...."
It is exactly the first thing I would do.
"Too much CPU time.)" Who cares? One can optimize it afterwards (and it seems that you're underestimating the clock speed).
"The training and test events in this contest are generated purely by Monte Carlo": so it emerges :-) ... that it's more a matter of gambling mathematics than of machine learning...
??
Where did that come from?
What a bizarre, pointless intervention…
Congratulations!!!! Back to 3rd place with a 3.744.
I still think the 3.81 limit will hold. The extra 0.01 is for stats errors. Why?? Because it is so close to \(i^{-i}-1 = e^{\pi/2}-1 \approx 3.81\). ;-)
I know you have better things to do, and I know that the answer to my question would be, if answered properly, too long to write. But I am interested in what you may think about when you, for example, observe a phenomenon such as the rainbow. How do your thoughts flow? Do they start with human feelings of beauty, or do they start with classical wave theory, or do your thoughts wander into the modern domain straight away? Does your mind wander through all the historical contributions to the understanding of the photon, or is it QED you see at first glance? Maybe string theory is too dominant in your thoughts to jump elsewhere? I hope I have asked the question in such a way that you understand me. I think it would be absolutely wonderful if you could give a personal account of how you think about seemingly everyday situations. It would not often be that one could have the chance to "listen in" on a mind that knows so much about the place we live in. With kind regards, Rasmus.
Dear Rasmus, I've loved rainbows since I was a kid, and was fascinated by double ones, and so on, and this still remains in me.
Still, there is at least an equally large new counterpart of this naive excitement – the ability to compute the position of the rainbow, color by color.
This was actually a standard question in my Rutgers PhD qualifying exams, one of thousands of things I mastered, and I think that I would still be able to calculate the angles. One must roughly know the transmission and reflection of the light rays in spherical water droplets, and the detailed calculations with the indices of refraction etc., which depend on the color, nail it down.
I find this quantitative part of the rainbow to be at least as beautiful as the naive feelings from the rainbow. It's not a normal beauty. It's mixed with some gospel-like belief. I think it is cool we may exactly nail this problem down and I feel that everyone should know how to calculate the direction of the rainbow. It's so cool that it works.
You don't need string theory for the calculation of the rainbow, but of course all these calculations fit within Maxwell's electrodynamics and are special cases of QED, the Standard Model, and then string theory. So string theory is a theory of rainbows for me, too. I partially do think about rainbows (and thousands of other things) when I am thinking about QED, QFT, or string theory.
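Descartes' classic calculation alluded to above fits in a few lines – a sketch assuming textbook refractive indices of water for red and violet light:

```python
import math

def rainbow_angle(n):
    """Primary-rainbow angle (degrees) from the antisolar point,
    for a ray undergoing one internal reflection in a spherical droplet."""
    # Incidence angle of minimum deviation: cos(i) = sqrt((n^2 - 1) / 3)
    i = math.acos(math.sqrt((n * n - 1.0) / 3.0))
    r = math.asin(math.sin(i) / n)           # Snell's law inside the droplet
    deviation = math.pi + 2.0 * i - 4.0 * r  # total deviation of the ray
    return math.degrees(math.pi - deviation)

print(f"red    (n=1.331): {rainbow_angle(1.331):.1f} deg")
print(f"violet (n=1.343): {rainbow_angle(1.343):.1f} deg")
```

This reproduces the familiar bow of roughly 42 degrees radius, with red on the outside and violet a couple of degrees inside it – exactly the color-by-color position mentioned above.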
Is Uncle Al a computer or a real person?
ReplyDelete"I am likely to lose a $100 bet that the final score won't surpass 3.8, and I am contributing to this potential loss myself." I'll front you the $(USD)100 if you exceed 4.0.
No, no - don't thank me. Thank US Socialist Security stealing from the productive to reward the despised. I'd be but a conduit.
Good question.
Good to see Tommiseeothingamagig is rooting for you!
Finally, the paper: "However, these models are not sufficiently constrained by external public data to exclude the possibility of dust emission bright enough to explain the entire excess signal."
http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.112.241101