Saturday, October 15, 2011

Should different disciplines require different confidence levels?

Sascha Vongehr discusses the contrast between the scientific community's attitudes toward OPERA's superluminal neutrinos and toward climate change. Very shaky observational results in the latter case are being sold as a reason for certainty (and perhaps a reason to ostracize or even criminalize the skeptics); amazingly strong experimental results in the former case are being ignored "just" because they contradict a theory by Albert Einstein.

This topic was previously discussed on this blog, where your humble right-wing correspondent ironically chose a letter sent to the Washington Post. Equally ironically, the left-wing correspondent Sascha Vongehr chooses a text by Robert Bryce in the Wall Street Journal. But the point is the same: the U.S. newspapers say that different disciplines shouldn't follow these remarkably different approaches.

Vongehr effectively agrees with Bryce and says "I told you so." His main point is that it looks (and is) manifestly inconsistent if 2-sigma pieces of evidence (a 5%, one-in-twenty chance of a statistical fluke) are viewed as "holy irrefutable proofs" that are supposed to silence all doubters in one discipline, while 6-sigma confirmations of previous results (OPERA confirming MINOS' superluminal neutrino claims; six sigma corresponds to a one-in-half-a-billion chance of a false positive) are viewed as claims that should be ignored in another discipline.

Aside from the high confidence level of the OPERA experiment (6 sigma) and the fact that this isn't even the first such measurement, Vongehr also mentions the suggestive graph by Tamburini.

I agree with him that this is an inconsistent approach. Regardless of the discipline, quantities that aggregate many independent contributions are guaranteed to have nearly normal (Gaussian) distributions, by the central limit theorem. For normal distributions, you may always translate the number of standard deviations into a certainty, i.e. a probability that the deviation didn't occur by chance. The maths is always the same, regardless of the discipline.

So if you evaluate the "number of standard deviations" properly, i.e. if you quantify your statistical, systematic, and total error properly, then the same number of "standard deviations" should always lead you to the same degree of certainty or excitement, independently of the discipline.
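The translation from "standard deviations" to probabilities mentioned above can be made explicit with a few lines of Python using nothing but the standard library (the two-sided tail probability of a normal distribution is `erfc(n/sqrt(2))`):

```python
from math import erfc, sqrt

def two_sided_p(n_sigma):
    """Probability that a standard normal variable deviates by at
    least n_sigma standard deviations in either direction."""
    return erfc(n_sigma / sqrt(2))

for n in (2, 5, 6):
    print(f"{n} sigma -> p = {two_sided_p(n):.2e}")
# 2 sigma -> p = 4.55e-02  (roughly the "1 in 20" level)
# 5 sigma -> p = 5.73e-07  (the usual discovery threshold in physics)
# 6 sigma -> p = 1.97e-09  (about one part in half a billion)
```

The same formula applies in every discipline; only the honesty with which the error bars are estimated differs.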

In principle, I agree with that. And the contrast is immense. Climate skeptics, i.e. sane people who are able to see the self-evident fact that the climate alarmists are deluded and self-deluding lunatics with no significant evidence to back up their bizarre claims, are treated nearly as heretics by some folks who like to be called "climate scientists".

On the other hand, it is common for us to dismiss a huge, 6-sigma excess of the neutrinos' velocity above the speed of light in the vacuum (and I mostly dismiss it as well) which is, moreover, a confirmation of an older (somewhat weaker) effect measured in the United States. Robert Bryce and others are right: if you combine these two situations, science looks totally unfair, follows double standards, and shouldn't be trusted at all.

Is the conclusion right? Or can one justify the different standards?

Well, first of all, it is a sloppy conclusion that you should ignore all of science if you see such discrepancies. The obvious and right conclusion is that you should ignore the scientific discipline that has poor or no standards, the scientific discipline where would-be researchers are ready to crucify you if you dare to disagree with some insane propositions that have only been supported by a (cherry-picked and partially fabricated) 2-sigma piece of evidence. The discipline where the standards are higher should get a more careful treatment: watch out, you may be throwing the baby out with the bath water.

Second of all, even if all disciplines were perfectly scientific and followed the same standards, the required number of "standard deviations" would depend on the prior probability that the proposition is right. In other words, you need just a few standard deviations if your statement was a priori plausible. However, extraordinary statements require extraordinary evidence (many standard deviations; six is sometimes not enough).
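This point is just Bayes' theorem in odds form: posterior odds equal prior odds times the strength of the evidence (the Bayes factor). A toy sketch with purely illustrative numbers:

```python
def posterior_prob(prior, bayes_factor):
    """Posterior probability via posterior odds = prior odds * Bayes factor."""
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * bayes_factor
    return post_odds / (1.0 + post_odds)

# A mundane claim (prior 0.5) is settled by modest evidence...
print(posterior_prob(0.5, 20))       # ~0.95
# ...while an extraordinary claim (prior 1e-12, a made-up number)
# remains implausible even after evidence worth a Bayes factor of 5e8:
print(posterior_prob(1e-12, 5e8))    # ~0.0005
```

With a sufficiently tiny prior, even "6-sigma" evidence leaves the claim far more likely to be an experimental error than a revolution.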

Third of all, different disciplines of "science" live in different "cultures". When a physicist says "science", he may have diverse systems of study in mind, but he surely expects the standards he knows from physics. In physics, we are really looking for mathematically reliable and accurate propositions that are almost as true as axioms and theorems in mathematics. You have to build on rock-solid foundations. The only difference from mathematics is that the "primary" truths are not axioms invented by humans but either experimental facts or axioms (the laws of physics) that have been induced, extracted, or guessed from many experimental facts and that have passed many other tests.

If your plan is to build a scientific skyscraper, you need the bricks to be robust. Five-sigma evidence is really the minimum you want to see. Such physical research may be applied to solid-state physics, elementary particles, optics, biophysics, neurophysics, atmospheric physics, the physics of the Occupy Wall Street mob movement, or any other portion of physics. But it's still physics applied to a physical system. You expect some rigor. It's the scientific method as imagined by the physicists. It's like algebra in which you study not only numbers, groups, and matrices but also atoms, molecules, patterns, materials, phases, and other physical things and processes.

However, other scientific disciplines don't satisfy these strict criteria of the scientific method. They're just much less scared of the idea that they're wrong. They're much more often wrong and they're usually not humiliated for that. Instead of looking for reliable, nearly mathematically rigorous statements that simply have to work (almost) at all times, they're often playing games in which a success rate slightly higher than what you would get by guessing is considered enough.

So such semi-scientific communities are used to playing certain games and pursuing various obvious strategies for playing them, so that the "researchers" are easily blown and pushed around by random winds created by 2-sigma and sometimes even 1-sigma pieces of evidence. The idea they take for granted is that if you allow the wind to manipulate you in this way, you have a higher chance of learning something about the system, or of finding the cure for a lethal disease or a hypothetically dying blue planet, than what you knew or could do at the beginning.

These semi-scientific communities never reach the degree of certainty that is common in physics, like 99.99999% certainty. Because of that, they can never have any "really strong prior probabilities" supporting the belief in a particular assumption (like we have in physics): a priori, anything goes. Consequently, a high number of standard deviations is never really required because there are no extraordinary claims in the discipline; all statements are pretty ordinary because nothing is really known. Do you think that second-hand smoking causes no health issues whatsoever? Maybe it doesn't; there's simply no strong enough evidence (in the physics sense) either way. The typical number of standard deviations that is "enough" for people to believe something in these disciplines is low simply because everyone around you has low standards as well; the competition is not too demanding and doesn't force you to look for 5-sigma or stronger evidence.

It's clearly not physics applied to a different system. You can't build complicated chains of reasoning. If your argument depends on 5 other insights from the literature (in these low-standard disciplines), it's more likely than not that one of the assumptions is wrong, so your conclusion (and your paper) will also be wrong. It's not a full-fledged science as understood by the physicists. It has completely different, i.e. much lower, standards. It's much less reliable. But we often include such semi-sciences in the broader or generalized concept of "science", and we have to allow those semi-scientists to investigate these disciplines semi-scientifically simply because there isn't a sufficient number of physicists, or of highly intelligent people in general, to study these systems truly scientifically.
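The arithmetic behind "more likely than not that one of the assumptions is wrong" is simple compounding. Assuming, for illustration, that each cited insight is right with probability 0.85:

```python
# Illustrative reliability of a single insight in a low-standard field
p_one = 0.85

# Probability that all five independent assumptions hold at once
p_all_five = p_one ** 5
print(p_all_five)  # ~0.44: more likely than not, something breaks
```

With 5-sigma bricks (each right with probability 0.9999994), the same product would still be essentially 1, which is why physics can afford long chains of reasoning.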

And even these semi-sciences sometimes lead to good results, so it makes sense to fund them. Sometimes they even have better results than what an overly abstract physicist could achieve. What doesn't make sense is to allow these semi-scientists to prohibit other semi-scientists (and surely not full-fledged scientists, i.e. physicists) from thinking about those matters properly and from reaching different conclusions. Every climate alarmist who has ever suggested that it is illegitimate to realize that climate alarmism is a pile of feces should be firmly rebuked and de facto silenced because blind dogmatism and bullying of this kind should have no room in modern 21st-century society, surely not when the dogmatism promotes totally preposterous claims such as those about a looming climate catastrophe.

If you're a non-scientist, you should never allow other people to prevent you from thinking something. On the other hand, I assure you that genuine scientists often have very good reasons to be very certain about various things, and you should try to understand that. If you study how stringent the standards pursued by a particular researcher and his community are, you may get a sensible idea about the degree to which you may take the community's certainty seriously (or the extent to which you may completely ignore their babbling).

Please, don't throw all of science out of the window just because you have found one tiny subdiscipline of science where the people suck as researchers and where they suck as human beings as well. It could be a very foolish decision on your part! Try to get a more detailed picture of the scientific community and its parts. Scientists are not identical, just as people are not identical, and you should try to go well beyond naive claims of the type "science is bad"!

Thank you for trying.


  1. I'm a Bayesian. What are the priors for experimental error? What was the prior that Einstein was wrong?

  2. Dear Harlow, there's of course no precise quantification of priors but they can be estimated approximately.

    At the beginning, the priors for all "qualitatively different hypotheses" must be chosen to be comparable. Each of N mutually exclusive hypotheses should get a probability of 1/N.

    As the evidence accumulates, these probabilities are adjusted; some of them get extremely close to 0 and others get very close to 1. Depending on what evidence one takes into account, one gets different numbers, but sensible scientists should at least roughly agree about which priors are very high and which of them are low.

    At any rate, priors are not fixed; they evolve as the evidence accumulates and changes. So priors are not terribly accurate, but they're not really mysterious, either. The prior probability is just the posterior probability from the moment when you took your last piece of evidence into account and performed the last Bayesian inference (before the new one that is awaiting you now).

    As the amount of accumulated knowledge and evidence grows, the importance of the "really initial" priors at the beginning drops.
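    The point that the initial priors wash out can be seen in a toy sequential update (all numbers are illustrative): two observers start from very different priors for a hypothesis H and then see the same data, each observation being twice as likely under H as under not-H.

```python
def update(prior, lik_h, lik_not_h):
    """One Bayesian step: today's posterior becomes tomorrow's prior."""
    num = prior * lik_h
    return num / (num + (1.0 - prior) * lik_not_h)

p_a, p_b = 0.01, 0.9   # very different starting priors
for _ in range(20):    # 20 identical observations, each favoring H 2:1
    p_a = update(p_a, 0.8, 0.4)
    p_b = update(p_b, 0.8, 0.4)
print(p_a, p_b)        # both now exceed 0.999
```

    After enough evidence, the two observers agree to many decimal places; only the early history of their beliefs differed.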

    I think that the prior probability that Einstein was wrong before OPERA was so small that 6-sigma evidence isn't enough. More precisely, there's really strong evidence (the previous evidence for relativity) that OPERA made some big mistake in its calculation of the mean value or the systematic error.