Friday, February 07, 2014

An equation of intelligence

Alex Wissner-Gross' thoughts are probably too good to be true

Óscar Gómez asked me about a 12-minute November 2013 TED talk by Alex Wissner-Gross. In it, he talks about an equation or principle that supposedly produces all intelligent behavior.

I know Alex very well. We would often talk about his great inventions at Harvard where he was a PhD student. Now, he has degrees in physics, computer science, nanoscience, electrical engineering, and various fields like that, and is an ambitious inventor and entrepreneur. He's now affiliated with some Harvard and MIT artificial-intelligence-related institutes.

Some of his self-promoting paragraphs say that he's also trying to find a way "how to make people aware of climate change". That's of course distasteful, Alex. What about looking for a way to "make the people understand that the climate change activists are a dangerous terrorist pseudoscientific organization fighting a non-existing problem and trying to control the rest of mankind"?

He starts with a quote – one he endorses and that was used to criticize the early computer scientists including Alan Turing – that asking whether a machine can be intelligent is as lame as asking whether a submarine can swim. I thought that he would discuss the intrinsic ill-definedness of intelligence. What is intelligence? It depends, it has many layers and components, many manifestations etc., and every definition will end up with slightly different results. And this ill-defined intelligence may exist to various degrees, too.

But Alex went in exactly the opposite direction. Intelligence is very well-defined and he can design a program (and claims to actually have designed a program called Entropica) that guarantees purely intelligent behavior in every situation. The equation it follows is\[

F = T\cdot \nabla S_\tau

\] The intelligence is the force \(F\) that acts in the direction in the space of possible "actions" that tries to increase (or maximize) the number of options \(S_\tau\) (a quantity that Alex misleadingly interprets as the entropy) we will have at a future time \(\tau\). Here, \(T\) is an unspecified coefficient, much like \(\tau\) itself.
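To see what "acting so as to increase the number of future options" could mean in the simplest case, here is a minimal sketch. It is not Entropica and it is not taken from the talk: the corridor of `N` cells, the action set `ACTIONS`, and the forward model `step` are all hypothetical choices made for illustration, and the gradient \(\nabla S_\tau\) is replaced by a crude comparison of the option counts that follow each action.

```python
# Hypothetical toy world: a 1-D corridor of cells 0..N-1 with walls at both ends.
N = 11
ACTIONS = (0, -1, +1)  # stay, step left, step right (ties resolve to "stay")


def step(x, a):
    """Forward model, assumed to be known exactly: move by a and stop at the walls."""
    return min(max(x + a, 0), N - 1)


def reachable(x, tau):
    """Set of cells that can be occupied after tau steps when starting from x."""
    states = {x}
    for _ in range(tau):
        states = {step(s, a) for s in states for a in ACTIONS}
    return states


def best_action(x, tau):
    """Choose the action whose successor cell keeps the most cells reachable
    during the remaining tau - 1 steps (a discrete stand-in for following the gradient)."""
    return max(ACTIONS, key=lambda a: len(reachable(step(x, a), tau - 1)))


if __name__ == "__main__":
    for x in (0, 1, 5, 10):
        print(x, best_action(x, tau=3))
```

With these choices the agent steps away from a nearby wall and stays put in the middle of the corridor, where every action leaves equally many options. Note that the function `step` had to be written by hand: the option count cannot be evaluated without some model of how the world responds to actions.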

He marvelously claims that when this program is connected – in a way he doesn't specify – to the information about the stock prices, it starts to trade stocks and produce growing profits without being told about the "right goal". When it's connected to a small gadget that may balance on water, it starts balancing on water. It plays ping-pong and does lots of other things that look intelligent.

If true, it's amazing. But I just don't understand how the wonderful program could possibly work. In some sense, Alex with his inventor's mind is looking at things in some kind of a synthetic way. Quite generally, it seems to me that he is not decomposing the issues to the elementary building blocks at all.

First, he doesn't say what the timescale \(\tau\) is and what the coefficient \(T\) is. More importantly, the equation attempting to resemble classical physics is analogous to Newton's inverse square law \(F=Gm_1m_2/r^2\). But that law would be meaningless without some \(F=ma\) – a law that allows the force to be identified with the mass times the acceleration (second time derivative) of a coordinate. Concerning the "missing \(F=ma\) problem", let me assume that Alex claims that the intelligent behavior does include such a law. Perhaps, instead of the acceleration, the force is linked to the first time derivative of the coordinates? It had better be linked to something.
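To spell out the "missing \(F=ma\) problem", here are two candidate ways to tie the force to motion. Both are my own illustrative assumptions rather than anything stated in the talk; the mass \(m\) and the mobility \(\mu\) are hypothetical parameters:\[

m\,\ddot{x} = T\cdot \nabla S_\tau \quad \text{(second-order, Newtonian)}, \qquad \dot{x} = \mu\, T\cdot \nabla S_\tau \quad \text{(first-order, overdamped)}

\] In the first case, the agent accelerates towards more options; in the second case, its velocity simply follows the gradient. Either choice would turn the equation into a rule that actually determines a trajectory \(x(t)\).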

That's surely not where my problems end. As the letter \(S\) and his wording indicate, he wants to interpret the "future freedom" or the "future number of options" as some kind of entropy. But he clearly doesn't mean the full entropy – which is dominated by the entropy stored in atoms' chaotic thermal motion, even if we talk about the rather intelligent life on Earth. (A way to maximize the total entropy in the future is to burn lots of coal which may be intelligent but isn't automatically so because true intelligence has other, finer dimensions.) He must only mean some "part of entropy" carried by the relevant macroscopic degrees of freedom. But how does one precisely isolate them? Even if I know the exact description of a physical system (and most intelligent agents can't know it), it seems impossible to separate the relevant degrees of freedom.

To determine the right "intelligent behavior", the gadget must know how many "future options" different decisions right now produce. But the calculation of \(S_\tau\) for some future time \(\tau\) as a function of a decision at present is an example of a prediction. And the animals or machines simply cannot make predictions without intelligence, can they? So the definition is a circular one, kind of. I think that a major part of intelligence is something that allows people (or others) to invent the rules to predict the future. Much of their intelligent behavior follows from that. But this aspect is viewed as a "trivial input" that has to be calculated externally.

But the only way to correctly predict the future is to use some kind of the laws of physics or Nature. It's crazy to think that all intelligent subjects in the world are automatically equipped with the full knowledge of string theory. They know almost nothing about the future behavior of physical systems and its relationship to the past facts, and they learn how to predict these things (except for things that are "hardwired into their brains", but these hardwired things are arguably straightforward and we may easily emulate them by computers; only the intelligence that one "adds" in his life is mysterious).

So most of the things are undefined in some way but the concept of intelligence as the "ability/desire to maximize the future freedom" is an attractive meme. I largely agree with it – I think that more intelligent people care about their freedom (multiplicity of options) more than the non-intelligent people, for example. But I don't believe it is a generally valid formula for intelligent behavior. And I think that in many cases when it is fine, it is vacuous.

By the previous sentences, I mean that in many cases, intelligent behavior is one that reduces the number of options or uncertainty about the future. Intelligence is needed for a NASA space probe to move to the right place where we want to have it. When we play chess, we want to reduce the freedom of the opponent. So these are examples of how the "opposite law" sometimes seems more true than Alex's original law.

But I am really confused about how his principle may produce the clever behavior e.g. in the case of the stock trading. What does it mean to maximize the future freedom? Even animals want to survive. Survival improves the future freedom because when you die at time \(\tau\), your options for times greater than \(\tau\) will be reduced to one option: lie in the grave, or be dispersed over India, or whatever is your preferred funeral format. So yes, the instinct to survive is a simple example of the desire to increase the number of future options.

Being rich also increases your freedom. You may choose any expensive hotel, send rockets to outer space, and so on; you may surely add 500 other things that you could do if you had a billion dollars. So it's sort of trivial that people want to have enough money when they trade stocks. But how does it tell the program to buy or sell stocks? The program must be equipped with the function \(S_\tau\) that calculates the freedom (or, well, money) at time \(\tau\), mustn't it? But that's probably impossible without predicting the future motion of the stocks. However, predicting the motion of stocks is the "bulk" of the problem we wanted to solve in the first place. I can't see how Alex's equation has helped or may help to solve the problem. I would probably need to see more about Entropica's inner guts - but I am afraid that such a view into the interior would reveal that it doesn't really work as advertised.
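The dependence on a price model can be made explicit with a toy calculation. The sketch below is only an illustration under my own assumptions, not a reconstruction of Entropica: the price is assumed to move by `up` or `down` at each step with probability `p_up`, wealth is used as a crude stand-in for "freedom", and the function name `expected_wealth` and all its parameters are hypothetical.

```python
from itertools import product


def expected_wealth(shares, tau, up=1.0, down=-1.0, p_up=0.5, wealth0=100.0):
    """Expected wealth after tau steps if `shares` are bought now at the current price.

    Assumed toy model: at each step the price changes by `up` with
    probability p_up and by `down` otherwise, independently of the past.
    """
    total = 0.0
    # Enumerate every possible sequence of up/down moves over tau steps.
    for path in product((True, False), repeat=tau):
        prob = 1.0
        change = 0.0
        for went_up in path:
            prob *= p_up if went_up else 1.0 - p_up
            change += up if went_up else down
        # Buying converts cash into shares at the current price, so only
        # the later price change alters the total wealth.
        total += prob * (wealth0 + shares * change)
    return total


if __name__ == "__main__":
    print(expected_wealth(0, 4), expected_wealth(10, 4))
    print(expected_wealth(0, 4, p_up=0.6), expected_wealth(10, 4, p_up=0.6))
```

With `p_up=0.5` the expected wealth is the same whether the program buys or not, so the rule gives no guidance at all. Buying becomes preferable only once a drift such as `p_up=0.6` is put in by hand, and choosing that number is exactly the prediction problem.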

Well, the very notion that the "most intelligent behavior" is objectively calculable sounds implausible to me, too. As I said, intelligence has many aspects and dimensions. But it also has many uses. I don't believe that science may really tell us what we should do. The issue is related to the claim that science cannot answer moral questions.

For the reasons above and others, I am skeptical. But even though I feel that despite his physics PhD, Alex's whole way of thinking is sort of incompatible with the theoretical physicist's understanding of the world as a consequence of impersonal laws of Nature (this perspective seemingly unavoidably renders most of the concepts interesting for the humans – such as intelligence – ill-defined), I still find some semi-mathematical principles like that applied to assorted philosophically appealing human concepts (intelligence, beauty, diversity, whatever) intriguing because I am sometimes worried that there could be an entirely different, yet scientifically robust, way to classify all events in our Universe. And even if the world is as sensible as I expect and none of these "laws" can be fundamentally true, I am still interested in possible applications of such paradigms because some of them could profoundly change the way we live.


  1. Dear lucretius, your encyclopedic knowledge is impressive but I endorse Anna's key point and I would add that unlike "pop natural science" and "natural science", "pop history" may often be more sensible about the bigger points than "history" because the latter isn't science. The professional historians often differ from the unprofessional ones by being more ideological.

    It seems really weird if you deny the links between the witchcraft trial and Christianity. I've been to Salem, Massachusetts, and we visited the museum(s) etc. That contained some lectures, and so on. At any rate, check e.g. this thing.

    It just became normal between 1560 and 1670 to believe that Satan - a theoretical construct shared by the Abrahamic religions including Christianity - is omnipresent on the Earth. The denial of his presence was interpreted as the denial of angels, too, and that was too bad.

    In "Against Modern Sadducism" (1668), Joseph Glanvill claimed that he could prove the existence of witches and ghosts of the supernatural realm. [Quoted from Wikipedia.] He was a clergyman, the kind of writer who would matter for the popular interpretation of Christianity.

    Such burning surely looks pagan but it was justified by a leading interpretation of the Bible, too.

    At any rate, science done right requires much more than just "not support burning of witches" and the Church's power structure just didn't provide the sufficient environment for that, so much of the enlightened era may be attributed to the loosening of the Catholic shackles.

  2. You wrote: “I am still interested about possible applications of such paradigms because some of them could profoundly change the way how we live”

    One of the applications should be free electric energy out of harnessed micro black holes on earth, and at least the violation of the second law of thermodynamics.

    In that case it would change the way how we live profoundly.

    For instance we could stop carbon burning and influence the large scale weather and thus the climate, especially for desert locations and typhoon cradles.

    See: the impossible zero point electric black hole.

  3. Lubos,

    Do you think that physics will contribute to some sort of rigorous understanding of human consciousness in the foreseeable future? Max Tegmark has recently broached the subject, and Feynman comments in his lectures, page 41-11 in my edition, that the Schrödinger equation is likely a good starting point. A view of humans as an information processing system with our mind’s neurons as a 3D binary array doing the processing seems very intuitive and ripe for mathematization to me. Indeed, there seem to be some researchers using dynamical systems methodologies to such ends already. I would be very interested in your views in this area.

  4. Thanks for sharing the video. It's awesome. I personally love this type of twist of mind, another way to look at things, a new angle... no stones unturned if one wants to find the truth right?

  5. Alex does exercise the empirically optimum trading strategy - uninformed random, arXiv:1303.4351v4

  6. The stock trading thing gives it away. only gypsies think that buy low sell high is a strategy

  7. "He marvelously claims that when this program is connected – in a way he doesn't specify – to the information about the stock prices, it starts to trade stocks and produce growing profits without being told about the "right goal"."

    If he believes that then he is obviously not very intelligent. Maybe he once was but he lost it along the way. (Some people used to say the same about me.)

  8. I can not make much out of this strange "gradient force law" for intelligent behavior ...
    But this TRF article is fun to read and I would like to be automatically equipped with the full knowledge of string theory ... :-D!

    Scientification and Mathematization would be what could make me interested in some otherwise too soft topics too, but this strange gradient force law does not convince me.

  9. Reminds me of Feynman's law of the Universe
    U =0. He was being ... Well Feynman. U is the unworldliness so that you can put all our laws of physics and whatever else on the right side so long as unworldliness vanishes. Read it first I think in one of his popular works but can't recall where.

  10. "If you think about the word "purpose" and if you realize that this is a favorite religious theme in the context of "God's plans", you can't be too surprised that Aristotle became a darling of the medieval Catholic bigots who have hijacked the education system and introduced the system of mindless mass indoctrination by worthless, dogma-based pseudoknowledge. This indoctrination was known as "scholasticism"."

    You gotta admit, Lubos don't pull his punches. That's probably one of the reasons I like him, even on those occasions when I might disagree. I don't have an opinion on this one however. Got to study up first. :)

  11. LOL, I liked U=0, too. But Alex's law is sufficiently more specific so that it may be falsified and a paper does claim that it is easy to falsify it as it is wrong.

  12. I apologize if the direct secular yet conservative approach looks harsh or unusual. It is unusual in the U.S. but it is pretty dominant in large intellectual quarters of my homeland of infidels.

  13. This is actually in Volume II of the Feynman Lectures, Section 25-6:

    But note that Feynman is being very clever. He introduces the silly U=0 in order to distinguish its trivial simplicity from the deep simplicity of Eq. 25-29, which he relates back to the experimental observation of Lorentz invariance. Very clever, very deep.

  14. Tom, thank you very much for the reference to Feynman related to this topic. I thought I was reasonably well acquainted with the Feynman lectures, but I would have bet a significant amount of money that RPF did not speculate about God anywhere in the lectures. I would have been wrong. He ends that chapter with a wonderfully insightful bit of exposition.

  15. Thanks. I hadn't looked at that since college in 74 I think. Very good so it stuck deeper than I realized.

  16. Lubos,

    Another scientist, Dr. Marcus Hutter, has a recent popular article called "To Create a Super-Intelligent Machine, Start with an Equation." The link to the PDF on his site wasn't working, so please review this arxiv link: . This article should capture some of the same data.

    Their definition of intelligence relies on the incomputable Kolmogorov Complexity and Solomonoff Probability. These incomputable items may be more to your liking for capturing the "ineffableness" of intelligence. They say that intelligence is linked to data compression, knowing the length of a short program to compute an object.

    Onto speculations. I have to ask, do you think that any stringy methods will ever shed light on better data compression algorithms? I am thinking back to Dr. Jonathan J. Heckman's paper last year "Statistical Inference and String Theory" which talks about a collective of agents' decision making process being linked to a non-linear sigma model.

    Hopefully, someday, all of the string theorists, the statisticians, the deep neural network experts, and the computational neuroscientists can collaborate.


  17. Dear Lubos,

    Wasn't sure where to ask this, but could you write an article explaining what virtual particles mean in the correct interpretation of quantum physics?

    I ask because I hit a weird article by Vongehr claiming that they are interactions from other universes, which strikes me as definitely wrong. But I don't have an alternate understanding of the phenomena myself that would allow me to know why it's wrong. He says that these virtual particles are needed to understand why particles decay over time with a predictable probability. Why do isolated systems actually decay? Is it because of these virtual particles after all?

    The article is here if you are interested:

    I appreciate any response, and sorry for being off-topic, but again I wasn't sure where the right place to ask was.

  18. Yes, I was going to mention Roger II next, now you've saved me the trouble. And I had already mentioned Frederick II in connection with the perpetual state of subjugation of the Jews that he had adapted from Pope Gregory IX's decretals and that he in turn passed on to future Holy Roman Emperors. Your insights on Polish history are especially welcome, as my knowledge there is particularly weak.

    When people praise the changes brought on by the Internet, they sometimes mention "dis-intermediation", i.e., the possibility to "cut out the middleman" by learning and communicating directly. This is not wrong, but I would add to that a new kind of "intermediation". When amateur chess enthusiasts encounter a chess grandmaster, they often want to sit down immediately and play a game, not in hopes of beating the GM but for the sake of the learning experience and inspiration. For the GM, however, it is not a productive use of his or her time to wait minutes (half an eternity) for the amateur's next move while only needing a few seconds to decide on his or her own next move.

    The solution is a simultaneous chess match! A dozen or more amateurs can each be sitting at a table pondering their move, while the GM walks round, takes in the situation at a glance, moves, and then continues to the next opponent.

    For a slow thinker and even slower writer such as myself, the Internet is the perfect intermediary that de-synchronizes my personal time with yours and that of our host, allowing us to conduct meaningful conversation more efficiently than we would be able to in "real time".

  19. From my experience many girls had better grades than me in elementary school (even in math) and I was a well behaved kid (at least in front of teachers). However they also gave it more effort than me, I never opened a textbook at home, mostly just winged it. The same approach yielded a bit better results in grammar school. In college I was already a level above them and their understanding of math seemed childish. With some effort they couldn't keep up.

    I've always thought it's because of memorisation combined with greater effort which doesn't work as well when things get more complex.

  20. Here's my physical theory of intelligence:

    Definition: Intelligence is the ability to process information efficiently

    How to measure efficiency? One way would be to measure the amount of Landauer principle heat generated by the computer or neural network (of course, at the moment the amount of Landauer heat is minuscule compared to the heat caused by the 'machinery' of the computer, so instead the efficiency can be calculated).

    How could this be used? I think this efficiency principle could be used as a 'goal' in training a neural network.

  21. The question is, why is it easier to invent mechanisms that can model dumbness rather than intelligence?

  22. Hi - I think the human notion of intelligence is too complex and cannot work from the top down, but you have to ask what is its true simple biological purpose, where I completely agree with your sentence: intelligence is the ability to predict what happens in your surroundings - not in the physical way but purely for survival - find food and don't get killed doing so. The fundamental reason intelligence has evolved is not to increase the number of possible future states but to achieve only two: eat while not being eaten - humans have changed their environment so far that they can waste energy on playing ping-pong or trading stocks, but those are only playful side effects - to make an intelligent machine you have to start with basics and make it hunt and be hunted....


    but, but, I haven't a degree in anything, and consider myself a humble process tech, but have maintained a healthy bullshit detector throughout my life.

    have a nice day

  24. Wouldn't intelligence, or life in general, be considered not a natural state to a computer which can only know entropy?

  25. As soon as entropy was mentioned, Ross Quinlan's ID3 algorithm came into my mind, from the good old times of the "general problem solver". Old wine in new bottles.
    Maybe he proves his concept by choosing a domain with a comparably small environment of rules, chess for instance.
    Anyway, the whole thing smells like Cold Fusion in a way, and I bet it won't "cook" as well.

  26. Sort of off topic but

  27. Eelco Hoogendoorn, Feb 8, 2014, 6:41:00 PM

    TL;DR: the number of degrees a person has proves only one's ability to jump through hoops, and has no necessary connection to any notion of intelligence.

    TL;DR 2: this is the horrible franken-offspring of handwavy continental philosophy with a major case of physics envy

  28. Eelco Hoogendoorn, Feb 8, 2014, 7:12:00 PM

    It has always struck me that to 'excel' in terms of grades at any level of schooling, the most important abilities are to be able to sit still and regurgitate. I've never felt any particular envy at the fairer sex for outclassing mine on that front. That the system rewards questionable qualities isn't news to me; but I've always written off my personal perception that girls were given an easier pass as a bias in my perception (that said, I have quite often copied bullshit assignments word for word from girls, and scored consistently lower). To see this so directly quantified shocks me a bit still though.

  29. I think that this is not a local, but a global (or Western) effect. It exists in Canada as well, particularly in primary schools. Most of these teachers are female.
    Girls are generally neater, quieter, and more obedient and willing to please than boys. These behaviours are rewarded, and any willful or recalcitrant (original) behaviour is punished. Now females are dominating the universities in numbers, and, for example, a typical medical class is around 65% or more female.

    This imbalance is causing problems because women (for good reasons--families etc) work fewer hours (in most cases, many fewer) than men, and there is an acute doctor shortage.

  30. Oh no. Not another Drake "equation"!

    I met a Professor in Germany who instantly recognized the concept of "doof-studiert"; of continuing education as a path to stupidity. The result is either trying to fit everything into the boxes about which one has learnt; or being blind to everything outside of those boxes.

    In Australia, it's sometimes referred to as Degrees of Stupidity. There are many intelligent people who remain stupid, regardless of the certificates which they have acquired "academically".

    It's not that study makes one stupid; it's the process of unwittingly being "bottled up" by the studies to the exclusion of everything that doesn't "fit".

  31. He biases his tests in a subtle way by setting up the problem to be too straightforward. The problem 'if a stock is trading in a range, find the optimal trading rule' is such an example. Given the premise, a Bayesian technique can find the optimal solution, but the problem is that many stocks look like they trade in ranges but don't always, and in those cases where the stock breaks out you lose a lot of money. The real question is finding a program that can predict what assets will trade in a range, and those programs fail quite predictably (all overfit, subtly of course).

  32. LOL, it reminds me of my dad who is clearly assuming trading in a range at all times. After his stock is down about 20% from the price where he bought, he still calls it "now it's down; it's surprising it's not back up yet".

    Impossible to explain that this is not necessarily the right theory about the motion.