Friday, May 15, 2015

### Fun with ImageIdentify.com

Stephen Wolfram and his folks have launched another rather incredible website, one that tells you what a given photograph contains.

Wolfram Language Artificial Intelligence: The Image Identification Project (some official blog comments by SW)

I've tried lots of pictures. A photograph of the Pilsner Tower – the St Bartholomew Cathedral in Pilsen – was correctly named a "church". The luxurious apartment building "The Ehrlich Palace" was guessed to be a "hotel". Two adults with two kids looking at the camera are a "person".

A Wiener schnitzel with potato salad turned out to be "food".

Amusingly enough, a fat girl on the train who was just painting her nails or something like that was identified as a "vertebrate". LOL.

Four cars in the street are named a "vehicle". The bottom part of a toilet is a "vessel". A Lumia smartphone screenshot is a "device", and so is a favicon with a "pi" letter. A field with a forest on a hill ("Chlum") from which the longest Czech tunnel will emerge at the end of the year is a "sand trap" (in more detail, it is described as "a hazard on a golf course").

Some of the results don't look so great. A photograph of a page from a paper notebook with some quantum gravity scribbles seems to be "instrumentation", and so is an Australian 10-dollar banknote. A U.S. map with blue/red election results with the "Bush country" caption is an "artifact" which is pretty disappointing. An even worse result is a photograph of the Charles River with Boston behind it – the Hancock and Prudential Towers, the most typical picture of Boston you may imagine. ImageIdentify.com thinks that Boston is a "hard hat".

A picture of Nima Arkani-Hamed that I took in his fancy Harvard office turned out to be a "secretary". Sorry, Nima! ;-)

A swan carousel turns out to be an "aircraft". ;-) A bride kissing the bridegroom is a "hope chest" (probably because a bride and a hope chest appeared together on a different photograph).

The identification could perhaps be compared with that of a 3-year-old kid – and one still feels that the Wolfram server uses much more "brute force" and much less "natural intelligence" than the kid – but it's progress towards artificial intelligence, anyway.

I have a feeling that this Wolfram Language function could be enough to "crack" some of the new replacements for CAPTCHA that appear e.g. at Backreaction.blogspot.com, where I had to choose which 3 pictures out of 9 food photographs are "sushi".
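Once every picture has a number attached to it, the "choose 3 out of 9" step is just a sort. Here is a minimal Python sketch of that step only. The function name `pick_top_k` and the scores are made up for illustration; they are not the output of ImageIdentify or of any real classifier, and how one would obtain such scores is left open.

```python
from typing import Dict, List


def pick_top_k(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the names of the k highest-scoring pictures, best first."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort the picture names by their score, highest score first.
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:k]


# Hypothetical "how sushi-like is it" scores for 9 food photographs.
example_scores = {
    "img1": 0.05, "img2": 0.91, "img3": 0.12,
    "img4": 0.40, "img5": 0.88, "img6": 0.03,
    "img7": 0.76, "img8": 0.20, "img9": 0.33,
}

chosen = pick_top_k(example_scores, k=3)
print(chosen)
```

The hard part is of course producing scores that are good enough, which is exactly where a coarse answer like "food" would not help.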

The technology able to identify the pictures is now available as the ImageIdentify[...] function in the Wolfram Language. The language already contains rather complex functions. Sometime in the future, people will communicate in the Wolfram Language, too. ;-)

In related developments, the Chinese IT company Baidu has beaten Google's record score (and Microsoft's almost equal one) in an image identification test. I guess that those big companies run more powerful code than the Wolfram Language but you can't play with everything they have, at least I can't.