## Thursday, December 18, 2008

### Dow Jones similarity graph

Besides WeatherData, Mathematica 7 has another function that is crowded with data, namely FinancialData.

If you evaluate Length[FinancialData[]], the result is 189,616. This means that Mathematica knows the detailed history (and present status) of stock prices, exchange rates, and similar numbers for 189,616 companies, currency pairs, and their generalizations.

So I tried to change the "core function" of an algorithm to draw a similarity graph, a set of procedures that have already been applied to the picture of David Gross.

However, I was not connecting pieces of pictures but rather 30 companies in the Dow Jones index. This is the result for similaritygraph[2]:

Click the picture to get a larger screenshot that also contains similaritygraph[3] and some commands. Note the nice "supermarket/food" subgraph of 5 companies with Johnson & Johnson in the middle, surrounded by Procter & Gamble, Kraft Foods Inc., and Wal-Mart Stores, Inc. followed by McDonald's Corporation. It seems to make sense - probably more sense than the decoupling of IBM from the technological corner of Microsoft, Intel, etc.

I must tell you how I determined the "distance" between the companies. I took the daily stock prices from January 1st, 2007 through December 17th, 2008. The logarithms of all these prices were computed so that the overall scaling becomes irrelevant. All these graphs (lists of numbers) were supplemented with their mirror, anti-chronological copies, in order to obtain an even periodic function for each company and to be ready for the simplest - complex/complex - Fourier transform function available in Mathematica.
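
The preprocessing just described can be sketched in a few lines. This is a plain Python illustration rather than the Mathematica notebook; the helper name `even_extension` and the toy prices are hypothetical, not real market data.

```python
import math

def even_extension(prices):
    """Log-transform a price series and append its time-reversed copy."""
    logs = [math.log(p) for p in prices]
    # Appending the reversed series makes the periodic continuation even.
    return logs + logs[::-1]

# Hypothetical toy prices (not real market data).
toy = [100.0, 102.0, 101.0, 105.0]
extended = even_extension(toy)

# Multiplying every price by a constant only shifts the log series by a constant.
scaled = even_extension([2.0 * p for p in toy])
shifts = [b - a for a, b in zip(extended, scaled)]
```

Every entry of `shifts` equals log 2 up to rounding, which is why the overall scale of a stock price ends up entirely in the constant term.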

I calculated the discrete Fourier transforms of these functions, which are composed only of cosines. The zero mode, i.e. the cos(0t) constant function, was dropped because this is exactly what encodes the unphysical overall scaling of each stock price. The coefficients of the first 20 or so nontrivial cosines were taken seriously and interpreted as a 20-dimensional real vector linked to each company.
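
As a rough illustration of this step, the cosine coefficients can be computed directly in plain Python instead of through a complex Fourier transform. For a series mirrored by appending its reversed copy, the discrete Fourier transform equals these cosine sums up to a known phase factor and an overall normalization, so the sketch below skips the explicit mirroring. The function name `cosine_features`, the normalization, and the toy prices are assumptions and need not match the conventions of Mathematica's Fourier or of the notebook.

```python
import math

def cosine_features(prices, n_modes=20):
    """Cosine coefficients k = 1..n_modes of the log-price series.

    The k = 0 (constant) mode is skipped on purpose: it carries the
    overall scale of the price, which should not affect similarity.
    """
    logs = [math.log(p) for p in prices]
    n = len(logs)
    return [
        sum(x * math.cos(math.pi * k * (i + 0.5) / n) for i, x in enumerate(logs))
        for k in range(1, n_modes + 1)
    ]

# Hypothetical toy prices (not real market data).
toy = [100.0, 102.0, 101.0, 105.0, 107.0, 104.0, 108.0, 110.0]
features = cosine_features(toy, n_modes=4)

# Rescaling all prices leaves the nonzero modes unchanged.
rescaled = cosine_features([3.0 * p for p in toy], n_modes=4)
```

Here `features` and `rescaled` agree up to rounding, because a constant shift of the logarithms only contributes to the dropped zero mode.
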

See the Mathematica notebook.

The graphs above show the nearest and non-nearest neighbors in the 20-dimensional space.
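
One way such a graph could be built from the vectors is sketched below in plain Python. It assumes Euclidean distance and assumes that each company is linked to its k nearest neighbors; the post does not spell out either choice or what the argument of similaritygraph means, so the function `nearest_neighbor_edges` and the toy vectors are purely illustrative.

```python
import math

def nearest_neighbor_edges(vectors, k=2):
    """Undirected edges linking each item to its k nearest neighbors."""
    edges = set()
    for name, vec in vectors.items():
        # Sort all other items by Euclidean distance from this one.
        others = sorted(
            (math.dist(vec, other_vec), other)
            for other, other_vec in vectors.items()
            if other != name
        )
        for _, other in others[:k]:
            # Store each edge with sorted endpoints so A-B and B-A coincide.
            edges.add(tuple(sorted((name, other))))
    return edges

# Hypothetical 2-dimensional feature vectors (the post uses about 20 dimensions).
toy_vectors = {
    "A": (0.0, 0.0),
    "B": (1.0, 0.0),
    "C": (0.0, 1.5),
    "D": (10.0, 10.0),
}
edges = nearest_neighbor_edges(toy_vectors, k=1)
```

With k=1 the toy data gives the edges A-B, A-C, and C-D: even the outlier D is attached to the graph through its closest point.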

By the way, when this sentence was being written, this blog's right sidebar was showing oil price at \$36.56 per barrel which is less than 1/4 of the peak price just five months ago!

But for all those who are going to forget about prices, oil, blogs, and the Internet already today or tomorrow, Merry Christmas!