paranumeral: May 2011

Friday, May 20, 2011

Performance running total

Added a running total of the % return over market (see the TotDiff% column). Since the history is in reverse chronological order, the top entry shows the current total (as of yesterday's close). So paranumeral is currently trailing the DJIA by 3%. The largest contributor (after trimming the history to just the latest model's entries - deployed 5/12/11) is CXW with -10.6%. It turns out Pershing Square Capital dumped 3.5M shares on the 16th. Hopefully CXW will correct over time.
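For the curious, here is a minimal sketch of how a running total can be laid out so the top (newest) row carries the full total. It assumes the total is a plain cumulative sum of each entry's difference versus the market, which may not be exactly how the site computes TotDiff%. The function name is made up for illustration.

```python
def running_totals_newest_first(diffs_newest_first):
    """Cumulative sum of per-entry diffs, returned newest first.

    Assumption: the running total is a simple sum accumulated from the
    oldest entry to the newest, so the first element is the grand total.
    """
    totals = []
    running = 0.0
    # Walk from the oldest entry to the newest, accumulating as we go.
    for diff in reversed(diffs_newest_first):
        running += diff
        totals.append(running)
    # Flip back so the list lines up with the newest-first history.
    totals.reverse()
    return totals


if __name__ == "__main__":
    # Made-up diffs, newest entry first.
    print(running_totals_newest_first([1.0, -2.0, 0.5]))
```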

Thursday, May 19, 2011

Splits

Noticed the following

5/6/2011 11:44:37 PM | hlf | 0.6459572 | 14152.50 | 104.84 | 14159.80 | 53.29 | 0.05 | -49.17 | -49.22

and realized that it is due to Herbalife doing a 2-for-1 split on May 18th, 2011. Paranumeral fetches just the price deltas (as in just yesterday's prices) to minimize traffic, but it seems that leaves us with a problem with splits. So I'll either do the adjustment myself (complex) or fetch all historical prices to ensure they're adjusted for splits. In the meantime this affects total performance very directly and significantly - splits are huge changes in price! Furthermore, this issue plagues model building by adding lots of noise. So I have my work cut out.
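To give an idea of what "doing the adjustment" could look like, here is a rough sketch that back-adjusts stored closes by dividing every price dated before a split by the split ratio. This is not the code running on the site; the function name and data layout are invented for illustration, and the dates attached to the two HLF prices are assumptions (only the split date and the prices themselves come from the entry above).

```python
from datetime import date


def adjust_for_splits(prices, splits):
    """Return closes adjusted for splits.

    prices: list of (day, close) tuples.
    splits: list of (split_day, ratio) tuples, ratio 2.0 for a 2-for-1.
    Every close dated before a split day is divided by that split's ratio.
    """
    adjusted = []
    for day, close in prices:
        factor = 1.0
        for split_day, ratio in splits:
            if day < split_day:
                factor *= ratio
        adjusted.append((day, close / factor))
    return adjusted


if __name__ == "__main__":
    # HLF prices from the entry above; the dates are assumed.
    hlf = [(date(2011, 5, 6), 104.84), (date(2011, 5, 18), 53.29)]
    hlf_splits = [(date(2011, 5, 18), 2.0)]
    fixed = adjust_for_splits(hlf, hlf_splits)
    start, end = fixed[0][1], fixed[-1][1]
    print(f"adjusted start {start:.2f}, return {100 * (end / start - 1):.2f}%")
```

With the starting price halved, the HLF move becomes a small gain instead of a 49% loss. The complexity comes from everything around this: learning about splits in the first place, and handling dividends and other corporate actions.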

Monday, May 16, 2011

Performance Tracking

Over the last few days I added performance tracking facilities to the website. You can view your own or the system's prediction performance history (currently the latest 20 entries). There's much to improve but it is a good start. One thing to keep in mind is that the latest prices are from the previous business day. Today's predictions are not shown for that reason.

Friday, May 13, 2011

Whooaa - what happened to the numbers!!? AGAIN

Is that your final answer? Yes. Maybe.

After much toying with different models and different model responses, the current (and hopefully final) prediction response is the Geometric Daily Return Average. Well, more precisely, the difference between that average for the symbol and for the market. The averaging is over the prediction horizon, which is currently the 3 months from the financial statement date - not today and not when it was filed, but rather the statement period ending date. The return is not in percent terms, so multiply by 100 (until I do so in code).
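As a sketch of the response described above: the geometric average daily return over N days is (end price / start price)^(1/N) - 1, and the response is that figure for the symbol minus the same figure for the market. The function names are made up, and the sketch assumes daily closing prices covering the same horizon for both series; the actual code on the site may differ in details.

```python
def geometric_daily_return(prices):
    """Geometric average daily return over a series of daily closes."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    if prices[0] <= 0 or prices[-1] <= 0:
        raise ValueError("prices must be positive")
    n_days = len(prices) - 1
    return (prices[-1] / prices[0]) ** (1.0 / n_days) - 1.0


def excess_geometric_return(symbol_prices, market_prices):
    """Symbol's geometric daily return minus the market's (not in percent)."""
    return geometric_daily_return(symbol_prices) - geometric_daily_return(market_prices)


if __name__ == "__main__":
    # Made-up prices: the symbol gains 10% a day, the market is flat.
    symbol = [100.0, 110.0, 121.0]
    market = [100.0, 100.0, 100.0]
    # Multiply by 100 to read the result as a percentage.
    print(100 * excess_geometric_return(symbol, market))
```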

I believe this is much better than any scale, no matter how granular, let alone the outperform/perform/underperform one. Yes, predicting the expected return is crazy impossible, so the system will be used very much as a classifier along the lines of outperform/perform/underperform (basically go long, ignore or go short vs. the market). Regardless, there's power in that bit of information.

Tuesday, May 10, 2011

Follow us on Twitter

Paranumeral is now on Twitter.

Each morning predictions are made using new statements and prices fetched the night before. Only those over a certain magnitude (absolute value of the prediction) and with symbols/tickers not ending in .OB or .PK get tweeted. That is so no one gets flooded with tweets about predictions which are basically "market perform" or with tweets about tiny caps.

Since no magnitude works for everyone, and just because hashtags are cool, the system bands and hashtags the predictions by prediction magnitude so folks can filter out stuff. Tickers are also turned into hashtags. Maybe I'll add or change the current tags to the standard #outperform, #underperform, #perform etc.
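Roughly, the filtering and banding could be sketched like this. The threshold, the band boundaries and the tag names below are all hypothetical placeholders, not the values the system actually uses; only the .OB/.PK rule and the use of the absolute value come from the description above.

```python
MIN_MAGNITUDE = 0.001  # hypothetical cutoff, not the real one

# Hypothetical bands: (lower bound on |prediction|, hashtag), largest first.
BANDS = [(0.005, "#band3"), (0.002, "#band2"), (0.0, "#band1")]


def should_tweet(ticker, prediction, min_magnitude=MIN_MAGNITUDE):
    """Skip .OB/.PK tickers and predictions that are too close to zero."""
    if ticker.upper().endswith((".OB", ".PK")):
        return False
    return abs(prediction) >= min_magnitude


def band_tag(prediction):
    """Pick the hashtag of the first band whose lower bound is reached."""
    magnitude = abs(prediction)
    for lower, tag in BANDS:
        if magnitude >= lower:
            return tag
    return BANDS[-1][1]  # unreachable for real numbers; kept for safety


def format_tweet(ticker, prediction):
    """Turn the ticker into a hashtag and append the band hashtag."""
    return f"#{ticker.upper()} {prediction:+.4f} {band_tag(prediction)}"


if __name__ == "__main__":
    for ticker, prediction in [("hlf", 0.01), ("abcd.ob", 0.02), ("cxw", 0.0002)]:
        if should_tweet(ticker, prediction):
            print(format_tweet(ticker, prediction))
```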

Tuesday, May 3, 2011

New models

I have been actively building, analyzing and deploying new models based on the new infrastructure. In order to easily swap models in and out, the feature calculation engine is now running on all cylinders - calculating all 650 features - and running a lot slower than before. Some models used few of them; some - like the current one - used a lot.

Speaking of the current model: it is based on roughly 60,000 company statements, both quarterly and annual, spanning 15 years, from December 31st, 1995 to December 31st, 2010. It makes good use of a couple of hundred features, mainly past relative price patterns and variations on dozens of measures of company strength versus peers.

The expanded date range (compared to previous models built on 5 years) provides the model with data points from various "markets" - bulls, bears, bubbles etc.

I ended all CAPS predictions so as to start fresh with these new models. I will most likely add functionality to the website that does similar tracking, so I don't have to spend as much time entering predictions manually in Motley Fool CAPS.