Sunday, December 03, 2006

Projection system championships

How does the CHONE system compare to other systems?

I compared the 2006 CHONE projections to ZIPS, the system Dan Szymborski publishes on Baseball Think Factory. I've only looked at hitters. The typical method is to compute the correlation between projected and actual values of some summary-level stat, such as OPS, EQA, or RC/G. In this case I'm using OPS.

Other systems worth checking into are Marcel the Monkey (no relation), PECOTA, the Bill James Handbook, and Ron Shandler's Baseball Forecaster. In addition, it seems like everyone who publishes a fantasy baseball guide has a system. Each system should beat Marcel, the simplest. If you've got a statistical process that can't spank the Marcel monkey, then all your formulas and algorithms amount to nothing more than mathematical masturbation. I think Marcel's creator, Tango Tiger, has said that the best correlation a projection system can hope for is 70%, Marcel gets you 60%, and the advanced systems are around 65%. Or it's 75/70/65, or something.

But I'm not sure what this really means, as you have to put a playing time cutoff somewhere. The higher your cutoff, the better your correlation should be, since a hitter's actual OPS is less noisy over more at-bats.
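To make the cutoff idea concrete, here's a rough Python sketch of the kind of comparison I mean: keep only the hitters at or above an AB cutoff, then take the plain Pearson correlation between projected and actual OPS. The function names and the sample rows are made up for illustration. They are not real projections or real 2006 stats, and this isn't necessarily how anyone else runs the test.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


def correlation_at_cutoff(players, min_ab):
    """Correlate projected vs. actual OPS for hitters with at least min_ab AB.

    players is a list of (actual_ab, projected_ops, actual_ops) tuples.
    """
    kept = [(proj, act) for ab, proj, act in players if ab >= min_ab]
    projected = [proj for proj, _ in kept]
    actual = [act for _, act in kept]
    return pearson(projected, actual)


# Hypothetical rows, invented purely to show the shape of the data:
# (actual AB, projected OPS, actual OPS)
sample_players = [
    (310, 0.720, 0.655),
    (345, 0.760, 0.801),
    (420, 0.805, 0.770),
    (455, 0.690, 0.712),
    (510, 0.850, 0.874),
    (560, 0.780, 0.766),
    (600, 0.910, 0.895),
]

for cutoff in (300, 400, 500):
    r = correlation_at_cutoff(sample_players, cutoff)
    print(f"{cutoff} AB cutoff: r = {r:.3f}")
```

Swap in a real list of hitters for each system and you get one number per system per cutoff, which is all the comparison amounts to.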

So I looked at ZIPS and CHONE to see how we did in the 2006 season.

Using 300 AB as a cutoff: ZIPS = .617, CHONE = .615
400 AB: ZIPS = .648, CHONE = .635
500 AB: ZIPS = .656, CHONE = .661

Really close, but ZIPS gets the edge, coming out ahead at two of the three cutoffs. I'll have to check whether either of us can pass the monkey test. I should be able to find the Marcel numbers on the internet. In addition, I can look at Shandler and James, though probably only for the 500 AB cutoff, as I have those only in print and it involves quite a bit of data entry.