Saturday, March 04, 2006

Bible Studies

I've been reading John Dewan's The Fielding Bible, and if you're reading this site, you should too. There are some great features here, and some answer questions that have been asked in many a fielding article at Baseball Think Factory.

I'll start with zone rating, my favorite defensive measure, not because I'm convinced it's the best but because it's freely available at any point in the season. Zone rating is simply the percent of chances a player converts in the zones where at least 50% of balls in play are typically fielded. If a player fields a ball outside his zone, it's added to the numerator and denominator, but if he doesn't make a play it doesn't factor into zone rating at all. This can be a problem. Consider this example: Player A fields a ball in his zone, then watches a ball outside his zone go through to the outfield. Player B makes a great play just outside his zone, but the next ball, hit right at him, he boots. Player A has a zone rating of 1.000, or 1/1. Player B, having converted one out on the exact same opportunities, has a .500 zone rating, 1/2.
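To make that asymmetry concrete, here's a minimal Python sketch of zone rating as I've described it. The function name and the (in_zone, fielded) encoding are my own invention for illustration, not anything STATS publishes.

```python
def zone_rating(balls):
    """Zone rating from a list of (in_zone, fielded) pairs, one per ball in play.

    The pair encoding is an assumption made for this illustration.
    """
    plays_made = 0
    chances = 0
    for in_zone, fielded in balls:
        if in_zone:
            # Every ball in the zone is a chance, fielded or not.
            chances += 1
            plays_made += 1 if fielded else 0
        elif fielded:
            # An out-of-zone play is added to numerator and denominator.
            chances += 1
            plays_made += 1
        # An out-of-zone ball that is not fielded is ignored entirely.
    return plays_made / chances if chances else None


# Player A: fields one in his zone, watches one outside his zone go through.
player_a = [(True, True), (False, False)]
# Player B: great play just outside his zone, then boots one hit right at him.
player_b = [(False, True), (True, False)]

print(f"{zone_rating(player_a):.3f}")  # 1.000
print(f"{zone_rating(player_b):.3f}")  # 0.500
```

Same two balls, same one out, and the ratings are a full .500 apart.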

Zone rating could be more informative if balls outside your zone are considered separately, and that's what The Fielding Bible does in Revised Zone Ratings. While these are fun to look at, and exactly what I hoped zone rating would do, they don't quite match up to the zone ratings published by STATS. I took the Fielding Bible zone ratings, recalculated them by adding plays outside of zone back in, and compared them to STATS. While most players are within a few percentage points, there are some real outliers. Russ Adams comes in dead last by STATS at .779 but is .834 in Revised. Derek Jeter, on the other hand, drops from .831 to .803, and Khalil Greene drops from .859 to .821. It's the same methodology, looking at the same players in the same season. They should match up better than that, and until they do people will continue to distrust defensive statistics.
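For anyone who wants to repeat the comparison, the recalculation is roughly the following. This is a sketch under my reading of the two definitions: it assumes you have in-zone plays, in-zone balls, and out-of-zone plays as separate counts. The function name and the sample counts are made up for illustration and are not any real player's line.

```python
def combined_zone_rating(plays_in_zone, balls_in_zone, plays_out_of_zone):
    """Fold out-of-zone plays back into a single STATS-style zone rating.

    Out-of-zone plays go into both the numerator and the denominator,
    matching the traditional zone rating definition described earlier.
    """
    total_plays = plays_in_zone + plays_out_of_zone
    total_chances = balls_in_zone + plays_out_of_zone
    return total_plays / total_chances


# Hypothetical shortstop: 300 plays on 375 balls in zone, 40 plays out of zone.
in_zone_only = 300 / 375
combined = combined_zone_rating(300, 375, 40)

print(f"{in_zone_only:.3f}")  # 0.800
print(f"{combined:.3f}")      # 0.819
```

Note that every out-of-zone play pushes the combined number up, since it is a guaranteed 1-for-1.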

Why don't they match up? There are two possible reasons:

1) Scoring - STATS and Baseball Info Solutions (the data behind the Fielding Bible) are looking at the same plays and scoring them differently.

2) The zones. This is more likely the culprit. According to this grid used for STATS, there are 8 x 22 or 176 places where a ball can land. In The Fielding Bible (pg 9) there are about 260, though there is no map showing exactly how they are structured. I would lean towards trusting the system with the more detailed data, but I really have no idea which is closer to the truth.

The zone rating part is just a small part of the book. The rating used more often is the plus/minus rating, which is pretty much the same thing used by Mitchell Lichtman (MGL) in UZR. The difference is in the details, and I don't know all the details used in UZR. MGL upgrades it from time to time, so an old article explaining the process may be obsolete.

For plus/minus, if there is a .26 chance that a ball hit in a given area is fielded, the player is charged .26 plays if it's not fielded, and credited .74 if it is. If another player fields the ball, there is no penalty for the first player (a good thing IMO). Consideration is given for type of ball hit (liner, fly, grounder, bunt) and speed (hard, medium, slow) as well as positioning (1B having to hold bag with runner on first, hit & run for middle infielders).
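The per-ball bookkeeping can be sketched in a few lines of Python. This is only my reading of the rule as stated, with the ball type, speed, and positioning adjustments already baked into the probability. The function name and the outcome labels are hypothetical, not from the book.

```python
def plus_minus_credit(p_fielded, outcome):
    """Plays credited to a fielder for one ball in play.

    p_fielded: chance (0 to 1) that a ball like this one is fielded.
    outcome:   "made", "missed", or "other_fielder" (labels assumed here).
    """
    if not 0.0 <= p_fielded <= 1.0:
        raise ValueError("p_fielded must be between 0 and 1")
    if outcome == "made":
        # Credit grows as the play gets harder.
        return 1.0 - p_fielded
    if outcome == "missed":
        # Penalty grows as the play gets easier.
        return -p_fielded
    if outcome == "other_fielder":
        # Someone else converted the out, so no penalty.
        return 0.0
    raise ValueError(f"unknown outcome: {outcome!r}")


print(f"{plus_minus_credit(0.26, 'made'):+.2f}")    # +0.74
print(f"{plus_minus_credit(0.26, 'missed'):+.2f}")  # -0.26
```

Sum those credits over a season and you get a number in plays above or below average.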

Some of the results are very surprising and far off from the UZR numbers that have been published, such as Steve Finley rating as the #1 centerfielder for the 2004 season. After he was signed by the Angels, MGL posted Finley's UZR numbers, and they were horrible. I watched that horror unfold in the LAnaheim outfield last summer, so I know which data set I'm picking if forced to choose. At least both systems agreed he was pretty bad in 2005, and most of the numbers pass the smell test.

Robinson Cano was at -27 (in plays, not runs). I thought he was closer to average in UZR, though I can't remember. He's equally bad at going left or right, and average on balls hit right at him. After seeing he's gained 20 pounds since last year, all I can say is good luck with that, Yankees. Manny Ramirez was only -14, which is pretty easy to live with given his bat, unlike the -30 to -50 ratings from a few other systems.

While I have no idea if plus/minus is an improvement on UZR or even its equal, I can appreciate the level of detail. You don't just get one number, you get home/road splits, splits for lefty and righty pitchers, and going left/right/dead on for infielders. There's a lot of good info here if you can make sense of it, and I haven't even begun to grok it yet.

Finally, there's the "where hits landed" charts. While you would need to add context to make any analysis of this (like how many balls were hit to each location) they provide a simple check. Jeter and Cano have horrible ratings? Well, the Yankees gave up a ton of hits up the middle, in the 3B/SS hole, and the 1B/2B hole. The only part of the infield that wasn't a hole was the 1B line, thanks to Tino, the God of Clutchness. For the Red Sox, they gave up a lot of hits in the LF gap and down the LF line, supporting evidence for ranking Ramirez as a Manny-type defender.

They also gave up 42 hits off the wall, where the average team only gave up 10, and the reasons for this are as obvious as a 40 foot tall monster breathing down your neck. My guess, though I can't be certain, is that most defensive systems were counting hits off the wall against Manny, and they obviously shouldn't be. He's bad enough as it is.

The plus/minus system is in terms of plays made, and the enhanced version in terms of bases. There is no calculation of runs saved, though pg 12 mentions that this will be done in the future, and that the value of a single is "a little less than half a run." STOP RIGHT THERE! ALERT THE SABERMETRIC POLICE! I hope in the next year they read this, or somebody tells them that preventing a single also creates an out, and the single and out together are worth more like .773 runs, according to The Book, which I have also just started to read.
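To see how much the choice matters, here's a back-of-the-envelope conversion in Python. It is only a sketch: it treats every play as a single turned into an out, which is a simplification (some outfield plays save extra bases), and it uses .5 as a ceiling for "a little less than half a run." The function and constant names are mine, and the book itself does no such conversion.

```python
SINGLE_PLUS_OUT = 0.773  # runs per play, the figure cited above from The Book
SINGLE_ONLY = 0.5        # upper bound on "a little less than half a run"


def plays_to_runs(plays, runs_per_play=SINGLE_PLUS_OUT):
    """Convert a plus/minus total in plays to runs at a flat per-play value."""
    return plays * runs_per_play


# Plus/minus totals quoted earlier in this post.
for name, plays in [("Cano", -27), ("Ramirez", -14)]:
    with_out = plays_to_runs(plays)
    without_out = plays_to_runs(plays, SINGLE_ONLY)
    print(f"{name}: {with_out:.1f} runs vs {without_out:.1f} runs")
# Cano: -20.9 runs vs -13.5 runs
# Ramirez: -10.8 runs vs -7.0 runs
```

Leaving out the value of the out understates every fielder's impact, good or bad, by more than a third.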

That's it for now, I've got to get back to reading the Bible.