Here’s a question I’m curious to answer: Which method better projects team defense, stat projections or scouting reports?
To answer the question, I need an objective criterion to measure against. I don’t think there is anything out there except team Defensive Efficiency Record (DER). If a team has very good defensive players, they should turn more balls in play into outs than average, and post a better DER. DER is not a perfect measure, though. The ballpark has an effect on it, as do the types of batted balls the team’s pitchers allow.
Here’s my method: Start with the league average DER and park adjust it. Then look at how many groundballs, flyballs, line drives, and popups the team allows. The formula I used, along with the batted ball data from the Hardball Times team page, is this: expected batting average allowed (xBA) = GB*.237 + FB*.16 + LD*.718 + pop*.01, with each batted ball type entered as a share of the team’s total, and xDER = 1 - xBA. Then xDER/league average is your batted ball adjustment factor.
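To make the arithmetic concrete, here is a minimal sketch of the batted ball adjustment. The hit rates come from the formula above; the team mix, the league average of .700, and the function names are made up for illustration, and I am assuming the denominator of the adjustment factor is the league average DER.

```python
# Hit rates per batted ball type, taken from the formula in the text.
HIT_RATE = {"GB": 0.237, "FB": 0.16, "LD": 0.718, "pop": 0.01}


def expected_der(shares):
    """Return xDER = 1 - xBA for a dict of batted ball shares that sum to 1."""
    total = sum(shares.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("batted ball shares should sum to 1")
    xba = sum(HIT_RATE[kind] * share for kind, share in shares.items())
    return 1.0 - xba


def batted_ball_factor(shares, league_avg):
    """Ratio of the team's xDER to the league average (assumed denominator)."""
    return expected_der(shares) / league_avg


# Hypothetical team mix and league average, for illustration only.
team_mix = {"GB": 0.44, "FB": 0.30, "LD": 0.19, "pop": 0.07}
team_xder = expected_der(team_mix)
factor = batted_ball_factor(team_mix, league_avg=0.700)
```

A factor above 1 means the team’s pitchers allow a mix of batted balls that is easier than average to turn into outs, before any credit goes to the fielders.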
Now, let’s look at the projected defensive stats of the players on your team. I’m using preseason projections, but prorated to the actual 2008 playing time. For the stat projections, mine were made available shortly after the 2007 season on my site here. For scouting reports, I’ve converted the results of Tango Tiger’s scouting report by the fans into run values.
I need to use custom weights of the report’s attributes for each position. For 1st basemen and outfielders I ignore the arm ratings. It’s not that they are unimportant, especially for outfielders, but they have nothing to do with how a fielder turns a batted ball into an out, which is what DER and most defensive metrics measure. For 2B, 3B, and SS, both arm and range determine how efficient a fielder is at recording outs. The weights used are:
1B: Instincts 44%, 1st step 44%, Speed 12%
2B: Instincts 22%, 1st step 22%, Speed 11%, Hands 22%, Release 11%, Strength 6%, Accuracy 6%
3B: Instincts 15%, 1st step 15%, Speed 10%, Hands 15%, Release 15%, Strength 15%, Accuracy 15%
SS: Instincts 18%, 1st step 18%, Speed 12%, Hands 12%, Release 18%, Strength 12%, Accuracy 12%
OF: Instincts 17%, 1st step 33%, Speed 33%, Hands 17%
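As a rough sketch of how these weights turn a fan report into a single rating per position, something like the following would work. The attribute keys, the function name, and the sample outfielder are hypothetical. Dividing by the sum of the weights is my own assumption, added so that a position whose listed weights do not total exactly 100 still lands on the same scale as the others.

```python
# Position weights from the list above, in percent.
WEIGHTS = {
    "1B": {"instincts": 44, "first_step": 44, "speed": 12},
    "2B": {"instincts": 22, "first_step": 22, "speed": 11, "hands": 22,
           "release": 11, "strength": 6, "accuracy": 6},
    "3B": {"instincts": 15, "first_step": 15, "speed": 10, "hands": 15,
           "release": 15, "strength": 15, "accuracy": 15},
    "SS": {"instincts": 18, "first_step": 18, "speed": 12, "hands": 12,
           "release": 18, "strength": 12, "accuracy": 12},
    "OF": {"instincts": 17, "first_step": 33, "speed": 33, "hands": 17},
}


def composite_rating(position, ratings):
    """Weighted average of a player's attribute ratings at one position.

    Attributes without a weight at that position (such as the arm ratings
    for 1B and OF) are ignored. Dividing by the weight total is an
    assumption that keeps every position on the same scale.
    """
    weights = WEIGHTS[position]
    total = sum(weights.values())
    return sum(w * ratings[attr] for attr, w in weights.items()) / total


# Hypothetical outfielder; the arm ratings are present but unused.
outfielder = {"instincts": 60, "first_step": 70, "speed": 80, "hands": 50,
              "release": 40, "strength": 90, "accuracy": 55}
of_rating = composite_rating("OF", outfielder)
```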
Next I find the average rating at each position – I just take the simple average of all players at their listed primary position. Then I find every player’s rating at each position, and convert to runs. For most positions, (rating – lg average)/2 gives you run values that appear to be on the same scale as the stat-based projections. For 1B, where a player gets fewer chances, replace the 2 with a 3, and use 1.75 for shortstop.
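The conversion to runs is simple enough to sketch in a few lines. The divisors are the ones given above; the function names and the sample ratings are invented for illustration.

```python
# Divisors from the text: 3 for 1B, 1.75 for SS, 2 everywhere else.
DIVISOR = {"1B": 3.0, "SS": 1.75}
DEFAULT_DIVISOR = 2.0


def position_average(ratings):
    """Simple average of the ratings of players listed at one position."""
    return sum(ratings) / len(ratings)


def rating_to_runs(position, rating, league_avg):
    """Convert a composite rating to runs relative to the position average."""
    return (rating - league_avg) / DIVISOR.get(position, DEFAULT_DIVISOR)


# Hypothetical shortstop ratings, for illustration only.
ss_ratings = [55, 62, 48, 71]
ss_avg = position_average(ss_ratings)
ss_runs = rating_to_runs("SS", 66, ss_avg)
```

The same 6-point edge over a position average of 60 would be worth 3 runs at third base but only 2 at first base, which reflects the smaller number of chances there.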
Next, I take both sets of projections, find out how many plays saved each team projects to, and modify the expected DER accordingly. For this exercise, in the interests of time, I only looked at players who played at least 250 innings in 2008. If a player did not have a stat projection at the position played, I used zero for his projection. I did the same for the fan projections, except that if a player was evaluated, I had a projection at any position he might play in 2008, regardless of whether he ever played there before.
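Here is one way the team-level step could be sketched. I am assuming the projections are already expressed as plays saved and prorated to 2008 playing time, and that each play saved adds one out to the team’s balls in play, so it moves DER by 1/BIP. The player names, the innings, and the 4,400 balls in play are all made up.

```python
MIN_INNINGS = 250  # playing time cutoff used in the text


def team_plays_saved(players, projections):
    """Sum projected plays saved for players with at least MIN_INNINGS.

    players: list of (name, position, innings) tuples.
    projections: dict mapping (name, position) to projected plays saved.
    A player with no projection at the position played counts as zero.
    """
    total = 0.0
    for name, position, innings in players:
        if innings < MIN_INNINGS:
            continue
        total += projections.get((name, position), 0.0)
    return total


def projected_der(expected_der, plays_saved, balls_in_play):
    """Shift the expected DER by the extra outs per ball in play (assumed)."""
    return expected_der + plays_saved / balls_in_play


# Hypothetical roster and projections, for illustration only.
players = [("A", "SS", 1200), ("B", "LF", 300), ("C", "1B", 100), ("D", "3B", 900)]
projections = {("A", "SS"): 8.0, ("B", "LF"): -3.0, ("C", "1B"): 5.0}

saved = team_plays_saved(players, projections)
team_der = projected_der(0.690, saved, balls_in_play=4400)
```

In this example player C falls under the innings cutoff and player D has no projection at third base, so only A and B move the team total.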
Here are the results:
Team statDER fanDER RealDER
ARI 0.686 0.688 0.687
ATL 0.685 0.688 0.695
BAL 0.706 0.707 0.691
BOS 0.695 0.694 0.700
CHA 0.694 0.696 0.688
CHN 0.701 0.706 0.706
CIN 0.687 0.688 0.674
CLE 0.695 0.692 0.686
COL 0.678 0.672 0.679
DET 0.706 0.701 0.686
FLA 0.689 0.691 0.694
HOU 0.694 0.692 0.699
KCA 0.694 0.691 0.691
LAA 0.696 0.699 0.693
LAN 0.692 0.697 0.693
MIL 0.694 0.696 0.700
MIN 0.690 0.690 0.690
NYA 0.694 0.697 0.684
NYN 0.690 0.691 0.699
OAK 0.703 0.698 0.701
PHI 0.691 0.688 0.696
PIT 0.679 0.680 0.676
SDN 0.698 0.695 0.697
SEA 0.704 0.706 0.683
SFN 0.688 0.685 0.686
STL 0.685 0.685 0.697
TBA 0.712 0.712 0.712
TEX 0.678 0.680 0.673
TOR 0.696 0.695 0.706
WAS 0.685 0.685 0.690
By RMSE (root mean squared error), the results are:
Stats: .0084
Fans: .0087
Going without defensive projections, using the ballpark and batted ball adjustment only (BB), we get an error of .0091.
Using correlation, the results are similar:
Stats: .552
Fans: .534
BB: .450
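Anyone who wants to check the stat and fan figures can do so from the table with a short script like the sketch below. The helper names are my own, and because the table is rounded to three decimals, the last digit can differ slightly from the numbers reported here. The BB baseline cannot be recomputed this way, since its team values are not in the table.

```python
import math

# Team, statDER, fanDER, RealDER, copied from the results table.
TABLE = """
ARI 0.686 0.688 0.687
ATL 0.685 0.688 0.695
BAL 0.706 0.707 0.691
BOS 0.695 0.694 0.700
CHA 0.694 0.696 0.688
CHN 0.701 0.706 0.706
CIN 0.687 0.688 0.674
CLE 0.695 0.692 0.686
COL 0.678 0.672 0.679
DET 0.706 0.701 0.686
FLA 0.689 0.691 0.694
HOU 0.694 0.692 0.699
KCA 0.694 0.691 0.691
LAA 0.696 0.699 0.693
LAN 0.692 0.697 0.693
MIL 0.694 0.696 0.700
MIN 0.690 0.690 0.690
NYA 0.694 0.697 0.684
NYN 0.690 0.691 0.699
OAK 0.703 0.698 0.701
PHI 0.691 0.688 0.696
PIT 0.679 0.680 0.676
SDN 0.698 0.695 0.697
SEA 0.704 0.706 0.683
SFN 0.688 0.685 0.686
STL 0.685 0.685 0.697
TBA 0.712 0.712 0.712
TEX 0.678 0.680 0.673
TOR 0.696 0.695 0.706
WAS 0.685 0.685 0.690
"""

rows = [line.split() for line in TABLE.strip().splitlines()]
stat = [float(r[1]) for r in rows]
fan = [float(r[2]) for r in rows]
real = [float(r[3]) for r in rows]


def rmse(pred, actual):
    """Root mean squared error between projected and actual DER."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


for label, proj in (("Stats", stat), ("Fans", fan)):
    print(f"{label}: RMSE {rmse(proj, real):.4f}, r {pearson(proj, real):.3f}")
```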
The fan report would probably do better if I used multiyear data to create the projections, but I’m not sure what the proper weights by year should be. Most likely, an even better projection could be created by some combination of stats and fan reports. I hope that the framework used here makes sense and can provide an objective measure for evaluating defensive projections.
I should note that this can only be used to evaluate defensive projections, not defensive ratings. By comparing to DER, a stat like my own TotalZone would probably look better than a far more detailed stat, such as UZR or John Dewan’s plus/minus. That is because TotalZone, once adjusted for ballpark and hit type, is pretty much the same as DER. Since the more detailed stats capture more fielding ability, and better strip out the luck, they should better predict next year’s DER than a simpler stat.