Tuesday, December 09, 2008

Uberstats

I've been working on an uberstat database, calculating wins above replacement for everybody in the Retrosheet era. This includes park-adjusted batting runs, which are based on custom linear weights, so the team baserun total adds up exactly to their actual runs scored. It was a bit trickier than it sounds, as I needed to remove pitcher hitting, or else I'd rate the AL players too low. This should be similar to the results of batting runs on baseball-reference, but sometimes there are decent-sized differences. If a player consistently played on teams that scored more runs than you'd expect given their batting stats, the player will get extra credit.

I also removed baserunning and GIDP runs - say a team is 30 runs above average on the bases, and scores a total of 750 runs. I don't want to double count the baserunning, so I figure the LW values of the singles, walks, homers, etc. as if the team scored only 720. That way the batting runs + baserunning = actual runs.
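Here is a minimal sketch of one way to make the batting runs reconcile to that reduced target. It is not necessarily how the database does it: the event weights and the team batting line are made-up illustrative values, and solving for the out value (rather than rescaling every weight) is an assumption.

```python
# Hypothetical per-event run values; the post does not publish its weights.
EVENT_WEIGHTS = {"1B": 0.50, "2B": 0.80, "3B": 1.10, "HR": 1.45, "BB": 0.33}


def fit_out_weight(team_events, team_outs, actual_runs, non_batting_runs):
    """Solve for the out value that makes team batting runs equal the target."""
    # Target is actual runs minus baserunning/GIDP runs, e.g. 750 - 30 = 720.
    target = actual_runs - non_batting_runs
    positive = sum(EVENT_WEIGHTS[e] * n for e, n in team_events.items())
    return (target - positive) / team_outs


def batting_runs(events, outs, out_weight):
    """Absolute batting runs for any batting line, team or player."""
    return sum(EVENT_WEIGHTS[e] * n for e, n in events.items()) + out_weight * outs


# Made-up team line, for illustration only.
team_events = {"1B": 1000, "2B": 280, "3B": 30, "HR": 150, "BB": 550}
team_outs = 4100
out_weight = fit_out_weight(team_events, team_outs, actual_runs=750, non_batting_runs=30)
team_total = batting_runs(team_events, team_outs, out_weight)
print(round(team_total, 1))  # 720.0
```

The same `batting_runs` call with a single player's line and the team's fitted `out_weight` gives that player's share, so the player totals sum to the team target.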

Baserunning includes steals and caught stealing, as well as tagging up, going first to 3rd, etc. It tries to be a comprehensive evaluation. GIDP runs is based on how many DP's you hit into, given your number of DP opportunities.
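The GIDP piece can be sketched as actual double plays compared with what an average hitter would have hit into in the same number of opportunities. The league rate and the run value per double play below are placeholders, not the figures used in the database.

```python
def gidp_runs(gidp, opportunities, league_dp_rate, runs_per_dp):
    """Runs saved (+) or lost (-) relative to a league-average DP rate."""
    expected_gidp = league_dp_rate * opportunities
    return (expected_gidp - gidp) * runs_per_dp


# Hypothetical hitter: 5 GIDP in 100 DP opportunities, with an assumed
# league rate of 11% and an assumed cost of 0.4 runs per double play.
saved = gidp_runs(gidp=5, opportunities=100, league_dp_rate=0.11, runs_per_dp=0.4)
print(round(saved, 1))  # 2.4
```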

For defense, there's TotalZone, outfield arms, and infield double plays. Plus catcher runs based on SB/CS, PB, WP, and errors.

All of these are converted to wins based on a custom runs per win figure for that league.

Position adjustment, per 150 games, is as follows (in wins): C +1.0, SS +.75, 2B/3B/CF +.25, RF/LF -.75, 1B -1.0, and DH -1.5.

Finally, the difference between average and replacement level. This varies by my league strength calculations, between 1.8 and 2.2 wins per 150 games. For the 50's and 60's the NL was better. In the 70's and 80's it was about even, and in the 90's and beyond the AL has taken the lead. All of this is based on relative performance of players who played in both leagues.
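Putting the pieces together, a rough sketch of the assembly looks like this. The function name and the sample inputs are illustrative assumptions; in particular the 9.0 runs per win is a placeholder, since the actual figure is custom for each league.

```python
def wins_above_replacement(runs_above_avg, runs_per_win, games,
                           pos_adj_per_150, repl_per_150):
    """Convert runs to wins, then add playing-time-scaled adjustments."""
    playing_time = games / 150.0
    wins_above_avg = runs_above_avg / runs_per_win
    return wins_above_avg + playing_time * (pos_adj_per_150 + repl_per_150)


# Hypothetical season: a center fielder (+.25 per 150 games) who is
# 20 runs above average in 150 games, with 2.0 wins of replacement level.
war = wins_above_replacement(
    runs_above_avg=20.0,   # batting + baserunning + GIDP + defense
    runs_per_win=9.0,      # placeholder, not a real league figure
    games=150,
    pos_adj_per_150=0.25,
    repl_per_150=2.0,
)
print(round(war, 2))  # 4.47
```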

The biggest surprise for me was Willie Davis topping the 70-win mark, which is a level of greatness. He played in an extreme pitcher's park in one of the toughest decades to put up numbers, but in his context he was an above-average hitter, about 100 runs worth. He stole a lot of bases, ran the bases well according to Retrosheet data, and rarely hit into double plays. His baserunning was worth another 100 runs, and he played outstanding defense in center field. He was about 350 runs above average, and in the 60's the runs-to-win conversion was pretty low. Add in replacement level for a National League that was superior to the AL at the time, and Willie gets 71.5 wins.

Pulling out my Win Shares book, I see Willie had 322, just behind Ron Santo (325) and Reggie Smith (324) and just ahead of Nettles, Trammell, Simmons, and Torre. So perhaps Willie deserves more mention among the greatest players overlooked by the Hall of Fame.

34 Comments:

At 6:36 AM, Blogger jinaz said...

I've made initial attempts at such a project a few times, but never saw it all the way through. Part of it's time, part of it's database know-how.

Would you be willing to share it (even just some of the output tables) at some point? I'd be pretty excited to get access to such a resource--even if it comes with some qualifications.

Thanks,
Justin

 
At 9:32 AM, Anonymous Anonymous said...

Definitely yes on the output tables. Not sure when, but I'll put a few leaderboards up for the HOF and Hall of Very Good, active players, and guys who come close.

Part of the problem is I'm still tweaking it. I get it all done then look at one part and think, why did I do that? I can make it better. The size of the whole thing makes it impractical to distribute.

 
At 6:23 PM, Blogger jinaz said...

I understand on all counts. And no hurry at all. Just wanted to encourage you to share it if you could--and you usually do share your stuff, which I appreciate.
-j

 
At 9:10 PM, Anonymous Anonymous said...

This is good stuff. I've got one complaint, though. You say:

If a player consistently played on teams that scored more runs than you'd expect given their batting stats, the player will get extra credit.

If you are assigning the extra credit, why not use LWTS broken down by base-out situation? That way the runs will add up exactly (except for partial 9th innings and so forth) and the players will get as much credit as they deserve.

In order to get base-out LWTS customized for the run environment, you can plug the team's composite batting line into a Markov model. It might be necessary to include baserunning in the model, but that shouldn't be too hard.

 
At 9:16 PM, Anonymous Anonymous said...

Another comment: are you familiar with Dan Rosenheck's work at the Hall of Merit? He's put a ton of time into creating his uberstat. Some of what he does is questionable, like the adjustment for the "ease of domination" of each league-year, but he's done good work on the changes in replacement level for different positions over time.

 
At 9:51 PM, Anonymous Anonymous said...

Okay, last point. I was curious about Willie Davis, so I did a quick-and-dirty estimate of his batting runs using B-R:

Runs = OBP*SLG*AB - lgOBP*lgSLG*AB

That gives 59 runs. The lgOBP and lgSLG are adjusted for park, so there should be no problem there. Meanwhile, the batting runs figure that B-R publishes (Palmer's LWTS?) has Davis at 39 career runs above average, also park-adjusted.
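For anyone who wants to repeat the quick-and-dirty estimate, the formula above is easy to sketch. The inputs below are made-up round numbers, not Davis's actual line or the park-adjusted league figures from B-R.

```python
def quick_runs_above_average(obp, slg, ab, lg_obp, lg_slg):
    """Runs = OBP*SLG*AB - lgOBP*lgSLG*AB, the estimate used in this comment."""
    return obp * slg * ab - lg_obp * lg_slg * ab


# Hypothetical hitter and league context, for illustration only.
estimate = quick_runs_above_average(obp=0.350, slg=0.450, ab=600,
                                    lg_obp=0.330, lg_slg=0.400)
print(round(estimate, 1))  # 15.3
```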

Possible reasons for the discrepancy between those figures and yours:

1. Something to do with the treatment of pitchers as batters. (But, wouldn't that work in the opposite direction?)

2. Different park factors.

3. The Dodgers scored runs more efficiently than their component stats would predict. This would also inflate Davis's win shares, but I think not his WARP. Comparing WARP1:

Santo 115.5
Davis 111.2
Torre 105.5
Smith 104.8
Nettles 100.7
Trammell 101.0
Simmons 99.1

Davis holds up quite well. Of course, considering all the extra junk that goes into WARP, this is inconclusive.

4. Something else? Clearly your linear weights are quite different from Palmer's, to the tune of 60 runs over a career.

Anyway, I am interested to see how this resolves.

 
At 5:27 AM, Anonymous Anonymous said...

LWTS by base-out situation sounds way beyond my capabilities. I'm not sure how I'd reconcile to actual team runs. I'm using the batting stats straight from Baseball Databank, as Retrosheet event files are missing some games in some years. I can't always get the totals to match the player's actual stats.

I'm familiar with Dan's work, we've been in contact and I've sent him some of my stuff - TZ, catchers, to work into his ratings.

With Davis, even using the B-ref 39 runs I'd have him as a 65 WAR player - that's better than Andre Dawson. Seeing that Win Shares and WARP also put him in the same range as Ron Santo, I'm surprised he didn't get more support from the Hall of Merit. I'll have to dig up his thread and post some of this.

 
At 5:27 PM, Blogger Chone Smith said...

DCJ, I found the discrepancy. I generated the absolute runs based on stats with pitcher hitting removed. Then, for runs above average, I compared to the plain old league RPG.

So every player in a league w/o DH gets a bonus. Correcting for this, Davis is +61 runs, pretty close to B-ref. Still in the mid 60's for WAR. Not bad at all.
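A small sketch of how that kind of baseline mismatch turns into a bonus. All numbers are invented, and treating league runs per game as runs per 27 outs is a simplifying assumption.

```python
def runs_above_average(player_runs, player_outs, league_runs_per_game):
    """Player's absolute runs minus what the baseline produces in the same outs."""
    baseline_per_out = league_runs_per_game / 27.0
    return player_runs - baseline_per_out * player_outs


player_runs, player_outs = 90.0, 420   # hypothetical season
rpg_with_pitchers = 4.0                # plain league RPG (assumed)
rpg_without_pitchers = 4.3             # pitcher hitting removed (assumed)

# Mismatched: runs built from non-pitcher weights, baseline includes pitchers.
biased = runs_above_average(player_runs, player_outs, rpg_with_pitchers)
# Consistent: both sides exclude pitcher hitting.
corrected = runs_above_average(player_runs, player_outs, rpg_without_pitchers)
bonus = biased - corrected
print(round(bonus, 1))  # 4.7
```

Over a long career in a league without the DH, a few runs per season of this kind adds up to a gap of the size discussed above.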

 
At 7:14 PM, Blogger Colin Wyers said...

Chone, Retrosheet publishes "box score event files" for games where they don't have full PBP accounts yet. Ideally what you would do is apply your LWTS by base-out to games where you have full PBP and then regular LWTS to the other games. If that interests you, of course. I prefer standard LWTS.

--CW

 
At 9:55 PM, Anonymous Anonymous said...

Chone,

Thanks for checking that out about Davis. Good thing it was easy to correct.

About regular LWTS versus base-out LWTS, certainly which one to use is a matter of personal preference. The reason I objected before was that I thought you were trying to have it both ways with respect to clutch hitting.*

Either you give credit for clutch hitting, or you don't. If you do, you should be using base-out LWTS. If you don't, I thought, your runs should add up to the team's total BaseRuns (or other such formula) rather than actual runs scored.

Since then I've changed my mind, for reasons that would take a long time to explain, and I think your method makes sense from a philosophical point of view.

I still would prefer it if you used BaseRuns instead of actual runs scored. As things stand, you are giving credit to teams with "efficient" offenses, but you are not giving credit to teams that beat their pythag. I personally favor giving neither one credit (like WARP) or giving both credit (like Win Shares). But it's your stat, and it makes sense on its own terms, which is all anyone can ask for.

* I'm only talking about clutch as regards the base-out situation. I'm ignoring the inning and score differential.

 
At 10:20 PM, Anonymous Anonymous said...

More speculation on Willie Davis. I looked at the HOM thread. People seemed to agree that he falls short unless you consider 199 games in Japan at the end of his career, and even then he still probably falls short.

My guess is that the +250 runs you get for baserunning (incl. SB/CS and GIDP) and defense is way above what they estimated. Plus they may not have been accounting for the lower number of runs per win.

I wonder what Dan R has for Davis, considering that he takes all that stuff into account. (Or I think he does...)

 
At 8:56 PM, Blogger Chone Smith said...

It looks to me like Dan has Willie in the 45-50 win range (not sure which WARP column to look at), while I've got him around 65. Dan rates his baserunning and defense well, so that's not it. It's the position replacement level.

Dan has a CF in Willie's time at 1.3-1.4 wins per season, while I'm using nearly 2.5: 2.0 wins is standard, plus .25 for center field, and a little extra for the NL being the better league at the time.
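To see how much that baseline gap matters over a career, here is a back-of-the-envelope sketch. The 2,400 games is an assumed round figure for a long career, not a number taken from the database, and 1.35 is just the midpoint of Dan's 1.3-1.4 range.

```python
def career_baseline_wins(wins_per_150, games):
    """Total wins credited just for playing, at a given per-150-game baseline."""
    return wins_per_150 * games / 150.0


GAMES = 2400  # assumed career length, for illustration only

mine = career_baseline_wins(2.5, GAMES)    # replacement + CF + league bonus
dans = career_baseline_wins(1.35, GAMES)   # midpoint of 1.3-1.4
gap = mine - dans
print(round(gap, 1))  # 18.4
```

Under these assumptions the baseline alone accounts for a gap of roughly 18 wins, which is about the size of the difference between the two ratings.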

My position adjustments are based on recent seasons, and perhaps in the past the relative value of positions was different. Dan obviously believes this, as you can see by his shortstop ratings. Perhaps the CF back then was not 10 runs better defensively than the left fielder. Look at the 80's: with Vince Coleman, Tim Raines, Gary Redus, Mookie & Willie Wilson, and RICKEY! all playing left at the same time, I'm sure the ability at each position did not have the same contrast as today's model, with Manny, Pat, and Dunn lumbering out there.

Until I have the time to further research the position spectrum, I'll hold off on advocating Willie Davis for the HOF. I do think he was a better player than he's generally given credit for, though.

 
