There is a palpable divide amongst baseball fans these days about the value of advanced statistics. Within this conflict, fans find themselves in one of two camps: either they fully embrace the use of sabermetrics, or they are unequivocally against them.
But there are others, notably writers whose job it is to be up-to-date on all the latest information about baseball, who find themselves somewhere in the middle. They want to use any kind of statistic or nuance that enhances their understanding of the game they love, but they are also hesitant to simply abandon century-old norms for relatively new ideas. That’s just human nature.
A couple of days ago, my former colleague and all-around good guy Mike Hurley, a sportswriter for CBS Boston, wrote a piece called, “Time For Baseball’s WAR Supporters To Tone Down The Arrogance.” The column was in response to an article by Sam Miller on ESPN.com entitled, “WAR is the answer.”
WAR, or “wins above replacement,” is arguably the poster child for sabermetrics, and has been at the forefront of the old- vs. new-school stats debate. It has been a fairly mainstream stat for the past few years, but it rose to prominence during the 2012 American League Most Valuable Player race. WAR supporters pulled for rookie sensation Mike Trout, who led all of baseball with 10.7 rWAR (Baseball Reference), while traditionalists sided with Miguel Cabrera, who won the AL Triple Crown despite finishing just fourth in the league with 6.9 rWAR. Cabrera ended up winning the MVP, and pretty easily at that.
The overall discourse among fans, bloggers, columnists and the general baseball community could mildly be described as “heated” during the months surrounding the MVP vote. At times, it was downright childish.
And that’s where I agree with Hurley, first and foremost. The arrogance – on both sides of the argument – could definitely use some “toning down,” as Hurley says. I know that I’m guilty of it, as someone who supports advanced statistics. I often find myself riled up when I find a weak spot in an argument by someone on the other side, and have too often taken the low road of mockery, rather than calmly trying to advance my point and hoping for the best.
But I think that’s a separate point for another day, probably something a therapist could write about more eloquently than me or any other sportswriter. Emotions are a part of arguments; that’s nothing new.
What I really want to address in this piece is a point that I feel is too often lacking from the overall argument about not just the value of WAR, but more specifically, the rationale behind it.
One of my favorite blog posts in the history of the Internet – and one that I always send to people who debate the merits of WAR – is called “Everyone has their own WAR,” by Tom Tango. The post was written in response to an interview with ESPN.com’s Mark Simon on Seamheads.com.
Tango’s article is short, but it provides an eye-opening explanation of the reasoning behind WAR. The idea is that multiple people can rate baseball players using any kind of metrics they choose, including intangibles such as “heart” or “leadership.” But at the end of the day, everyone ends up at one final value for each player, and those players are ranked against one another based on that value. As Tango puts it:
We all come up with our “single number”, even though we kick and scream that we shouldn’t come up with a single number. If one guy argues that Felix is better than Lincecum, and the other argues the opposite, then guess what: they’ve each “smushed” a bunch of parameters, considerations and gut feelings to get to their final opinion.
No one is telling you not to overweight or underweight strikeouts or HR. But a system requires you to spell out the rules for weighting, and apply that consistently to everyone.
The one good thing about the case-by-case basis is that it forces you to think about parameters. You’d like to ding Manny Ramirez a little, you’d like to up Jeter a little. So, you have to create a “heart” parameter. And that’s perfectly fine! Just spell it out that that’s what you are doing. And tell us how much you are giving to each player for heart. I have no problem with giving out wins for heart, over-and-above whatever his actual performance tells us. Just spell it out and be consistent.
The point is that in order to give credence to qualities such as “heart,” there needs to be a baseline for what they are worth, and there needs to be comparative information about every player in the entire league in order for it to be properly tested.
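To make that idea concrete, here is a minimal sketch of what “spell it out and be consistent” could look like in Python. Everything in it is an assumption for illustration: the players, their numbers, the 0-to-1 “heart” scale and the `HEART_WINS_AT_MAX` rule are all made up, and none of it reflects how any published version of WAR is actually calculated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    name: str
    performance_wins: float  # wins credited for on-field performance (made-up value)
    heart: float             # subjective "heart" score from 0 to 1 (made-up value)


# Hypothetical rule, stated once and applied to every player:
# a maximum heart score is worth this many extra wins.
HEART_WINS_AT_MAX = 1.0


def personal_war(player: Player) -> float:
    """One final number: performance plus the spelled-out heart bonus."""
    return player.performance_wins + HEART_WINS_AT_MAX * player.heart


def rank(players: list[Player]) -> list[Player]:
    """Order players from best to worst under the same rule for everyone."""
    return sorted(players, key=personal_war, reverse=True)


players = [
    Player("Player A", performance_wins=5.0, heart=0.2),
    Player("Player B", performance_wins=4.5, heart=0.9),
]

for p in rank(players):
    print(f"{p.name}: {personal_war(p):.2f}")
```

With these invented inputs, Player B edges out Player A despite the weaker performance number, because the heart bonus was written down and applied to both. Anyone who disagrees can now argue about the weight itself, not about a gut feeling.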
For instance, David Ortiz has been labeled by many as one of the most “clutch” hitters in all of baseball, thanks in large part to his heroic performance in the 2004 American League Championship Series. There’s no doubt that Ortiz came up huge in that series, and is one of the main reasons why the Red Sox ended up as World Series champs that year.
In fact, there are several other instances in which Ortiz has been clutch, including his string of walk-off home runs in 2006. But does this mean that Ortiz is innately clutch – that he has an internal quality that allows him to come up big in important situations?
I don’t claim to know the answer to that question, but I won’t totally dismiss the idea that someone can be born with or develop an ability to thrive in tough spots; I also won’t embrace it as truth, because there’s no documented evidence to prove it exists.
More importantly, though, I don’t think that I am in a position to say that Ortiz is a more clutch hitter than anyone else. Think about it: Ortiz did come up with several big hits in several key situations, but was it not by incredible fortune that he even had the opportunity to do so? Is Ortiz more innately clutch than, say, Adrian Gonzalez, who has been a wonderful hitter throughout his career but hasn’t had many chances to bat in the playoffs?
Again, I don’t know the answer to that question. All we know about things like “clutch” or “heart” is that they are completely subjective qualities – even someone who recognizes their value will admit that.
Until we have the ability to accurately measure and document these intangibles for every single player in the league, I don’t feel comfortable using them in my overall analysis. That is why I prefer WAR.
Of course, WAR is not perfect. The defensive metrics used in its calculations are far from reliable, though they are coming along. And WAR also does not incorporate “win probability added,” which I tend to feel is somewhat relevant in analyzing true value (WPA alone is a useful stat, though).
But that does not mean that WAR isn’t the best we’ve got so far. In fact, according to Miller’s article, “In 2012 the correlation between Baseball Prospectus’ WAR and team victories was 0.86 (where 1.0 would have meant a perfect correlation).”
I only took one year of AP Statistics in high school, but even I remember that that’s a pretty damn strong correlation, especially for a statistic that many call “unreliable.” It’s not enough to embrace the stat as gospel, but it is enough to raise an eyebrow when there is a vast difference between two players’ WARs (e.g., Trout and Cabrera in 2012).
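For anyone whose AP Statistics is rustier than mine, here is a rough sketch of what a number like 0.86 means. I’m assuming the figure in Miller’s article is an ordinary Pearson correlation, which the article does not spell out, and the team totals below are invented purely to show the mechanics. The function names are my own.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def shared_variance(r):
    """r squared: the share of variation a straight-line fit accounts for."""
    return r ** 2


# Invented team-level totals, NOT real 2012 data.
team_war = [20.0, 30.0, 40.0, 50.0]
team_wins = [70, 78, 90, 95]

print(f"r for the invented data: {pearson_r(team_war, team_wins):.2f}")
print(f"r^2 for r = 0.86: {shared_variance(0.86):.2f}")  # 0.74
```

Under that assumption, a correlation of 0.86 means a straight-line fit on team WAR accounts for roughly 74 percent of the variation in team wins, with the rest left to everything WAR does not capture.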
I guess the point I’m really trying to drive home is that while WAR as we know it is based on tested calculations of numbers – ones that I and many others trust – the true spirit of the stat is to assess each player’s overall value. It’s something fans have argued about for decades.
And there’s still plenty of room for debate. But if you are to question WAR’s merits, I only ask you this: What system do you have that has been proven to be better?