ALCS Game 5: New York Yankees at Los Angeles Anaheim

For those not following the sabermetric updates, Baseball Prospectus has attempted to create a new pitching statistic that improves on all those before it.  This new stat is called SIERA, or Skill-Interactive Earned Run Average.  It builds on the work of Nate Silver, who created QERA in an attempt to fix the issues shared by FIP and xFIP.  The sabermetric community has already put a lot of work into reviewing it, but what are the early returns?

Matt Swartz and Eric Seidman are the main authors of this work, and they start by explaining why this new stat is needed.  The biggest reason is that while FIP, xFIP and QERA are all very instructive, they share a problem.  That problem lies in how the three factors pitchers control (strikeouts, walks and ground-ball rate) are weighted.  On this point QERA fares the worst, as stated here:

QERA has another problem of its own, in that GB% is really GB/Ball in Play (or, GB/BIP), while BB% and K% are measured per batters faced (SO/PA and BB/PA).
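To make the denominator mismatch concrete, here is a minimal sketch that converts GB/BIP into GB/PA. It rests on a simplifying assumption of mine, not something from the Baseball Prospectus article: every plate appearance that is not a strikeout or a walk counts as a ball in play (hit batters and home runs are ignored). The function name and the two pitcher lines are hypothetical.

```python
def gb_per_pa(gb_per_bip, so_per_pa, bb_per_pa):
    """Convert a ground-ball rate per ball in play into a rate per plate appearance.

    Assumption (for illustration only): every PA that is not a strikeout
    or a walk ends with a ball in play. HBP and HR are ignored.
    """
    bip_per_pa = 1.0 - so_per_pa - bb_per_pa
    return gb_per_bip * bip_per_pa


# Two hypothetical pitchers with the same 50% GB/BIP.
contact_pitcher = gb_per_pa(0.50, so_per_pa=0.10, bb_per_pa=0.05)
strikeout_pitcher = gb_per_pa(0.50, so_per_pa=0.25, bb_per_pa=0.12)

print(f"contact pitcher GB/PA:   {contact_pitcher:.3f}")
print(f"strikeout pitcher GB/PA: {strikeout_pitcher:.3f}")
```

The same GB/BIP turns into noticeably more ground balls per batter faced for the pitcher who allows more contact, which is why mixing the two denominators in one formula is a problem.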

QERA tried to improve on FIP and xFIP by being non-linear, accounting for the fact that the more base runners a pitcher allows, the higher the percentage of them that will score.  So does SIERA effectively improve on this, and if so, will it replace them?

Perhaps first we should review the six assumptions that SIERA is attempting to add into the calculation.

  1. Allows for the fact that a high ground-ball rate is more useful to pitchers who walk more batters, due to the potential that double plays wipe away runners.
  2. Allows for the fact that a low fly-ball rate (and therefore, a low HR rate) is less useful to pitchers who strike out a lot of batters (e.g. Johan Santana’s FIP tends to be higher than his ERA because the former treats all HR the same, even though Santana’s skill set portends that bombs allowed will usually be solo shots).
  3. Allows for the fact that adding strikeouts is more useful when you don’t strike out many guys to begin with, since more runners get stranded.
  4. Allows for the fact that adding ground balls is more useful when you already allow a lot of ground balls because there are frequently runners on first.
  5. Corrects for the fact that QERA used GB/BIP instead of GB/PA (e.g. Joel Pineiro is all contact, so increasing his ground-ball rate means more ground balls than if Oliver Perez had done it, given he’s not a high contact guy).
  6. Corrects for the fact that FIP and xFIP use IP as a denominator which means that luck on balls in play changes one’s FIP.
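The last point is easy to see with a quick sketch. The article does not spell out FIP, so the formula below is an assumption on my part: the commonly cited form, (13*HR + 3*BB - 2*SO)/IP plus a league constant, with 3.2 used as a placeholder constant. The pitcher line is hypothetical.

```python
def fip(hr, bb, so, ip, constant=3.2):
    """Commonly cited FIP form; the constant is a placeholder assumption."""
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * bb - 2 * so) / ip + constant


# Same hypothetical HR, BB and SO totals, but a different number of outs
# recorded on balls in play, which changes innings pitched.
lucky = fip(hr=20, bb=60, so=150, ip=200)    # more balls in play turned into outs
unlucky = fip(hr=20, bb=60, so=150, ip=190)  # fewer balls in play turned into outs

print(f"FIP with 200 IP: {lucky:.2f}")
print(f"FIP with 190 IP: {unlucky:.2f}")
```

Nothing the pitcher controls differs between the two lines, yet the FIP does, because the denominator moves with results on balls in play. Using PA as the denominator avoids that.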

Hopefully most of this makes sense.  Most of these adjustments matter for a relatively small group of pitchers, which is why FIP and the rest work well in general but can turn certain players into outliers.  Perhaps SIERA can help explain someone like Javier Vazquez.

That’s all well and good, but what has the community said about SIERA so far?  Tango takes a look and tries to test the ends of the equation, which currently stands as:

SIERA = 6.262 - 18.055*(SO/PA) + 11.292*(BB/PA) - 1.721*((GB-FB-PU)/PA) + 10.169*((SO/PA)^2) - 7.069*(((GB-FB-PU)/PA)^2) + 9.561*(SO/PA)*((GB-FB-PU)/PA) - 4.027*(BB/PA)*((GB-FB-PU)/PA)
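For anyone who wants to plug numbers in, here is a minimal sketch of that equation. It uses the coefficients exactly as quoted and reads the ground-ball term as ground balls minus fly balls minus pop-ups, all per plate appearance, which is my reading of the formula. The function name and the sample line are hypothetical, not a real pitcher.

```python
def siera(so, bb, gb, fb, pu, pa):
    """Evaluate the quoted SIERA equation from raw counting stats.

    Assumption: the ground-ball term is (GB - FB - PU) / PA.
    """
    if pa <= 0:
        raise ValueError("pa must be positive")
    k = so / pa              # strikeouts per plate appearance
    w = bb / pa              # walks per plate appearance
    g = (gb - fb - pu) / pa  # net ground balls per plate appearance
    return (
        6.262
        - 18.055 * k
        + 11.292 * w
        - 1.721 * g
        + 10.169 * k ** 2
        - 7.069 * g ** 2
        + 9.561 * k * g
        - 4.027 * w * g
    )


# Hypothetical line: 1000 PA, 200 SO, 80 BB, 350 GB, 250 FB, 50 PU.
print(round(siera(so=200, bb=80, gb=350, fb=250, pu=50, pa=1000), 3))
```

Because of the squared and cross terms, the value of an extra strikeout or ground ball depends on where the pitcher starts, which is what the six assumptions above are meant to capture.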

So far in testing there are a lot of questions, as you can see in the comments at Tango’s site. I’m sure this type of discussion will go over most readers’ heads, and for good reason. Much like peer review at a scientific journal, the first step in introducing a new statistic is gaining a consensus on what it does, and that means breaking down the reasoning and implementation.

In my opinion, SIERA has not shown huge gains yet.  The examples given in the most recent post do not convincingly explain why it is better.  There are different reasons why FIP does not work well for Johan Santana, and while SIERA shows a better result, it might not be getting there by answering the right question.

It is going to take time to fully vet the SIERA calculation, but my first reaction is that I don’t see it offering enough of a gain in accuracy to replace FIP or xFIP.  They might be quick and even a bit dirty, but they serve their purpose well.  SIERA will likely serve as a solid comparison tool and perhaps answer some questions, but it will not be the standalone tool for evaluating pitchers.