What’s the Point of BABIP?

BABIP, or batting average on balls in play, is very different from many of the sports stats you’re used to seeing. Instead of trying to measure who is the best at a certain aspect of the game, it is used to help identify which players are likely over- or underachieving. The only other stat I know of that tries to put a number on a similar concept is KenPom’s luck stat in his college basketball rankings.

But what is BABIP and why does it help us measure luck in baseball? Here, I’ll give an intro to the stat, explain the general opinion on why it helps predict outlier performances, and discuss with supporting data to what degree it works. To start, we must understand the idea of regression to the mean.

Regression to the Mean

One of the reasons baseball is so strange is that the common statistics we use to measure quality, batting average for example, are extremely variable from year to year. The difference between the best batting average and the 20th best batting average might be as small as 15 hits. Over the course of 162 games, that is a ridiculously small number. This makes it difficult to judge whether a player has actually improved from one year to the next. Let’s take Nick Castellanos’ 2021 stats as an example.

Castellanos is a career .280 hitter. That’s certainly good, though decidedly not batting champion good. This year, though, he hit .309! Unfortunately for the Reds, his contract was up. Does he deserve to get paid like a .309 hitter? For reference, other guys who hit around .309 this year include Vlad Jr., Bryce Harper, and Juan Soto. If we’re going to pay Castellanos like a .309 hitter, he is going to get a humongous contract.


Almost everyone will tell you that Castellanos is not going to be paid like those three other players are. In fact, Castellanos hitting .309 is entirely within the realm of normal variation for a career .280 hitter. Putting some concrete numbers on things using confidence intervals, if you take just eight different guys who are true .280 hitters and give them each 500 at bats, the odds are that one of them will hit at least .309 on the year.
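To get a feel for how much a batting average can wander by chance alone, here is a rough sketch that treats every at-bat as an independent coin flip with a fixed hit probability. That binomial model is a simplifying assumption made for this illustration, not necessarily the calculation behind the figure quoted above, and the exact answer depends on the model and on how you round .309.

```python
from math import comb


def prob_at_least(hits: int, at_bats: int, true_avg: float) -> float:
    """P(X >= hits) when X ~ Binomial(at_bats, true_avg)."""
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(hits, at_bats + 1)
    )


# Inputs taken from the example in the text.
AT_BATS = 500
TRUE_AVG = 0.280
# Assumption: 155 hits (155/500 = .310) is the smallest total that reaches
# at least .309, since 154/500 = .308 falls short.
HITS_NEEDED = 155

# Chance that one true .280 hitter reaches the mark in a season.
p_single = prob_at_least(HITS_NEEDED, AT_BATS, TRUE_AVG)
# Chance that at least one of eight such hitters does.
p_any_of_eight = 1 - (1 - p_single) ** 8

print(f"One hitter:          {p_single:.3f}")
print(f"At least one of 8:   {p_any_of_eight:.3f}")
```

Changing `AT_BATS` or `TRUE_AVG` shows how quickly the spread narrows with more at-bats, which is the whole reason a single season is a shaky basis for a contract.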

The consensus around the league is that Castellanos is probably not going to hit .309 again next year. In all likelihood, he will hit much closer to his career mark of around .280. Thinking in this way is what we mean by regression to the mean.

So, if a player’s batting average changes from one year to another, how can we tell if it is due to normal random variations or if it is due to the player somehow improving? Enter BABIP.

What is BABIP?

BABIP is batting average on balls in play. A ball in play is any batted ball that lands between the foul lines and in front of the outfield fence. If you look at all the balls that enter the field of play and compute the fraction of them that turn into hits, you get batting average on balls in play. As a formula, this is

$$\text{BABIP} = \frac{H - HR}{AB - K - HR + SF}$$

where H is hits, HR is home runs, AB is at-bats, K is strikeouts, and SF is sacrifice flies.

The denominator is exactly the total number of balls that enter the field of play. Start with at-bats and subtract strikeouts and home runs. Then add back sacrifice flies, because those aren’t counted as at-bats. The numerator is just the number of singles, doubles, and triples.
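The formula is simple enough to write as a small function. The sketch below follows the definition given above; the function name and the stat line in the example are made up for illustration and do not belong to any real player.

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int) -> float:
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play, so BABIP is undefined")
    # Home runs leave the field of play, so they are removed from the hits too.
    return (hits - home_runs) / balls_in_play


# Hypothetical season line: 150 H, 25 HR, 520 AB, 120 K, 5 SF.
example = babip(hits=150, home_runs=25, at_bats=520, strikeouts=120, sac_flies=5)
print(f"BABIP: {example:.3f}")
```

Here the hitter put 380 balls in play (520 - 120 - 25 + 5) and 125 of them fell for hits.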

The more important topic of conversation is this: why does BABIP help identify regression candidates? The central thesis is this: once you’ve put the ball into play, whether or not you record a hit is typically due to factors outside of your control. That is, once the ball is in the field of play, the defense controls the outcome.

Therefore, if your BABIP is above league average, then you were lucky: the defenses you faced allowed you more hits than you should have gotten. Odds are that your BABIP will regress towards the mean next year, so you should get fewer hits and your batting average is likely to decrease. By the same logic, a below-average BABIP suggests you were unlucky and are due for a rebound.

At least, that’s the theory.

Does it work?

The entire point of using BABIP is to see how much luck went into a player’s stats. The problem, though, is that it doesn’t only measure luck. The idea is that BABIP being above league average means the player was lucky while a below league average value means the player was unlucky. However, some hitters might actually be predisposed to having higher or lower batting averages on balls in play than others. For example:

  1. Players who are faster are more likely to beat out close plays at the bag. These players should have an above-average BABIP from year to year.
  2. Players who hit overwhelmingly to one side of the field are easier to shift against. The defense can position itself to turn more of the balls those hitters put into play into outs, which should lower their BABIP.
  3. Players who hit more ground balls than fly balls generally get more hits on balls in play because ground balls turn into outs less often than fly balls and pop-ups. Players with larger ground ball to fly ball ratios generally have a higher batting average on balls in play.

Because of these considerations, BABIP is not nearly as random as many analysts would lead you to believe. BABIP is often framed as “if a player’s BABIP is above league average, they’ll do worse next year, but if it’s below average, they’ll be better next year!” But is that really true?

The figure below shows a scatter plot of BABIP in 2018 versus 2019 for players who recorded at least 200 plate appearances in both seasons.

[Figure: relationship between hitters’ 2018 BABIP and 2019 BABIP]

Yes, there is a lot of noise here, but there is also a clear trend. Players who had a higher BABIP in 2018 also tended to be above average in 2019. Fitting a least squares regression line to this data gives an R-squared value of about 0.16. In statistical parlance (the following sentence is classic Stats 101), this means that only 16% of the variance in 2019 BABIP is explained by a player’s 2018 BABIP. If you are like any of my students, you have no idea what this sentence means, so let’s try again.

Using a linear model, knowing a player’s 2018 BABIP lets you predict their 2019 value with 16% less squared prediction error than if you did not have access to this prior data and simply guessed the average for everyone. If BABIP were truly random, then knowing the 2018 value wouldn’t help you predict the 2019 value at all. This would correspond to an R-squared value of 0 because the prediction error wouldn’t decrease.
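To make the “less prediction error” reading of R-squared concrete, here is a minimal sketch that fits a least squares line by hand and compares its squared error to that of always guessing the mean. The function name and the toy numbers are invented for illustration; they are not the 2018 and 2019 BABIP data behind the plot.

```python
def r_squared(x: list[float], y: list[float]) -> float:
    """R-squared of the least squares line predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Ordinary least squares slope and intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Squared error when predicting with the line...
    sse_line = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # ...versus squared error when always guessing the mean of y.
    sse_mean = sum((yi - mean_y) ** 2 for yi in y)

    # Fraction of the squared error removed by using x.
    return 1 - sse_line / sse_mean


# Made-up toy data, NOT real BABIP values.
last_year = [0.280, 0.295, 0.300, 0.310, 0.325, 0.340]
this_year = [0.300, 0.285, 0.310, 0.295, 0.315, 0.320]
print(f"R-squared: {r_squared(last_year, this_year):.2f}")
```

An R-squared of 0 means the line does no better than guessing the mean, and an R-squared of 1 means the line predicts every point exactly.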

While this R-squared value (0.16) does not indicate a strong relationship, it is large enough to suggest that a relationship does exist. That is, BABIP for hitters is correlated from year to year. Another statistical tool, an F-test for the significance of the slope, returns a p-value of 0.01, which indicates that this correlation is statistically significant.

Stepping away from the statistics and back to the baseball world, all this data supports our hypothesis that certain players may be predisposed to having a consistently better or worse BABIP than league average.

Therefore, if you want to use batting average on balls in play to help identify players who have over- or underperformed, more care needs to be taken. Instead of comparing a player’s BABIP to league average, you should compare it to that player’s historical average to see if they have been over- or underachieving in a season.

BABIP for Pitchers

Pitchers are an entirely different story. The attributes that allow hitters to consistently maintain an above or below average BABIP year over year do not really exist for pitchers. Certainly some pitchers have different ground ball to fly ball ratios, but the other effects (how fast the hitter is and how well he spreads the ball around the field) are outside of a pitcher’s control. The net effect is that BABIP for pitchers is as claimed: almost totally random.

Plotting pitchers’ 2018 vs. 2019 BABIP gives us the scatter plot shown below. This is an almost perfect representation of what unstructured noise looks like in a data set.

[Figure: pitchers’ 2018 BABIP versus 2019 BABIP, which is nearly random]

The R-squared value for this relationship is about 0.0005. No kidding, that is tiny. This means that a pitcher’s 2018 BABIP provides almost no benefit in predicting their 2019 BABIP. In turn, this means that BABIP is effectively random for pitchers from year to year and can be used to identify regression candidates.

The implications for pitchers are the reverse of those for hitters, though. A pitcher with a significantly above average BABIP was unlucky because hitters got more hits against them than they should have. Thus a pitcher with a significantly above average BABIP is likely to perform better in the following year. The opposite relationship is also true: a pitcher with a below average BABIP is likely to be worse the following year.
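The two rules of thumb can be put side by side in a short sketch. It assumes, following the argument above, that a hitter is judged against his own historical BABIP and a pitcher against league average. The function name, the labels, and the 20-point tolerance are all hypothetical choices made for illustration, not established cutoffs.

```python
def regression_outlook(role: str, season_babip: float,
                       baseline_babip: float, tolerance: float = 0.020) -> str:
    """Rough next-season outlook from a BABIP gap.

    baseline_babip: the player's historical BABIP for a hitter,
    league-average BABIP for a pitcher (an assumption of this sketch).
    tolerance: hypothetical cutoff below which the gap is treated as noise.
    """
    if role not in ("hitter", "pitcher"):
        raise ValueError("role must be 'hitter' or 'pitcher'")

    gap = season_babip - baseline_babip
    if abs(gap) <= tolerance:
        return "no clear signal"

    above_baseline = gap > 0
    if role == "hitter":
        # A high BABIP helped the hitter, so expect a step back.
        return "likely to decline" if above_baseline else "likely to improve"
    # A high BABIP hurt the pitcher, so expect a rebound.
    return "likely to improve" if above_baseline else "likely to decline"


# Made-up numbers for illustration only.
print(regression_outlook("hitter", season_babip=0.360, baseline_babip=0.310))
print(regression_outlook("pitcher", season_babip=0.340, baseline_babip=0.300))
```

The same gap points in opposite directions depending on which side of the ball the player is on, which is the only real difference between the two branches.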

Conclusions

BABIP is certainly an interesting tool. It is pretty rare to have a statistic that can directly identify players who benefited from pure luck. For hitters, BABIP is able to do this with a reasonable degree of accuracy, but because there is some relationship between batting average on balls in play from year to year, care needs to be taken when identifying potential regressors. For pitchers, though, BABIP does appear truly random. Thus, using BABIP to identify over- and underperforming pitchers should lead to fairly accurate results.

If using BABIP to identify over- and underachieving hitters, one should compare a hitter’s BABIP to their career average. For pitchers, however, BABIP should be compared against league average. Then, with a healthy dose of perspective added in for hitters to see if their change in BABIP can be explained in a way other than luck, you should be able to fairly confidently identify players ripe for regression.
