BABIP vs. Expected Stats

Is it time to move on from BABIP?

One of the first advanced metrics I was introduced to was BABIP (Batting Average on Balls in Play). I was never a curmudgeon when it came to the idea of advanced analytics, but I’ve got to tell you, I didn’t think much of this particular stat. How could this be cited so frequently? It doesn’t take the quality of contact into account at all. To me, it suggested that a player’s batting average simply comes down to how often he puts the ball in play, which is strongly inversely related to how often he strikes out.

 

BABIP

 

There is a little more to the BABIP equation than strikeouts and contact rates, but not much.

BABIP = (H - HR) / (AB - K - HR + SF)
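As a minimal sketch, the calculation can be written out in Python, reading the formula as hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. The function name `babip` and the stat line below are made up for illustration and do not belong to any real player.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play, BABIP is undefined")
    return (h - hr) / balls_in_play


# Hypothetical stat line, invented for illustration only.
example = babip(h=150, hr=20, ab=550, k=120, sf=5)
print(f"{example:.3f}")
```

Note that nothing about how hard or at what angle the ball was hit enters the calculation, which is the limitation the rest of this post is about.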

Over a long period of time, a relatively high or low BABIP can tell you something about a player’s typical quality of contact and his sprint speed. It takes about 820 balls in play (BIP) for BABIP to stabilize for hitters and 2,000 BIP for pitchers. This is roughly two seasons for a regular hitter and four seasons for a full-time starting pitcher.

In hindsight, it is easy to see why Tony Gwynn had a .341 career BABIP, or Wade Boggs had a .344 career BABIP. However, at what point during their careers would those high numbers have been accepted as who they were, and not merely “He is getting lucky”? What happens when there is a sudden change in an established player’s BABIP? Well, he may very well be getting lucky or unlucky and you can expect his BABIP to return to his career norms. But what happens if he made fundamental changes to his swing or approach? Maybe his bat has slowed after another offseason of aging and reached a tipping point. Perhaps there is an underlying injury. And what about players that are new to the league? How do you evaluate their early BABIPs? Do you just assume they will regress to close to the league average of .300? Many questions and few answers.
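To make the "regress to the league average" idea concrete, here is a rough sketch. It assumes, purely for illustration, that the 820-BIP stabilization figure can be treated as the sample size at which a hitter’s observed BABIP and the league average get equal weight. That reading, the function name `regressed_babip`, and the sample inputs are my assumptions, not an established method.

```python
def regressed_babip(observed, bip, league_avg=0.300, stabilization_bip=820):
    """Blend an observed BABIP with the league average.

    Assumption: the stabilization point acts like `stabilization_bip`
    phantom balls in play at the league-average rate.
    """
    if bip < 0:
        raise ValueError("bip must be non-negative")
    return (observed * bip + league_avg * stabilization_bip) / (bip + stabilization_bip)


# Hypothetical rookie: .400 BABIP over 200 balls in play.
print(f"{regressed_babip(0.400, 200):.3f}")
# Same .400 BABIP, but sustained over 820 balls in play.
print(f"{regressed_babip(0.400, 820):.3f}")
```

Even under this simple blend, the estimate only says how much to trust the number. It says nothing about why the BABIP is high or low, which is the gap the questions above point at.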

Relying only on BABIP will have you either guessing whether it is just luck or trying to play hitting coach while examining mechanics. Contextualizing luck is impractical, and analyzing mechanics is something few of us are capable of doing properly. Using underlying numbers is far less exclusionary and requires no knowledge of swing mechanics (luckily for me).

 

xBA and xBABIP

 

Fortunately, xBA (Expected Batting Average) and xBABIP (Expected Batting Average on Balls in Play) exist. I am not introducing anything new here. xBA is fairly widely accepted by analytics-lovers. xBABIP is less well known and less accessible, but it is also an option. Mike Podhorzer’s piece explains it in more detail, but in short it uses batted ball types and shifts to calculate an expected BABIP based on previous batted balls of those types.

Using Statcast, xBA takes exit velocity, launch angle, and the player’s sprint speed (on topped or weakly hit balls) into account on a batted ball. It then compares similarly hit balls in the Statcast era and assigns it an expected batting average based on the average outcomes of those past batted balls. Like BABIP, it attempts to remove the quality of defense from the equation. It is similar to the Hit Probability metric that was available in 2017-2018. However, xBA is presented on a batting average scale, making it more intuitive for baseball fans. Hit Probability also did not take sprint speed into account.
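The "compare to similarly hit balls" idea can be sketched with a toy nearest-neighbor lookup. This is not Statcast’s actual model: the `HISTORY` data is invented, sprint speed is ignored, similarity is a plain distance on exit velocity and launch angle, and dividing by batted balls plus strikeouts to land on a batting-average scale is my assumption.

```python
from math import hypot

# Toy "historical" batted balls: (exit velocity mph, launch angle deg, was_hit).
# Invented values for illustration, not real Statcast data.
HISTORY = [
    (105.0, 12.0, 1), (103.0, 15.0, 1), (101.0, 10.0, 1), (98.0, 25.0, 0),
    (95.0, 30.0, 0), (88.0, 45.0, 0), (70.0, -20.0, 0), (72.0, -15.0, 0),
    (68.0, -25.0, 0), (92.0, 8.0, 1), (85.0, 18.0, 1),
]


def expected_hit_prob(ev, la, history=HISTORY, k=3):
    """Average outcome of the k most similar past batted balls."""
    ranked = sorted(history, key=lambda b: hypot(b[0] - ev, b[1] - la))
    nearest = ranked[:k]
    return sum(b[2] for b in nearest) / len(nearest)


def xba(batted_balls, strikeouts, history=HISTORY):
    """Sum of hit probabilities divided by batted balls plus strikeouts."""
    at_bats = len(batted_balls) + strikeouts
    if at_bats == 0:
        raise ValueError("no at-bats")
    expected_hits = sum(expected_hit_prob(ev, la, history) for ev, la in batted_balls)
    return expected_hits / at_bats


# A hard line drive and a weakly topped grounder, plus two strikeouts.
print(xba([(104.0, 13.0), (71.0, -18.0)], strikeouts=2))
```

The point of the sketch is that the fielder never appears: each ball is credited with what similar contact has usually produced, whatever actually happened on the play.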

All of this is not to disparage the importance of BABIP in the past. It is beautiful in its simplicity. We stand on the backs of giants. I am 100% certain there will come a time when there are better options than xBA or xBABIP in their current forms. But in the present, I believe expected stats have made the use of BABIP by itself obsolete.

 

Comparison

 

As I mentioned above, it takes a long time for BABIP to stabilize. Jonathan Judge’s post looks at a version of xBA that predates the sprint speed adjustment and does not cover BABIP, but it found year-over-year correlations of .46 for xBA and .32 for BA. Keep in mind that his article looked at the pitching side of things.

I wanted to examine the current version of xBA for 2019 and 2020, along with BABIP and BA for the same period. Unfortunately, I did not have much data to go on, so I had to get a bit creative. I checked to see how xBA, BABIP, and BA correlated with themselves (i.e. “stickiness”) from the 1st half of 2019 to the 2nd half of 2019, the 2nd half of 2019 to all of 2020, and finally all of 2019 to all of 2020. I did this for both pitchers and hitters. Numbers in red have a p-value over .05.
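For readers who want to try this kind of stickiness check themselves, here is a minimal sketch of a Pearson correlation between one period and the next. The helper name `pearson_r` and the five-player sample are hypothetical, and this is not necessarily the exact procedure behind the tables.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between paired samples, e.g. 1st-half vs 2nd-half xBA."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 players")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical xBA values for the same five players in two periods.
first_half = [0.250, 0.270, 0.300, 0.230, 0.280]
second_half = [0.255, 0.262, 0.290, 0.240, 0.275]
print(round(pearson_r(first_half, second_half), 2))
```

This sketch leaves out the p-value. If SciPy is available, `scipy.stats.pearsonr` returns both the correlation and a p-value in one call.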

 

YoY Correlation-Pitchers

 

YoY Correlation-Hitters

 

Due to the small sample size, I wouldn’t put a ton of stock in these results, but they do reinforce the stickiness of xBA and the lack thereof for BABIP.

Looking at the difference between xBA and BA is a better indicator of how lucky or unlucky a player may have been than BABIP. An xBA that is in line with a BA suggests that the outcome was earned.

This post is not meant to be a deep dive into any individual player, but Donovan Solano and Carlos Santana can serve as examples.

Solano had a fantastic 2020 season. He finished 11th among qualified hitters with an AVG of .321. However, he finished 2nd with a .396 BABIP. So he must have been very fortunate, right? Well, he finished in the 81st percentile with a .281 xBA. In 2019 he may have flown under the radar somewhat with only 228 PA, but he finished with a .409 BABIP, .330 BA, and .321 xBA. This suggests that while a BA drop can be expected, it may not be a significant one. If we had just looked at his BABIP, we might have assumed a significant BA reduction was in order.

In 2020, Santana finished 3rd from the bottom with a .212 BABIP. He had a .199 BA and a .253 xBA, which was in the 52nd percentile. This BA-minus-xBA gap of -.054 was the 3rd largest negative difference among qualified hitters. His xBA was only slightly below his career xBA of .261. So it looks like even though he is aging, he was also quite unfortunate in 2020.
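The gap between BA and xBA is simple arithmetic, so here is a small sketch using the BA and xBA figures quoted in this post. The dictionary layout and the helper name `ba_minus_xba` are mine. A negative gap means the hitter got less than his contact suggested, and a positive gap means he got more.

```python
# (BA, xBA) pairs as quoted in this post.
SEASONS = {
    "Solano 2019": (0.330, 0.321),
    "Solano 2020": (0.321, 0.281),
    "Santana 2020": (0.199, 0.253),
}


def ba_minus_xba(ba, xba):
    """Actual minus expected batting average, rounded to three decimals."""
    return round(ba - xba, 3)


gaps = {name: ba_minus_xba(ba, xba) for name, (ba, xba) in SEASONS.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:+.3f}")
```

Santana’s gap comes out to -.054, while Solano’s 2019 BA sat within .009 of his xBA despite the .409 BABIP.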

Short and simple. Part of me thinks that the continued use of BABIP has to do with how it rolls off the tongue, but there are better options out there. The Farmer’s Almanac may have been a breakthrough at one time, but you wouldn’t use it to predict the weather now, would you?

 

Photos by Andy Lewis and Nick Wosika/Icon Sportswire & Jordan Rowland/Unsplash | Adapted by Justin Redler (@reldernitsuj on Twitter)

Andrew Krutz

Andrew writes for Pitcher List and is a lifelong New York Yankees fan. During the warmer months he can be found playing vintage baseball in the Catskill Mountains of Upstate New York.

  • Ryan says:

    Excellent article! I have an example of BABIP and its uselessness. Lindor’s BABIP was horrible in the minors and the first half of his rookie season … because he was hitting everything into the ground ten feet in front of the plate. Then… he joined the launch angle revolution and the rest is history. So in conclusion BABIP is useless but becomes awesome when you pair it with analysis regarding injuries and hitting mechanics.

    My favourite BABIP analysis for 2021: Paul Goldschmidt. He talked about his bothersome elbow for two spring trainings in a row and just had surgery on it. 2020 BABIP says he can still hit, but the elbow injury says he was sapped of power. This dude is primed for one more amazing season and is one of the best value plays in drafts.

  • Andrew Krutz says:

    Thank you. I agree. Without context BABIP doesn’t tell us much. If you pair it with other metrics it can provide value. Assuming any outcome will regress closer to the mean just because it is high or low is not a good idea.

  • Ryan says:

    Attempting to validate if a player will “Regress closer to the mean” actually serves two roles. This article focuses on using BABIP to validate whether or not the law of averages plays out.

    However, there is a flip side and a secondary role of BABIP analysis. Fantasy baseball is about finishing first. While guys who positively regress to their averages are always good for finding draft value in fantasy, to win you need first place, and for that you need the players who play the best in the league. You need to find the players who will blow out their norms (BABIP), set the league on fire, and take their game to a whole new level.

    Contrasting rookie Lindor with 2020 Lindor helps illustrate this. With rookie Lindor, BABIP coupled with hitting data was helpful to hunt for a breakout. With 2020 Lindor, BABIP coupled with hitting data will be helpful to determine if he will return to career norms in 2021.

    Rookie Brandon Lowe was another great example of using BABIP to hunt for a breakout: high BABIP, hitting at everything, pundits felt it was unsustainable. What if he had adapted to the league and he was just getting good pitches with his new hitting strategy? What if he just did not have the chance in a limited rookie sample size to play through the league adjusting to him and prove he in turn could adjust off other stuff? We know he looked great in 2020. Looking forward when predicting 2021 Brandon Lowe, we also know he cratered in the 2020 playoffs, suggesting pitchers leaned into a secondary wave of adjustments, so the question now is how he will adjust again.

    These are two examples of how both high and low BABIP can be leveraged to find breakouts, not just regression to the mean.
