Author’s Note: Prior to writing this article, I failed to do adequate research and did not recognize that Jesse Roche had covered a very similar topic in his October 2021 article titled Adjusted Called Strikes: Not All Strikes Are Created Equal. While our two pieces don’t mirror each other exactly, I want to apologize to Jesse and the folks at Baseball Prospectus for this error, and I highly recommend that you check out the excellent work that they did.
As it stands today, CSW rate is one of the most commonly referenced statistics in the online baseball community. Coined by our very own Nick Pollack, and formalized in an article by Alex Fast (with help from Colin Charles) in 2019, it has become arguably the most favored snapshot of pitcher performance.
Many metrics exist in our little corner of the internet to help us gain insight into player performance. Analytics-inclined folks like myself know that most of them range from cumbersome to outright overwhelming to calculate for anyone who isn’t as in tune with that side of the game. CSW’s elegance comes from its simplicity and accessibility. Three numbers, all easy to find on Baseball Savant, are all you need to calculate CSW:
- Called Strikes
- Whiffs
- Total Pitches Thrown
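For reference, CSW rate is called strikes plus whiffs, divided by total pitches thrown. Here is a minimal Python sketch of that arithmetic. The function name and the season line are made up for illustration and do not belong to a real pitcher.

```python
def csw_rate(called_strikes: int, whiffs: int, total_pitches: int) -> float:
    """Return CSW rate: (called strikes + whiffs) / total pitches."""
    if total_pitches <= 0:
        raise ValueError("total_pitches must be positive")
    return (called_strikes + whiffs) / total_pitches


# Hypothetical season line, not a real pitcher.
example = csw_rate(called_strikes=450, whiffs=350, total_pitches=2800)
print(f"CSW rate: {example:.1%}")
```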
While CSW is already an excellent metric, my thoughts kept wandering back to it over the offseason. The idea I fixated on was that called strikes and whiffs (which I’ll collectively be dubbing “earned strikes” for the rest of this piece) should not all be valued equally.
Before doing any Baseball Savant searches or hard research, my hunch was that if you broke down earned strikes by the count they were thrown in, the CSW rate would be higher in counts with no strikes, and decrease as the batter was closer to striking out.
To put it simply, as the count became more competitive, CSW rates would decrease.
Using two of an amateur stats junkie’s favorite tools—Baseball Savant search and Microsoft Excel—and the 2021 season as a test, I calculated the league-wide CSW rate for last year, along with a breakdown of CSW rate in each count:
It’s abundantly clear that earned strikes become more scarce as the batters get closer to striking out.
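If you’d rather script the by-count breakdown than build it in Excel, something like the following Python sketch would do it. It is an illustration under assumptions, not the spreadsheet I actually used: the outcome labels and the tuple layout are placeholders, so map them to whatever your own data export calls them.

```python
from collections import defaultdict

# Illustrative outcome labels (an assumption); adjust to match your data export.
EARNED_STRIKES = {"called_strike", "swinging_strike"}


def csw_by_count(pitches):
    """Return {(balls, strikes): CSW rate} for an iterable of
    (balls, strikes, outcome) tuples, one tuple per pitch."""
    totals = defaultdict(int)
    earned = defaultdict(int)
    for balls, strikes, outcome in pitches:
        count = (balls, strikes)
        totals[count] += 1
        if outcome in EARNED_STRIKES:
            earned[count] += 1
    return {count: earned[count] / totals[count] for count in totals}


# Tiny made-up sample, only to show the shape of the input and output.
sample = [
    (0, 0, "called_strike"),
    (0, 0, "ball"),
    (0, 1, "swinging_strike"),
    (0, 1, "foul"),
    (0, 2, "ball"),
    (0, 2, "in_play"),
    (0, 2, "swinging_strike"),
    (0, 2, "foul"),
]
for count, rate in sorted(csw_by_count(sample).items()):
    print(f"{count[0]}-{count[1]}: {rate:.1%}")
```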
With this in mind, how do we determine values for a count-dependent earned strike? I was starting to get a bit out of my element, so I enlisted the help of Jeff Nicholas, data analyst extraordinaire here at Pitcher List, and dove in.
Balancing the Scales
Our first stab at assigning values to earned strikes by count was to calculate them using the expected run value. The chasm between each of these values was so wide—particularly once we got to 2-strike counts—that it rapidly became clear we weren’t on the right track. So, that idea was scrapped as quickly as it was adopted. In our second attempt, Jeff suggested calculating the Z-scores for CSW rate in each count, adding those Z-scores to 1, and using that sum as the value for an earned strike for each count. Here are the results:
This looked much better, as it better reflected the value of count-dependent earned strikes in relation to the CSW rate across our entire sample. But we had to see if our work bore any fruit. For that, we went back to where it all began: Alex Fast’s original article introducing CSW rate.
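For the curious, here is one literal reading of the “Z-score plus 1” idea as a Python sketch. Several details are my assumptions and not a record of what Jeff ran: the z-scores are taken against the unweighted mean of the per-count rates, the spread is the population standard deviation, and the sign is left as-is. The rates in the example are made up and are not the 2021 league numbers.

```python
from statistics import mean, pstdev


def count_weights(csw_by_count):
    """Return {count: 1 + z}, where z is the z-score of that count's CSW rate.

    Assumptions (not confirmed by the article): z-scores are taken against
    the unweighted mean of the per-count rates, using the population
    standard deviation.
    """
    rates = list(csw_by_count.values())
    mu = mean(rates)
    sigma = pstdev(rates)
    if sigma == 0:
        # Every count has the same rate, so every earned strike is worth 1.
        return {count: 1.0 for count in csw_by_count}
    return {count: 1 + (rate - mu) / sigma for count, rate in csw_by_count.items()}


# Made-up rates for three counts, NOT the 2021 league numbers.
made_up = {(0, 0): 0.30, (0, 1): 0.28, (0, 2): 0.20}
weights = count_weights(made_up)
for count, weight in weights.items():
    print(f"{count[0]}-{count[1]}: {weight:.3f}")
```

Under this reading, a count more than one standard deviation below the mean gets a weight below zero, so a real implementation would need to settle the sign convention and the reference mean before these weights are used.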
More Descriptive than CSW
One of the first high-correlation comparisons Alex found was between CSW rate and strikeout rate. This seemed like the most logical place to start, so I asked Jeff to generate scatter plots to determine the strength of correlation of both CSW and wES rates with strikeout rate. We wanted to stick to pitchers with a high volume of work (starters/high-usage relievers), so we set a minimum of 1000 pitches thrown to “qualify” for our test. The strength of correlation between two data sets is found by fitting a linear regression and calculating r². This will spit out a number between 0 and 1; the larger the number, the stronger the correlation.
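For a straight-line fit with one predictor, r² is simply the squared Pearson correlation between the two columns. Here is a small Python sketch of that calculation. The function name and the toy numbers are illustrative only and are not our actual data.

```python
def r_squared(xs, ys):
    """r^2 of a simple linear regression of ys on xs, computed as the
    squared Pearson correlation between the two lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined when either input is constant")
    return (sxy * sxy) / (sxx * syy)


# Toy inputs: a perfectly linear pair and a noisier one.
print(r_squared([1, 2, 3, 4], [2, 4, 6, 8]))
print(r_squared([1, 2, 3], [1, 3, 2]))
```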
Here’s what we came up with:
While it’s intuitive that K% would correlate well with both of these metrics, the fact that wES rate handily outpaces classic CSW means two things:
- wES rate is definitively more descriptive than CSW rate; and
- Jeff and I are absolutely on the right track.
You may ask why we didn’t look at swinging-strike rate or Whiff rate like Alex and Colin did. The reason is that they already proved with their work that CSW is consistently better than both of those metrics at describing a pitcher’s success. The only thing we needed to prove with wES rate is that it’s better than CSW, and we’ve done that.
But what about a predictive standpoint?
More Predictive than CSW
Alex and Colin also found strong correlations between CSW and SIERA, one of the most prominent ERA estimators. That was encouraging because it put them on a path to determining CSW rate’s predictiveness relative to other commonly used stats at the time. Following the trail that they cut through the jungle for us, Jeff and I pushed forward in the same manner as we did with K%. Here’s how wES rate stacked up against CSW in correlation to SIERA for the 2021 season:
Once again, wES rate emerges with a stronger correlation than its predecessor. I was vibrating with excitement at this point, but our work wasn’t quite done; we still had one more test to run to determine the relative predictiveness of wES rate. I’ll let Alex explain it:
“I figured the best way to determine whether CSW had predictive qualities was to take CSW rates from previous seasons and compare them to the following year’s SIERA to see if there was a change. For example, if a pitcher posted a 30% CSW rate in 2016, and a 38% CSW rate in 2017, we would hope to see their SIERA go down from the ’16 to ’17 season.”
Following the path that Alex and Colin cut for us one last time, I asked Jeff to find every instance of a pitcher throwing at least 1000 pitches in consecutive seasons during the Statcast era (threshold dropped to 500 pitches for the pandemic-shortened 2020 season), measure the difference in SIERA between year one and year two, and see if those changes more strongly correlated with changes in CSW rate or wES rate. Here’s what we came up with:
Differences in Year 1 to Year 2 SIERA compared to wES% (left) and CSW% (right) from 2015-2021
Bingo! That’s a meaningful jump in predictiveness!
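The pairing step in that test can be sketched in Python as follows. This is a hedged illustration, not Jeff’s code: the dictionary layout, the field names, and the sample rows are all hypothetical, and only the pitch thresholds (1000, or 500 for 2020) come from the description above.

```python
def min_pitches(season):
    """Qualifying threshold: 500 pitches for 2020, 1000 otherwise."""
    return 500 if season == 2020 else 1000


def year_over_year_deltas(seasons, metric):
    """Return (change in metric, change in SIERA) for every pitcher who
    qualified in two consecutive seasons.

    `seasons` maps (pitcher, year) to a dict with "pitches", "siera",
    and the chosen metric. This layout is hypothetical.
    """
    deltas = []
    for (pitcher, year), row in seasons.items():
        following = seasons.get((pitcher, year + 1))
        if following is None:
            continue
        if row["pitches"] < min_pitches(year):
            continue
        if following["pitches"] < min_pitches(year + 1):
            continue
        deltas.append((following[metric] - row[metric],
                       following["siera"] - row["siera"]))
    return deltas


# Made-up rows, not real pitchers.
seasons = {
    ("A", 2019): {"pitches": 2500, "csw": 0.28, "siera": 4.20},
    ("A", 2020): {"pitches": 900, "csw": 0.31, "siera": 3.70},
    ("A", 2021): {"pitches": 400, "csw": 0.29, "siera": 4.00},
    ("B", 2020): {"pitches": 450, "csw": 0.27, "siera": 4.50},
    ("B", 2021): {"pitches": 2200, "csw": 0.30, "siera": 3.90},
}
pairs = year_over_year_deltas(seasons, "csw")
print(pairs)
```

The resulting pairs can then be fed into whatever correlation routine you prefer, once for CSW and once for wES.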
Wrapping Up wES Rate
How quickly does wES rate stabilize in-season?
This is probably one of the more exciting discoveries that we made with wES rate (shout out to Alex Fast for asking about it when I had him review this piece). Jeff ran the numbers using code from Jonah Pemstein at FanGraphs and found that CSW stabilizes at ~700 pitches thrown, while wES stabilizes at ~570 pitches (chart below). Another improvement!
Stabilization comparison between wES% and CSW% – Target Alpha of 0.7
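I haven’t reproduced Jonah’s code here. Assuming the “alpha” in that caption is Cronbach’s alpha, a standard reliability measure, the textbook formula looks like this in Python. The table layout (one row per pitcher, one column per sampled pitch) and the toy data are my assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per pitcher and one
    column per item (for example, one column per sampled pitch)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two pitchers")
    if any(len(row) != k for row in rows):
        raise ValueError("every row must have the same number of items")
    # Sample variance of each item (column) across pitchers.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each pitcher's total across all items.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary across pitchers")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: 1 = earned strike, 0 = anything else. Perfectly consistent
# pitchers, so alpha comes out at its maximum.
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(round(cronbach_alpha(toy), 6))
```

Under this reading, a stabilization point is the smallest number of pitches per pitcher at which alpha reaches the target of 0.7. For wES, each cell would presumably hold the count-weighted value of the pitch instead of a plain 1 or 0.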
Is there going to be an easy way to track this throughout the season?
It’s above my current skill set, but I will eventually be working on a leaderboard for wES that will probably live on Tableau or some such platform. And hey, maybe I can convince Nick and the dev team here at PL to get it on the player pages if people like it enough!
In the meantime, here’s a 2022 wES leaderboard (minimum 570 pitches thrown) for you to check out. Until I get something more official going, I’ll try to come back and update this table once per week. Any further updates will likely be posted on my Twitter page.
Special thanks to Jeff Nicholas