
wES Rate: Creating an Improved CSW Rate

Not all earned strikes are created equal.

Author’s Note: Prior to writing this article, I failed to do adequate research and did not recognize that Jesse Roche had covered a very similar topic in his October 2021 article titled “Adjusted Called Strikes: Not All Strikes Are Created Equal.” While our work isn’t a perfect mirror of one another’s, I want to apologize to Jesse and the folks at Baseball Prospectus for this error, and I highly recommend that you check out the excellent work that they did.


As it stands today, CSW rate is one of the most commonly referenced statistics in the online baseball community. Coined by our very own Nick Pollack, and formalized in an article by Alex Fast (with help from Colin Charles) in 2019, it has arguably become the most favored snapshot of pitcher performance.

Many metrics exist in our little corner of the internet to help us gain insight into player performance. Analytics-inclined folks like myself know that most of them range from cumbersome to outright overwhelming to calculate for those who may not be as in tune with that side of the game. CSW’s elegance comes from its simplicity and accessibility. Three numbers, easily accessible through Baseball Savant, are all you need to calculate CSW:

  1. Called Strikes
  2. Whiffs
  3. Total Pitches Thrown
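As a quick illustration of how those three numbers combine, here is a minimal Python sketch. The function name and the pitch totals are made up for the example; only the formula (called strikes plus whiffs, divided by total pitches) comes from the definition of CSW.

```python
def csw_rate(called_strikes: int, whiffs: int, total_pitches: int) -> float:
    """Return CSW rate: (called strikes + whiffs) / total pitches."""
    if total_pitches <= 0:
        raise ValueError("total_pitches must be positive")
    return (called_strikes + whiffs) / total_pitches


# Made-up totals for illustration only, not a real pitcher's line.
example = csw_rate(called_strikes=500, whiffs=400, total_pitches=3000)
print(f"{example:.1%}")  # prints 30.0%
```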

While CSW is already an excellent metric, I found myself having wandering thoughts about it over the offseason. The part that I found myself fixated on was the idea that all called strikes and whiffs (which I’ll collectively be dubbing “earned strikes” for the rest of this piece) should not be valued equally.

Before doing any Baseball Savant searches or hard research, my hunch was that if you broke down earned strikes by the count they were thrown in, the CSW rate would be higher in counts with no strikes, and decrease as the batter was closer to striking out.

To put it simply, as the count became more competitive, CSW rates would decrease.

Using two of an amateur stats junkie’s favorite tools (Baseball Savant search and Microsoft Excel) and the 2021 season as a test, I calculated the league-wide CSW rate for last year, along with a breakdown of CSW rate in each count:

CSW by Count – 2021

It’s abundantly clear that earned strikes become more scarce as the batters get closer to striking out.
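The by-count breakdown was done with Baseball Savant search and Excel, but the same grouping can be sketched in a few lines of Python. Everything here is an assumption for illustration: the record layout `(balls, strikes, outcome)`, the outcome labels, and the six toy pitches are placeholders rather than Baseball Savant's actual field values or real 2021 data.

```python
from collections import defaultdict

# Placeholder outcome labels that count as "earned strikes" in this sketch.
EARNED = {"called_strike", "whiff"}


def csw_by_count(pitches):
    """Map each (balls, strikes) count to the CSW rate of pitches thrown in it."""
    totals = defaultdict(int)
    earned = defaultdict(int)
    for balls, strikes, outcome in pitches:
        totals[(balls, strikes)] += 1
        if outcome in EARNED:
            earned[(balls, strikes)] += 1
    return {count: earned[count] / n for count, n in totals.items()}


# Toy pitch log, invented for the example.
sample = [
    (0, 0, "called_strike"),
    (0, 0, "ball"),
    (0, 1, "whiff"),
    (0, 2, "foul"),
    (0, 2, "in_play"),
    (0, 2, "whiff"),
]
rates = csw_by_count(sample)
for count in sorted(rates):
    print(count, round(rates[count], 3))
```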

With this in mind, how do we determine values for a count-dependent earned strike? I was starting to get a bit out of my element, so I enlisted the help of Jeff Nicholas, data analyst extraordinaire here at Pitcher List, and dove in.

 

Balancing the Scales

 

Our first stab at assigning values to earned strikes by count was to calculate them using the expected run value. The chasm between each of these values was so wide (particularly once we got to 2-strike counts) that it rapidly became clear we weren’t on the right track. So, that idea was scrapped as quickly as it was adopted. In our second attempt, Jeff suggested calculating the Z-scores for CSW rate in each count, adding those Z-scores to 1, and using that sum as the value for an earned strike for each count. Here are the results:

Earned Strike Values by Count – 2021

This looked much better, as it better reflected the value of count-dependent earned strikes in relation to the CSW rate across our entire sample. But we had to see if our work bore any fruit. For that, we went back to where it all began: Alex Fast’s original article introducing CSW rate.
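The Z-score weighting described above can be sketched as follows. This is a literal reading of the description, not a reproduction of Jeff's calculation, and several details are assumptions: the mean and standard deviation are taken across the per-count CSW rates, the population standard deviation is used, and the sign follows the text as written (1 plus the Z-score). The text also does not spell out how the values roll up into a rate, so `wes_rate` below assumes the simplest option: sum the value of every earned strike and divide by total pitches. The toy rates and counts are invented.

```python
from statistics import mean, pstdev


def earned_strike_values(csw_by_count):
    """Return 1 + Z-score of each count's CSW rate (assumed reading of the text)."""
    rates = list(csw_by_count.values())
    mu = mean(rates)
    sigma = pstdev(rates)  # population standard deviation: an assumption
    if sigma == 0:
        # All counts have the same rate, so every earned strike is worth 1.
        return {count: 1.0 for count in csw_by_count}
    return {count: 1 + (rate - mu) / sigma for count, rate in csw_by_count.items()}


def wes_rate(earned_by_count, total_pitches, values):
    """Assumed aggregation: weighted earned strikes divided by total pitches."""
    if total_pitches <= 0:
        raise ValueError("total_pitches must be positive")
    weighted = sum(values[count] * n for count, n in earned_by_count.items())
    return weighted / total_pitches


# Toy inputs, invented for the example.
toy_rates = {"0-0": 0.40, "0-1": 0.30, "0-2": 0.20}
values = earned_strike_values(toy_rates)
toy_earned = {"0-0": 30, "0-1": 20, "0-2": 10}
print(values)
print(wes_rate(toy_earned, total_pitches=200, values=values))
```

With every value set to 1, `wes_rate` reduces to plain CSW rate, which is a handy sanity check on whatever weights are plugged in.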

 

More Descriptive than CSW

 

One of the first high-correlation comparisons Alex found was between CSW rate and strikeout rate. This seemed like the most logical place to push towards, so I asked Jeff to generate scatter plots to determine the strength of correlation of both CSW and wES rates with strikeout rate. We wanted to stick to pitchers with a high volume of work (starters/high-usage relievers), so we set a minimum of 1000 pitches thrown to “qualify” for our test. The strength of correlation between two data sets is found by using a linear regression formula to calculate r². This formula will spit out a number between 0 and 1; the larger the number, the stronger the correlation.
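For readers who want to see what that calculation looks like, here is a minimal sketch of r² for a simple one-variable linear regression, where it equals the squared Pearson correlation. The function name and the toy numbers are illustrative only; they are not the pitcher data behind the scatter plots.

```python
from math import fsum


def r_squared(xs, ys):
    """Squared Pearson correlation, equal to r^2 of a one-variable linear fit."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    return (sxy * sxy) / (sxx * syy)


# Toy data: a perfectly linear pair and a weakly related pair.
print(r_squared([1, 2, 3], [2, 4, 6]))
print(r_squared([1, 2, 3, 4], [1, -1, 1, -1]))
```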

Here’s what we came up with:

 

 

While it’s intuitive that K% would correlate well with both of these metrics, the fact that wES rate handily outpaces classic CSW means two things:

  1. wES rate is definitively more descriptive than CSW rate; and
  2. Jeff and I are absolutely on the right track.

You may ask why we didn’t look at swinging-strike rate or Whiff rate like Alex and Colin did. The reason is that they already proved with their work that CSW is consistently better than both of those metrics at describing a pitcher’s success. The only thing we needed to prove with wES rate is that it’s better than CSW, and we’ve done that.

But what about a predictive standpoint?

 

More Predictive than CSW

 

Alex and Colin also found strong correlations between CSW and SIERA, one of the most prominent ERA estimators. This is encouraging because it put them on a path to determining CSW rate’s predictiveness relative to other commonly used stats at the time. Following the trail that they cut through the jungle for us, Jeff and I pushed forward in the same manner as we did with K%. Here’s how wES rate stacked up against CSW in correlation to SIERA for the 2021 season:

 

 

Once again, wES rate emerges with a stronger correlation than its predecessor. I was vibrating with excitement at this point, but our work wasn’t quite done; we still had one more test to run to determine the relative predictiveness of wES rate. I’ll let Alex explain it:

 

“I figured the best way to determine whether CSW had predictive qualities was to take CSW rates from previous seasons and compare them to the following year’s SIERA to see if there was a change. For example, if a pitcher posted a 30% CSW rate in 2016, and a 38% CSW rate in 2017, we would hope to see their SIERA go down from the ’16 to ’17 season.”

 

Following the path that Alex and Colin cut for us one last time, I asked Jeff to find every instance of a pitcher throwing at least 1000 pitches in consecutive seasons during the Statcast era (with the threshold dropped to 500 pitches for the pandemic-shortened 2020 season), measure the difference in SIERA between year one and year two, and see if those changes more strongly correlated with changes in CSW rate or wES rate. Here’s what we came up with:
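The pairing step described above can be sketched like this. The data layout (a dict keyed by pitcher and season), the field names, and the toy rows are all hypothetical; only the thresholds (1000 pitches, 500 for 2020) and the idea of differencing consecutive seasons come from the text. The resulting pairs of changes are what would then be fed into an r² calculation.

```python
def min_pitches(season):
    """Qualifying threshold: 500 pitches for 2020, 1000 otherwise."""
    return 500 if season == 2020 else 1000


def year_over_year_deltas(rows):
    """Return (metric change, SIERA change) for each qualifying consecutive-season pair.

    rows maps (pitcher, season) to a dict with 'pitches', 'metric', and 'siera'.
    """
    deltas = []
    for (pitcher, season), year_one in rows.items():
        year_two = rows.get((pitcher, season + 1))
        if year_two is None:
            continue  # no consecutive season on record
        if year_one["pitches"] < min_pitches(season):
            continue
        if year_two["pitches"] < min_pitches(season + 1):
            continue
        deltas.append(
            (year_two["metric"] - year_one["metric"],
             year_two["siera"] - year_one["siera"])
        )
    return deltas


# Toy rows, invented for the example.
rows = {
    ("A", 2019): {"pitches": 1500, "metric": 0.30, "siera": 4.00},
    ("A", 2020): {"pitches": 600, "metric": 0.32, "siera": 3.70},
    ("A", 2021): {"pitches": 900, "metric": 0.31, "siera": 3.90},
    ("B", 2018): {"pitches": 2000, "metric": 0.28, "siera": 4.20},
    ("B", 2020): {"pitches": 700, "metric": 0.29, "siera": 4.10},
}
pairs = year_over_year_deltas(rows)
print(pairs)
```

In the toy data only pitcher A's 2019 to 2020 pair qualifies: A's 2021 falls short of 1000 pitches, and B's seasons are not consecutive.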

 

Differences in Year 1 to Year 2 SIERA compared to wES% (left) and CSW% (right) from 2015-2021

 

Bingo! That’s a meaningful jump in predictiveness!

 

Wrapping Up wES Rate

 

How quickly does wES rate stabilize in-season?

This is probably one of the more exciting discoveries that we made with wES rate (shout out to Alex Fast for asking about it when I had him review this piece). Jeff ran the numbers using this code from Jonah Pemstein at FanGraphs, and CSW stabilizes at ~700 pitches thrown, while wES stabilizes at ~570 pitches (chart below). Another improvement!

Stabilization comparison between wES% and CSW% – Target Alpha of 0.7
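The referenced FanGraphs code is not reproduced here. As a rough illustration only, and assuming the "alpha" in the chart title refers to Cronbach's alpha, the reliability figure could be computed as below, with one row per pitcher and one column per pitch (1 for an earned strike, 0 otherwise, or the count-based value for wES). Under that assumption, the stabilization point would be the smallest number of pitches at which alpha reaches the 0.7 target. The matrix is toy data.

```python
from statistics import variance


def cronbach_alpha(matrix):
    """Cronbach's alpha for rows = pitchers, columns = pitches (equal-length rows)."""
    if len(matrix) < 2:
        raise ValueError("need at least two pitchers")
    k = len(matrix[0])
    if k < 2 or any(len(row) != k for row in matrix):
        raise ValueError("need at least two pitches per pitcher, same length per row")
    # Sample variance of each pitch slot across pitchers.
    item_vars = [variance(col) for col in zip(*matrix)]
    # Sample variance of each pitcher's total.
    total_var = variance([sum(row) for row in matrix])
    if total_var == 0:
        raise ValueError("pitcher totals must vary")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: pitchers whose pitches are perfectly consistent with each other.
toy = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(cronbach_alpha(toy))
```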

 

Is there going to be an easy way to track this throughout the season?

It’s above my current skill set, but I will eventually be working on a leaderboard for wES that will probably live on Tableau or some such platform. And hey, maybe I can convince Nick and the dev team here at PL to get it on the player pages if people like it enough!

Update 1/10/2023: Here is the full 2022 leaderboard! I set the minimum pitches threshold to 1200 to do two things:

  1. Weed out most of the relievers (might talk about them in a different article altogether); and
  2. Show off all the folks who got meaningful opportunities as starters in 2022.

Enjoy!

2022 wES Rate Leaderboard – Updated 1/10/2023

Special thanks to Jeff Nicholas

Photos by Icon Sportswire | Adapted by Justin Redler (@reldernitsuj on Twitter)

Jordan White

Based in Milwaukee, WI, Jordan has a love of tabletop gaming, voice-over, and plant-based cooking/baking. You're encouraged to follow him on Twitter (@BuntSingles), where you will more often get photos of his food than takes on baseball.

4 responses to “wES Rate: Creating an Improved CSW Rate”

  1. Yants says:

    Haven’t fully read the final product but wanted to be the first commenter. Great work, Jor Boar. Excited to dig in.

    • Jordan White says:

      Love you buddy!

      • michael says:

        great article jordan, really superb. 2 questions. adding 1 to z-scores seems to change their relative rates (i think) but i assume it must be necessary – could you give a quick explanation as to why? second, in terms of predictive value does this year’s wES predict next year’s SIERA better than this year’s SIERA does; as well, does including both SIERA and wES from this year predict SIERA better than either one of those variables independently?

  2. Ron says:

    Huge. Great work. This has been in the back of my head for 2 years as I watch batters often just take a 2-0 or 3-0 pitch depending on the situation where any schmo is basically guaranteed a 100% CS on that pitch if they can get it in the zone. Meanwhile on 1-2, hitters almost will never let a CS get by.

    I love where this is going. Once you have the SQL in the background powering it, the next logical variable is expanding your matrix beyond count and adding factors such as score, outs, and runners on base. All factor into batter decisions to swing or not and could potentially increase your r².
