As sabermetrics has progressed, analysts have developed increasingly accurate ways to measure the real value that players or individual events add. At the foundation of these calculations lies the Run Expectancy Matrix. For those unfamiliar with it, its meaning and purpose are simple to understand. The matrix consists of 24 boxes, one for each of the 24 base-out states (the combinations of runners on base and number of outs) in baseball. For each base-out state, the matrix shows the average number of runs that a team scored from that situation through the end of the inning during the sample. Knowing the expected runs for each scenario lets teams optimize their strategies to produce as many runs as possible.
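To make the structure concrete, here is a minimal sketch of how such a matrix could be assembled. The record layout (a base-out state paired with the runs scored over the rest of the inning) and the three-character base codes are assumptions for illustration, and the toy records are not real game data.

```python
from collections import defaultdict
from itertools import product

# Base codes mark first, second, third base: "1" if occupied, "0" if empty.
BASE_STATES = ["".join(flags) for flags in product("01", repeat=3)]
OUT_STATES = (0, 1, 2)
# 8 base configurations x 3 out counts = 24 base-out states.
BASE_OUT_STATES = [(bases, outs) for bases in BASE_STATES for outs in OUT_STATES]


def build_run_expectancy(observations):
    """Average the runs scored from each base-out state to the end of the inning.

    `observations` is an iterable of ((bases, outs), runs_rest_of_inning) pairs.
    This record layout is hypothetical, chosen only for the sketch.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, runs in observations:
        totals[state] += runs
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in counts}


# Toy records (not real data): bases empty with 0 outs, observed three times.
empty_no_outs = ("000", 0)
toy = [(empty_no_outs, 0), (empty_no_outs, 1), (empty_no_outs, 2)]
matrix = build_run_expectancy(toy)
print(len(BASE_OUT_STATES), matrix[empty_no_outs])  # prints: 24 1.0
```

A full matrix would simply feed every plate appearance of a season through the same averaging step.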
Each year, varying run environments and other factors produce a different Run Expectancy Matrix. The same situation may pay off more or less than it did the previous year, so strategy has to stay dynamic and adapt to these changes. In 2022, the league faced an epidemic of low offense. Pitchers continually proved to be the dominant force, producing more low-scoring games than the league generally sees. With fewer runs comes a lower run expectancy for most situations, so a review of the current matrix is necessary to adjust common strategies and yield the best outcomes for teams.
Examining the Run Expectancy Matrices
Pictured above is the 2022 Run Expectancy Matrix, with each box showing the average runs scored in each situation. Even to a beginner, the trends should seem obvious. Run expectancy tends to rise when more runners are on base or when they are farther along the bases. It also rises when the team has fewer outs. These conclusions could have been reached through conventional baseball wisdom alone, but having data to confirm them is no bad thing. These numbers are meaningful only with context. Without it, the chart just appears to be a random sample from a tiny sliver of time. Accordingly, the following is the average Run Expectancy Matrix from 2016-2021, excluding the infamous shortened 2020 season.
2016-2019, 2021 Aggregate Matrix
Even a quick overview makes it apparent that more runs were scored in the latter chart than in the former. In fact, the 2022 season averaged only 4.28 runs per game per team, while the seasons in the latter chart averaged 4.59. The offensive environment seems to be meaningfully different, but there is a firmer way to show this: a paired t-test. By weighing the cell-by-cell differences between the two matrices against their variability, the test indicates whether the two samples differ by more than chance would explain. The paired t-test conducted on the two matrices yielded a p-value below 0.05, meaning the difference is statistically significant at the 5% level. Ergo, teams likely need to adjust their strategy for 2022 compared to prior seasons, at least to an extent. That may sound obvious, but some established baseball minds do not like to vary their approach from year to year. Measurable data will hopefully help address that reluctance.
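For readers who want to see what the paired test does, the following is a minimal sketch using only the Python standard library. It computes the t statistic on matched cells; the function name and the short input lists are placeholders for illustration and are not the values from either matrix.

```python
import math
import statistics


def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched cells."""
    if len(sample_a) != len(sample_b):
        raise ValueError("paired samples must have the same length")
    # The paired test works on the cell-by-cell differences.
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Placeholder values only, NOT the article's matrices.
low_env = [0.45, 0.24, 0.09, 0.82, 0.48, 0.20]
high_env = [0.50, 0.27, 0.10, 0.90, 0.53, 0.23]
t, df = paired_t_statistic(low_env, high_env)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

With all 24 cells there are 23 degrees of freedom, and a two-sided test at the 5% level rejects when |t| exceeds roughly 2.07. A library routine such as SciPy's `ttest_rel` would return the p-value directly.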
When a team is deciding whether to proceed with a given action, it first needs to know the success rate required to produce extra runs in the long run. With a Run Expectancy Matrix, this is quite simple. Only three boxes from the matrix need to be known: the current run expectancy (variable x), the run expectancy if the attempt fails (variable y), and the run expectancy if the attempt succeeds (variable z). From there, the team needs to find the break-even point, or variable B. This requires the differences between y and x and between z and x, which give the number of runs the team can expect to lose or gain depending on the outcome. Once the differences are calculated, the absolute value of the failure difference is divided by the sum of the absolute values of the success difference and the failure difference. This yields the success rate at which the event neither gains nor loses runs, the aforementioned break-even point. This can be difficult to follow in words, hence the formula below:
Break-Even Point (B) = |y – x| /(|y – x| + |z – x|)
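The formula follows from setting the expected change in runs to zero, B(z - x) + (1 - B)(y - x) = 0, and solving for B. A minimal sketch of the calculation is below; the function name and the sample run expectancies are placeholders for illustration, not values from either matrix.

```python
def break_even(x, y, z):
    """Success rate at which attempting the play neither gains nor loses runs.

    x: run expectancy of the current state
    y: run expectancy after a failed attempt
    z: run expectancy after a successful attempt
    """
    loss = abs(y - x)  # runs given up when the attempt fails
    gain = abs(z - x)  # runs added when the attempt succeeds
    if loss + gain == 0:
        raise ValueError("success and failure states match the current state")
    return loss / (loss + gain)


# Placeholder run expectancies, not taken from either matrix.
b = break_even(x=1.00, y=0.35, z=1.35)
print(f"{b:.2f}")  # prints: 0.65
```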
Once a team knows the break-even point, its projections for its own players can yield fairly straightforward answers. If a team projects a given player to succeed 70% of the time at a given action, and the break-even point is 65%, then the team will elect to go forward with the action. In the long term, it will produce more runs by attempting the action than by holding back. If the projected success rate were 60%, the team would not go forward. Over a large sample, it would end up losing runs by attempting such an event. This simple equation has been key to helping teams make logical decisions, opting for the choices that are most likely to pay off rather than what a given leader feels in his gut. Subjectivity is almost completely taken out of the equation, leaving a framework of objective thought. Comparing break-even points will be the main method for deciding whether certain strategies make sense across changing run environments.
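The 70% versus 60% comparison can be checked directly by computing the expected change in run expectancy at a projected success rate. This is a sketch under assumed inputs: the function names are illustrative, and the three run expectancies are placeholders chosen so that the break-even point is 65%.

```python
def expected_run_change(p, x, y, z):
    """Average change in run expectancy from attempting the play.

    p: projected success rate; x, y, z: run expectancy of the current state,
    the state after failure, and the state after success.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    return p * (z - x) + (1 - p) * (y - x)


def should_attempt(p, x, y, z):
    """Attempt only when the projected success rate yields a positive change."""
    return expected_run_change(p, x, y, z) > 0


# Placeholder run expectancies giving a 65% break-even point.
x, y, z = 1.00, 0.35, 1.35
print(should_attempt(0.70, x, y, z))  # prints: True
print(should_attempt(0.60, x, y, z))  # prints: False
```

Checking the sign of the expected change is equivalent to comparing the projection against the break-even point, and it also shows how many runs are at stake per attempt.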
The sheer number of game strategy options makes a full consideration of every implication of a differing run environment too large for one article. In fact, hundreds of pages of Tom Tango’s The Book: Playing the Percentages in Baseball are based mainly on using matrices for different game situations. To provide value to the reader, this article will highlight a few key differences in strategy that should be adopted with the new 2022 run expectancy matrix.
Sacrificing On Purpose
Conventional wisdom loves the idea of sacrifices and their ability to move runners over. Sabermetrics hates the idea of sacrifices and their tendency to cost teams outs and overall runs. In considering these situations, it is important to define which outcome is being evaluated. When a team attempts a sacrifice, it generally has no intention of getting the batter on base. It may happen, but not often, and it is not the primary objective. The same goes for double plays on sacrifice attempts, which are few and far between. Hence, only the intended outcome of a sacrifice play will be considered. Break-even points will not be considered, since treating a successful sacrifice as the only outcome negates the purpose of such a tool. As one might expect, the sabermetric folks are generally right about this concept. In every notable sacrifice situation in both samples, a successful sacrifice cost the offensive team runs on average. These situations include moving a man from first to second or second to third, as well as moving men from first and second to second and third (all with 0 or 1 out, of course). Still, the run environment did have an effect.
For every one of the six scenarios considered between the two samples, the lower run environment of 2022 limited the damage of sacrificing compared to the higher-run sample. As shown below, the differences were not large.
But they still prove a point. If a team wants to use a sacrifice, for whatever reason, it would generally lose fewer runs in a period when these situations produce fewer runs on average. This does provide some degree of differentiation in strategy, although teams should generally avoid deliberate sacrifices that do not score a guaranteed run.
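The six sacrifice scenarios discussed in this section can be evaluated with a short lookup against a matrix. The sketch below assumes a matrix keyed by (bases, outs), where the base code marks first, second, and third with "1" if occupied. The run expectancies are round placeholder numbers, not the 2022 or aggregate values.

```python
# (bases before, bases after) for the three sacrifice situations in the text.
SACRIFICES = {
    "first to second": ("100", "010"),
    "second to third": ("010", "001"),
    "first and second to second and third": ("110", "011"),
}


def sacrifice_run_changes(matrix):
    """Run-expectancy change of a successful sacrifice with 0 or 1 out."""
    changes = {}
    for name, (before, after) in SACRIFICES.items():
        for outs in (0, 1):
            # A successful sacrifice advances the runners and adds one out.
            changes[(name, outs)] = matrix[(after, outs + 1)] - matrix[(before, outs)]
    return changes


# Placeholder run expectancies for illustration only.
placeholder = {
    ("100", 0): 0.90, ("100", 1): 0.50,
    ("010", 0): 1.10, ("010", 1): 0.70, ("010", 2): 0.30,
    ("001", 1): 0.90, ("001", 2): 0.40,
    ("110", 0): 1.50, ("110", 1): 0.90,
    ("011", 1): 1.40, ("011", 2): 0.60,
}
changes = sacrifice_run_changes(placeholder)
for (name, outs), delta in changes.items():
    print(f"{name}, {outs} out: {delta:+.2f}")
```

Running the same function on each season's matrix and comparing the results cell by cell reproduces the kind of comparison described above.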
Stealing a Base
It is well known in the baseball world that stealing bases has somewhat fallen out of favor with many teams. As they came to understand that a high success rate was necessary to justify the practice, they gradually concluded that it no longer made sense to take those bets. Keeping the runner on the bag, in itself, held the majority of the value. The number of stolen bases may be on the downturn, but it remains a practice that even the most sabermetrically inclined teams use. If a team is aware of the risk and the required success rate, it can take a calculated chance or two to improve its odds. With run environments changing, the odds have changed. Conventional lore has it that stealing bases is more important when fewer runs are scored, but is there any truth to this?
When a team is deciding whether to steal in a given environment, it needs to know the break-even point and its own projected success rate. In highlighting stolen bases, only the main scenarios will be considered: stealing second or third base with one runner on, with either 0 or 1 out. For the break-even calculation, the failure outcome is the run expectancy with an additional out and one fewer runner, and the success outcome is the run expectancy with the runner advanced. These are the results of the four main scenarios:
While the run environments did alter the success rates needed for base-stealing, the results were inconsistent. The higher and lower run environments did not produce a clear pattern in the stolen base break-even points. This somewhat undercuts, at least in recent years, the idea that stolen bases are more valuable in a low-run environment. The biggest difference between the two samples in any one scenario was a mere 3 percentage points, which implied that stealing a base with a man on first and no outs was less valuable in the low-run environment than in the higher 2016-2019, 2021 sample.
As no clear correlation is apparent, teams would still likely adopt similar strategies in these different run environments. This means acknowledging that with a man only on first, a team should aim to be successful at least 75% of the time in stealing second. With a man on second and no outs, teams should probably avoid stealing altogether. With one out, the numbers support the attempt when a team can succeed at least around 70% of the time, as moving the runner from second to third in this instance has proven profitable. The run environments may be different, but that does not always mean a team needs to act much differently.
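The four steal scenarios from this section can be worked through with the break-even formula given earlier. The sketch below assumes a matrix keyed by (bases, outs), with a base code that marks first, second, and third with "1" if occupied. The run expectancies are round placeholders and do not reproduce the results discussed above.

```python
# (bases before, bases after a successful steal) with a lone runner on base.
STEALS = {
    "steal second": ("100", "010"),
    "steal third": ("010", "001"),
}
EMPTY = "000"


def steal_break_evens(matrix):
    """Break-even success rate for each steal scenario with 0 or 1 out."""
    results = {}
    for name, (before, after) in STEALS.items():
        for outs in (0, 1):
            x = matrix[(before, outs)]     # current state
            y = matrix[(EMPTY, outs + 1)]  # caught: runner removed, one more out
            z = matrix[(after, outs)]      # safe: runner advances, same outs
            results[(name, outs)] = abs(y - x) / (abs(y - x) + abs(z - x))
    return results


# Placeholder run expectancies for illustration only.
placeholder = {
    ("000", 1): 0.25, ("000", 2): 0.10,
    ("100", 0): 0.90, ("100", 1): 0.50,
    ("010", 0): 1.10, ("010", 1): 0.70,
    ("001", 0): 1.35, ("001", 1): 0.90,
}
results = steal_break_evens(placeholder)
for (name, outs), rate in results.items():
    print(f"{name}, {outs} out: {rate:.1%}")
```

Feeding each season's matrix through the same function gives one break-even point per scenario per run environment, which is the comparison the section relies on.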
The intentional walk is probably the least in-depth topic in relation to the effects of run environment across these two samples, but it is one of the only things for which a manager can control both the action and the outcome. With that in mind, a brief note seemed as if it could add some value. Giving up a base is never pretty, but the level of pain for the defense could differ based on the run environment. Pictured below is the expected run loss from an intentional walk for each base-out state for both the 2022 and 2016-2019, 2021 matrices.
Not including the bases-loaded states (where the outcome is the same regardless of the run environment), the average cost of a walk was -0.346 runs in 2022 and -0.332 runs in the other matrix. Which environment was hurt more by a walk flip-flopped from situation to situation, although the 2022 sample had four of the biggest negative differences: specifically, the outcomes that resulted in men on first and third or the bases loaded with 0 outs. When these situations occurred, the slight difference in run environments allowed for a small but real gap of around 0.1 runs to be lost by the defense.
Accordingly, one can argue that when fewer runs are being scored, an intentional walk early in an inning can come back to haunt a team a bit more than in the average situation. Getting on base is inherently more valuable when fewer runs are scored, and the odds of scoring are much higher in those situations, so this fits with common logic. The situation would vary with the skill of the hitter, but that complication veers away from the point of this article. With an average hitter, this framework should apply in years with similar run matrices.
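The walk costs in this section can be sketched as a state transition followed by a matrix lookup. The code below assumes that the cost is the difference in run expectancy between the state after the walk and the state before it, which is a reading of the text and not a confirmed method. It returns the change from the offense's side, so the defense's cost is the negative of the result. The base code marks first, second, and third with "1" if occupied, and the matrix value in the demo is a placeholder.

```python
def bases_after_walk(bases):
    """Return (new_bases, runs_scored) after a walk; only forced runners move."""
    first, second, third = (flag == "1" for flag in bases)
    runs = 1 if (first and second and third) else 0
    new_third = third or (first and second)  # runner on second forced to third
    new_second = second or first             # runner on first forced to second
    new_first = True                         # the batter takes first base
    code = "".join("1" if occupied else "0"
                   for occupied in (new_first, new_second, new_third))
    return code, runs


def walk_run_change(matrix, bases, outs):
    """Change in the offense's run expectancy from a walk in this state."""
    new_bases, runs = bases_after_walk(bases)
    return matrix[(new_bases, outs)] + runs - matrix[(bases, outs)]


print(bases_after_walk("101"))  # prints: ('111', 0)
# With the bases loaded the state is unchanged and one run scores, so the
# change is exactly one run whatever the (placeholder) matrix value is.
print(walk_run_change({("111", 0): 2.0}, "111", 0))  # prints: 1.0
```

This also shows why the bases-loaded states are excluded from the comparison: their cost does not depend on the matrix at all.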
As hitting levels change, so should strategy. Each year has its own combination of expected outcomes for certain scenarios, which invites a breath of fresh, dynamic thinking. With the 2022 regular season at a close, its low run environment allowed for some potentially interesting differences in strategy compared to other years. Teams faced different levels of pain on sacrifices, different break-even points for advancing bases, and some variation in the costs of intentional walks. These differences make room for informed decisions.
In the past, many of baseball’s flawed decisions could be attributed to the inherently flawed thought processes of the human brain. As smart as we can be, our minds tend to latch on to certain strategies that are not always optimal. This is especially applicable to baseball. By examining the run matrices from different years, deferring to conventional thinking is no longer necessary: we have clear reports on the best courses of action, with objective data to justify such conclusions. While this article mainly serves as a preliminary and brief look into analyzing run matrices, it can hopefully lead others to explore the various kinds of data available. Bringing the sport closer to objective thought should be the ultimate goal, and the use of frameworks such as this will help reach it.
Photo by Gregory Fisher/Icon Sportswire | Adapted by Doug Carlin (@Bdougals on Twitter)