Baseball Alphabet Soup: What is wRAA and What is wOBA?
50 years ago, baseball got mathematical when the Society for American Baseball Research was founded and, soon after, Bill James kicked off the sabermetrics revolution. 15 years ago, Tom Tango published “The Book”, complete with ideas including the run expectancy matrix and the statistics wRAA and wOBA, and sabermetrics was recast into a more modern light. Ever since then, though, baseball fans have been forced to wade through the mire of acronyms (wRAA, wOBA, OPS, OBP, wRC, WAR, and on, and on) just to understand the game. For the old-timers, the baseball purists, the “Fathers in Law”, this baseball alphabet soup is ruining the game by boiling everything down to an equation. For others, myself included, understanding the point of these statistics enriches understanding of baseball and makes it that much more entertaining.
Today I want to tackle two closely related and very interesting statistics: wRAA and wOBA. wRAA stands for weighted runs above average, while wOBA stands for weighted on-base average. Both of these statistics are meant to help us understand how good a hitter is. In a sense, these numbers try to condense the traditional triple-slash-line summary of a hitter’s abilities into just one number.
Not all of the alphabet soup statistics are good stats, in my opinion. Some are just too far-fetched and make things too complicated. wOBA and, especially, wRAA are excellent statistics which measure the quality of hitters in systematic, interpretable ways. Understanding why I think these statistics are good starts by clearly defining the thing they are trying to measure. So, let’s start there and talk about how some other older statistics tackling the same problem fall short.
The Goal of wOBA and wRAA
The goal is simple: which hitters provide the most value to their team at the plate? Like most questions of this type, we can get a rough estimate very easily, but refining that estimate to be more accurate requires the fancy stats like wOBA and wRAA.
Two of the simplest statistics to study offensive value are batting average and runs batted in. In the simplest sense, players with higher batting averages get hits more often and therefore should contribute more value to their team’s offense. One could also argue that RBI is a good measure of value contribution – batting runs in means that you helped runs be scored.
However, using these basic statistics misses a lot of the nuance of how runs are scored in baseball. I’ve already written at length about the problems with using runs batted in. In short, RBI is too context-dependent and incorrectly assigns all the credit for a run to the player who bats it in, when half the work is done by the teammate who got on base before the run-producing at-bat.
On the other hand, batting average doesn’t differentiate between types of hits. Indisputably, home runs are more valuable than singles. Similarly, triples are more valuable than doubles and singles are more valuable than walks. However, none of this “value” is captured in the batting average calculation. These differences are also missed in computing on base percentage.
OK, then, the next logical step is using slugging percentage. For slugging percentage, you get one point for a single, two for a double, three for a triple, and four for a home run. In this way, slugging percentage is closer to capturing the overall quality of a hitter by giving more value to better outcomes.
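To make the difference between the two statistics concrete, here is a minimal Python sketch. The function names and the two stat lines are made up for illustration; only the 1/2/3/4 weighting comes from the definition of slugging percentage.

```python
def batting_average(singles, doubles, triples, home_runs, at_bats):
    """Hits per at-bat; every hit counts the same."""
    hits = singles + doubles + triples + home_runs
    return hits / at_bats


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat; hits are weighted 1, 2, 3 and 4."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Two hypothetical hitters with 30 hits in 100 at-bats each.
# Hitter A: 30 singles. Hitter B: 20 singles and 10 home runs.
avg_a = batting_average(30, 0, 0, 0, 100)
avg_b = batting_average(20, 0, 0, 10, 100)
slg_a = slugging_percentage(30, 0, 0, 0, 100)
slg_b = slugging_percentage(20, 0, 0, 10, 100)

print(round(avg_a, 3), round(avg_b, 3))  # 0.3 0.3
print(round(slg_a, 3), round(slg_b, 3))  # 0.3 0.6
```

Batting average cannot tell these two hitters apart, while slugging percentage can.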
However, there is no evidence to support the idea that a home run is worth four times as much as a single and twice as much as a double. In fact, I would rather my team hit four straight singles than a single home run. Instinct tells us that home runs are worth more than singles but probably less than four times more.
The goal of wOBA and wRAA is to figure out exactly the relative value of each outcome at the plate so that we can assign a single, meaningful number to a player’s offensive contributions. To begin with, we need to talk about one of my favorite baseball tools: the run expectancy matrix.
The Run Expectancy Matrix
The run expectancy matrix is a tool we’ll use to help determine the relative value of the different types of hits. I’ve talked before about the run expectancy matrix and its use for decision making. In this article, we’ll use the run expectancy matrix and a related state frequency matrix to determine the relative value of singles, doubles, triples, home runs, walks, and being hit by a pitch. Let’s start by refreshing ourselves on what the run expectancy matrix does.
It is helpful to characterize the setting of an at-bat by knowing which bases have runners and how many outs there are. The combination of these two pieces of information is often called a base-out state. An example of a base-out state is “1 out with a runner on first”. Base-out states tell you what kind of environment the hitter comes to bat in. Some environments are more likely to allow for RBI than others.
For us, the important application of base-out states is the run expectancy matrix. This object is simply a table which records the average number of runs a team scores through the end of the current inning depending on the current base-out state. For example, if the bases are loaded and there are no outs, then on average your team will score about 2.3 more runs before the end of the current inning. Borrowing the following data from Tom Tango himself, the run expectancy matrix over the last few years is shown below.
Aside from just being interesting, this object helps us determine the value of an event by the change in expected runs generated by transitioning from one state to another. As a simple example, suppose the bases are empty with no outs and you record a strikeout. Before your at-bat, the run expectancy was 0.481. After the at-bat, it was 0.254. Therefore, your team lost 0.227 runs of expected production due to your strikeout.
This number is extremely valuable because it lets us know how much credit individual players get for the overall team’s performance.
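The strikeout example can be sketched in a few lines of Python. The dictionary holds only the bases-empty run expectancy values quoted in this article, and the state labels and the `run_value` helper are names I made up for illustration.

```python
# Run expectancy values quoted in this article (bases empty only).
# Keys are (bases, outs); the labels are illustrative, not a standard.
RUN_EXPECTANCY = {
    ("empty", 0): 0.481,
    ("empty", 1): 0.254,
    ("empty", 2): 0.098,
}


def run_value(before, after, runs_scored=0, table=RUN_EXPECTANCY):
    """Runs scored on the play plus the change in run expectancy.

    `after` is None when the play ends the inning, in which case the
    run expectancy after the play is taken to be 0.
    """
    re_after = 0.0 if after is None else table[after]
    return runs_scored + re_after - table[before]


# Strikeout with the bases empty and nobody out.
strikeout_no_outs = run_value(("empty", 0), ("empty", 1))
print(round(strikeout_no_outs, 3))  # -0.227
```

The negative sign simply records that the strikeout cost the team expected runs.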
Looking closer at the data, though, if there were 2 outs and you struck out with nobody on base, you only lost your team 0.098 runs of expected value. That means the same event can have different values based on factors outside a player’s control. If we want a context-independent measure of player quality we need to eliminate this dependency. We need to add another tool.
Base-Out State Frequency Matrix
Let’s not lose sight of our end goal: we want to determine the relative values of different types of hits in order to determine how good a hitter is. In the last section we saw a way to place value on a specific outcome given a specific base-out state. Now, we want to get rid of the context so that a player’s value depends only on their own achievements and not on external factors like teammate quality or spot in the batting order.
Using the run expectancy matrix in the last section helped us value an event based on how that event causes transitions from one base-out state to another. Now, averaging the value of an event over the relative frequency of each base-out state lets us compute the context-independent value of an at-bat outcome. Below, also from Tom Tango’s website, is the relative frequency of each base-out state.
Let’s work a more specific example to compute an estimate of the value of a home run. The table below shows the value added to your team by a home run in each base-out state.
For example, the value in the bottom right corner is computed by taking the runs scored (4) and subtracting the run expectancy lost in the transition from the “bases loaded, 2 outs” state to the “bases empty, 2 outs” state. This transition goes from a 0.752 expected runs state to a 0.098 expected runs state, which represents a loss of 0.654 runs in expected value. Thus, the “runs added” due to the play is 4 - 0.654 = 3.346.
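As a quick check of that arithmetic, here is a small Python sketch. The helper name and its parameters are hypothetical; the two run expectancy values are the ones quoted above.

```python
def home_run_value(re_before, re_empty_same_outs, runners_on):
    """Runs added by a home run in a given base-out state.

    A home run scores the batter and every runner, and leaves the
    bases empty with the same number of outs.
    """
    runs_scored = runners_on + 1
    return runs_scored + re_empty_same_outs - re_before


# Grand slam with 2 outs: bases loaded (0.752) -> bases empty (0.098).
grand_slam_two_outs = home_run_value(
    re_before=0.752, re_empty_same_outs=0.098, runners_on=3
)
print(round(grand_slam_two_outs, 3))  # 3.346
```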
Now, to estimate the actual value of a home run, we multiply the values in this table with the probability of being in a given state from the previous table to average over the likelihood of each base-out state. This calculation shows that a home run will add on average about 1.4 runs to your team’s expected score. This number is averaged over all possible base-out states. Repeating this process will let us determine the expected runs added by each individual at-bat outcome. Now, we can finally talk about wRAA and wOBA.
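The averaging step described above can be sketched as follows. This is only a toy: the function name is my own, the two home run values are the ones that can be derived from numbers quoted in this article, and the frequencies are HYPOTHETICAL placeholders rather than Tom Tango’s data. A real calculation would run over all 24 base-out states.

```python
def context_free_value(value_by_state, frequency_by_state):
    """Average an event's run value over how often each state occurs."""
    total_frequency = sum(frequency_by_state.values())
    weighted_sum = sum(
        value_by_state[state] * freq
        for state, freq in frequency_by_state.items()
    )
    return weighted_sum / total_frequency


# Home run values for two states, derived from numbers in the article:
# a solo shot with 2 outs adds exactly 1 run; a 2-out grand slam adds 3.346.
hr_value = {("empty", 2): 1.0, ("loaded", 2): 3.346}

# HYPOTHETICAL frequencies, for illustration only (not real data).
hr_frequency = {("empty", 2): 0.9, ("loaded", 2): 0.1}

toy_average = context_free_value(hr_value, hr_frequency)
print(round(toy_average, 3))  # 1.235 for this toy input, not the real ~1.4
```

Dividing by the total frequency means the function also works when the frequencies are raw counts instead of probabilities.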
What are wOBA and wRAA?
This is where my description of these two statistics differs from the conventional explanation. Usually, wOBA is described as on base percentage re-weighted using the expected runs added for the particular type of hit as we described above, then normalized so that it is on the same scale (so that it looks like) regular on base percentage. Then, wRAA is a rescaled version of wOBA. I don’t think this description is particularly enlightening, so I’ll try starting with wRAA and deriving wOBA from it.
The table below shows the expected runs added due to the outcome of an at-bat averaged over all base-out states as computed in the last few sections.
The wRAA formula then is very easy to grab from this table. In fact, for a player, just count the number of occurrences in the left column, multiply that number by the weight in the right column, and add it all up. The only thing to be careful about is that wRAA and wOBA only count un-intentional walks (uBB). More on that later.
To me, viewing wRAA in this way is much easier than thinking about it as a “shifted, rescaled version of wOBA”. The correct interpretation of this statistic is “how many runs this player contributed to his team’s total above or below what an average player would contribute independent of context”. Putting the entire formula together gives

$$\mathrm{wRAA} = 0.55\cdot \mathrm{HBP} + 0.57\cdot \mathrm{BB} + 0.70\cdot \mathrm{1B} + \mathrm{2B} + 1.27\cdot \mathrm{3B} + 1.65\cdot \mathrm{HR} - 0.26\cdot \mathrm{Outs}$$
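In code, the formula is just a weighted count. The sketch below uses the coefficients from the formula above, and I am assuming the doubles coefficient is 1.00 because the formula lists 2B without a multiplier. The stat line is hypothetical, and BB is meant to count unintentional walks only.

```python
# Coefficients from the wRAA formula in this article.
# "2B" is taken as 1.00 because the formula shows it with no multiplier.
WRAA_WEIGHTS = {
    "HBP": 0.55,
    "BB": 0.57,   # unintentional walks only
    "1B": 0.70,
    "2B": 1.00,
    "3B": 1.27,
    "HR": 1.65,
    "Outs": -0.26,
}


def wraa(counts, weights=WRAA_WEIGHTS):
    """Multiply each event count by its weight and add everything up."""
    return sum(weight * counts.get(event, 0) for event, weight in weights.items())


# A hypothetical week at the plate (made-up numbers).
week = {"HBP": 1, "BB": 2, "1B": 5, "2B": 2, "3B": 0, "HR": 1, "Outs": 18}
print(round(wraa(week), 2))  # 4.16
```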
Or, graphically, we can look at the wRAA formula via the flowchart below.
Viewing wRAA in this way reveals why it is named weighted runs above average in the first place. It is literally the runs added above an average player.
What about wOBA, though? wOBA is built from the same basic ideas but is shifted around so that the numbers live on a scale comparable to regular old on-base percentage (OBP). That is, wOBA encapsulates the same information as wRAA but is moved around and rescaled so that
- League-average wOBA and league-average OBP are the same (they have the same mean)
- The minimum value is 0
The point of all this is that people are used to seeing OBP numbers and interpreting them on a scale they are familiar with. People are not familiar with the wRAA scale. Is wRAA=10 good? Is 100 even possible? Sometimes it is helpful to understand numbers in a context we are familiar with.
So, to translate from wRAA to wOBA, three steps are taken.
- Add the “outs” coefficient to each other coefficient. That is, add 0.26 to each of the multiplying factors. This takes care of the “minimum being 0”.
- Divide by “Plate appearances” (excluding intentional walks) so that it is a rate-statistic and the range of wOBA is similar to the range of OBP.
- Multiply all the coefficients by what is called the wOBA scale factor which takes care of making sure that wOBA and OBP have the same average value. For example, in 2017 the wOBA scale factor was about 1.3.
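The three steps above can be sketched in Python as follows. The coefficients, the 0.26 outs value and the scale factor of about 1.3 are the ones quoted in this article. The stat line is hypothetical, and I am assuming that plate appearances are just the listed events plus outs, which ignores things like sacrifices.

```python
OUT_WEIGHT = 0.26   # size of the "outs" coefficient in the wRAA formula
WOBA_SCALE = 1.3    # approximate 2017 scale factor quoted in the article

# wRAA coefficients for everything except outs ("2B" assumed to be 1.00).
RUN_WEIGHTS = {"HBP": 0.55, "BB": 0.57, "1B": 0.70, "2B": 1.00, "3B": 1.27, "HR": 1.65}


def woba(counts, plate_appearances, scale=WOBA_SCALE):
    """Follow the three steps: shift, divide by PA, rescale."""
    # Step 1: add 0.26 to every coefficient; outs become 0 and drop out.
    shifted = {event: w + OUT_WEIGHT for event, w in RUN_WEIGHTS.items()}
    total = sum(w * counts.get(event, 0) for event, w in shifted.items())
    # Step 2: divide by plate appearances. Step 3: apply the scale factor.
    return scale * total / plate_appearances


# Hypothetical stat line: 11 times on base plus 18 outs = 29 PA.
week = {"HBP": 1, "BB": 2, "1B": 5, "2B": 2, "3B": 0, "HR": 1}
plate_appearances = sum(week.values()) + 18
print(round(woba(week, plate_appearances), 3))  # 0.524
```

Under these assumptions the result equals scale × (wRAA / PA + 0.26), which makes the link between the two statistics explicit: the only extra ingredient is the number of plate appearances.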
Conclusions
That’s it. wRAA and wOBA carry the same information in a statistical sense because, once you fix the number of plate appearances, they are just deterministic functions of one another. If two players have the same wRAA over the same number of plate appearances, they must have the same wOBA. Personally, I prefer using statistics like wRAA over wOBA because the numbers have a much better interpretation.
For example, if someone has a wOBA 0.1 larger than someone else, I can’t tell you what that 0.1 represents cleanly. However, if someone has a wRAA 10 larger than someone else, that means that on offense that player contributed 10 more runs than the other player over the course of their seasons. But, I understand why both exist because some folks like seeing things on a familiar scale. I just like interpretable stats.
There are two further curiosities. First, wOBA doesn’t consider any intentional walks in its calculation. One argument for this is that an intentional walk is usually a product of the base-out state combined with the score. It doesn’t measure anything about the hitter’s skill because the hitter doesn’t have to do anything. So, it is excluded in this measure of hitter quality.
Second, there is a difference in the coefficients for being hit by a pitch and an unintentional walk. Functionally, these things are identical in their impact on the game. Therefore, they should have the same expected runs added. But we see that is not the case. I suspect what is actually going on is that walks, even unintentional ones, are more likely in situations where the base-out state can easily absorb an extra runner. Even if you don’t throw an intentional walk, you can pitch around a player, which will make a walk more or less likely. This effect would create different coefficients for being hit by a pitch and being walked. This would also mean that MLB doesn’t use the base-out state frequency matrix when computing the wOBA coefficients.