Efficiency Part 2: A New NBA Offensive Rating and NBA Defensive Rating System
Boiling everything down, success in the NBA can be directly linked to offensive efficiency and defensive efficiency. We’ll study these concepts directly with new NBA offensive rating and NBA defensive rating methods. In particular, we’ll measure each team’s offensive and defensive prowess in ‘points per possession’. Why pick this metric? It’s really quite simple: if, on average, your team scores more points per possession than it gives up, you should be expected to win more often than not.
The ultimate goal of this article is to assign numbers to each team that represent ‘offensive rating’ and ‘defensive rating’ in points per possession. Once we have those numbers, though, we’ll be able to derive a few conclusions:
- First, and most obviously, we’ll be able to rank NBA teams by their defensive and offensive ratings
- The separate offensive and defensive ratings can be combined to create an ‘overall’ team rating (which may be used to complement our NBA Power Rankings with Bayes Ensemble). More on this in the future.
- Two teams’ component offensive and defensive ratings can be combined, along with a prediction of a game’s pace, to predict over/under lines in Vegas.
This final bullet point was the original motivation for this series of articles on NBA efficiency. Part 1 determined effective ways to predict the pace of an upcoming game. This second part will use the offensive rating and defensive rating for each team in a matchup to predict ‘points per possession’. Combining these two ideas naturally leads to a prediction of points for each team and, therefore, a hopefully accurate prediction of the total points scored in a given matchup.
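To make that concrete, here is a minimal sketch of how a pace prediction and two points-per-possession predictions would combine into a game total; the function name and the input numbers are purely illustrative, not the model itself.

```python
# Illustrative only: combine a predicted pace with each team's predicted
# points per possession to get an expected game total.

def predict_total(pace, ppp_team_a, ppp_team_b):
    """Expected total points if each team gets roughly `pace` possessions."""
    return pace * (ppp_team_a + ppp_team_b)

# e.g. 100 possessions per team at 1.12 and 1.05 points per possession
print(predict_total(100, 1.12, 1.05))  # 217.0
```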
Existing Methods for NBA Offensive Rating and NBA Defensive Rating
Measuring efficiency on a per-possession basis is a very natural idea and is done by many different analysts and in many different forms. KenPom, a website created by Ken Pomeroy, is traditionally respected as ‘the source’ for rankings of offensive efficiency, defensive efficiency, strength of schedule, and overall team quality in college basketball.
As is typical in many of my articles, ESPN already measures the object I wish to measure. However, as is also typical when I study a topic, I want to:
- Explain the existing techniques and methodologies
- Discuss why, and in which ways, the existing techniques are deficient
- Propose an alternative model that is supported by observation or intuition that seeks to address these deficiencies, and
- Show with data that my proposed method works exceedingly well.
Here is the ESPN page with advanced stats from the 2018-2019 season (which I will use as a case study). Notice that two columns rank teams by offensive rating / offensive efficiency and defensive rating / defensive efficiency. These stats are ‘the obvious’ way of measuring efficiency. Offensive efficiency is computed by dividing points scored by offensive possessions and multiplying by 100, while defensive efficiency is points allowed per 100 defensive stands.
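As a quick reference, the per-100 convention looks like the sketch below; the function names are mine, and how ‘possessions’ get counted is exactly the issue discussed in the next section.

```python
# The standard per-100-possessions presentation of efficiency. The key
# question is what goes in the denominator.

def offensive_rating(points_scored, offensive_possessions):
    return 100 * points_scored / offensive_possessions

def defensive_rating(points_allowed, defensive_possessions):
    return 100 * points_allowed / defensive_possessions
```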
The First Deficiency of Efficiency Stats
If you remember back to our prior article about pace, possessions are a difficult thing to measure in the NBA. Typically (as in, the NBA’s published stats do this) possessions are calculated by adding up total shots taken, turnovers, and free throw line trips, then subtracting off offensive rebounds.
This works because every possession ends in a shot, a turnover, or a free throw line trip unless there is an offensive rebound, in which case the possession continues. Therefore, subtracting off the offensive rebounds gives us a reasonably accurate count of a team’s possessions.
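Here is a minimal sketch of that possession count, assuming standard box-score totals; the 0.44 free-throw-trip factor is a common convention I’m assuming here, not necessarily the exact trip count the NBA uses.

```python
# Rough possession estimate: shots plus turnovers plus free-throw trips,
# minus offensive rebounds. Free-throw trips are approximated from attempts
# with the common 0.44 factor (an assumption, not the official convention).

def estimate_possessions(fga, tov, fta, orb, ft_trip_factor=0.44):
    return fga + tov + ft_trip_factor * fta - orb

def points_per_possession(points, fga, tov, fta, orb):
    """Offensive efficiency with offensive rebounds subtracted out."""
    return points / estimate_possessions(fga, tov, fta, orb)
```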
The ESPN efficiency metric divides points by ‘shots + turnovers + FT line trips’. I am unsure what this statistic is, but it is not points per possession. Failing to subtract off offensive rebounds leads to strange accounting where a team can get credit for, say, 2 or 3 or more ‘possessions’ in a row by virtue of rebounding their own misses. Under that accounting, the two teams in a game can end up with very different ‘possession’ counts, which undermines the main reason to measure efficiency per possession in the first place.
Let me give a concrete example of why this is actually important. My argument is that doing this penalizes teams with high offensive rebounding rates (such as, for instance, the 2020-2021 Pelicans) by under-reporting how much they score per possession. Here is the example (in an extreme case to highlight my point).
Suppose we had a team that, on every possession, missed their first shot, grabbed the offensive rebound, and scored 2 points on their second attempt. ESPN’s statistic would report this team as having an offensive efficiency of exactly 1, since it divides the 2 points by the 2 shot attempts. This team would have ‘the worst offense in the league’ according to ESPN’s offensive rating. In reality, though, this team would be literally unbeatable because they score every time they have the ball.
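Putting invented numbers on that extreme case makes the gap obvious: over 100 true possessions the team takes 200 shots and scores 200 points.

```python
# Invented figures for the extreme example: 100 true possessions, 200 shots
# (miss, offensive rebound, make), 200 points.

points, shots, turnovers, ft_trips, off_rebounds = 200, 200, 0, 0, 100

espn_style = points / (shots + turnovers + ft_trips)                # 1.0 ppp
corrected = points / (shots + turnovers + ft_trips - off_rebounds)  # 2.0 ppp

print(espn_style, corrected)  # 1.0 2.0
```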
The Second Deficiency of Efficiency Stats
While the first deficiency is subtle and perhaps easily fixable, the second (and third) are not fixable by simply ‘accounting for offensive rebounds’. The second efficiency deficiency I would like to point out is this: the simple formula doesn’t take opponents’ strengths into account. We’ll see in a few sections that in 2018-19, the Milwaukee Bucks had the best defensive rating. If my favorite team played the Bucks every single game, its points scored would be dramatically lower than if it had played a league-average defense every game. Simply reporting points per possession ignores the critically important context of who those points were scored against.
My method of computing offensive rating and defensive rating is built only on observed data: points scored per possession relative to the quality of the opponent. This effect will be even more dramatic when we eventually extend our results to college basketball, where the quality disparity is significantly larger.
The Third Deficiency of Efficiency Stats
There are two main branches of statistics: descriptive and inferential. Descriptive statistics summarizes, visualizes, and describes observed data to tell you what happened in the past. Inferential statistics uses statistical theory to take what you’ve seen in the past and predict what will happen in the future. It’s the difference between watching SportsCenter summarize a game versus listening to a podcast speculating on who wins the Finals. To me, prediction is much more interesting and falls much more in my wheelhouse because it generally requires more math knowledge than basketball knowledge.
To that end, when I study ‘efficiency’ I am not just interested in measuring ‘how many points this team scored per possession in the past’. I want to predict ‘how many points this team will score per possession against a particular opponent in an upcoming game’. This is the type of data that would be valuable to bettors, for instance. I claim that it is not obvious how to combine two opponents’ component offensive and defensive ratings to generate a prediction of how many points per possession would be scored if these two teams play. This is the third deficiency of the traditional measure of offensive rating and defensive rating.
If a team in the NBA has an offensive rating of 1.14 points per possession and their opponent has a defensive rating of 1.08 points per possession, how can we most accurately predict what will happen if they play each other? It turns out that this question is quite subtle. The initial guess might be ‘1.11 points per possession’, simply averaging the two values. However, let me explain why that is not a good estimate in general.
The league average in 18-19 was about 1.08 points per possession (ppp). Suppose the Warriors had an offensive rating of 1.15 ppp and the Cavaliers had a defensive rating of 1.15 ppp. This means that the Warriors score about 0.07 more ppp than league-average while the Cavaliers give up about 0.07 more ppp than league-average. Combining the Warriors being .07 ppp better on offense than average and the Cavs being .07 ppp worse than average on defense, I actually would expect that in this matchup the Warriors offense would be .07+.07=.14 ppp more efficient than league average.
That means our model needs to combine the 1.15 offensive rating and the 1.15 defensive rating (along with the 1.08 league average) to get 1.22. Clearly, the simple ‘averaging the two ratings’ model is incorrect.
Our Proposed NBA Offensive Rating and NBA Defensive Rating Model
Turning the discussion from above into an actual math problem, I propose the following model for predicting points per possession in an upcoming matchup. If team A has offensive rating o_A and their opponent, team B, has defensive rating d_B, then on team A’s offensive possessions, they should be expected to score \begin{aligned} o_A + d_B - \text{League Average} \end{aligned} points per possession. We can verify this formula by noticing that it measures things relative to league average. In the following, let L denote the league average points per possession rate. Notice:
- o_A - L is ‘how many more ppp team A’s offense scores than a league average offense’.
- d_B - L is ‘how many more ppp team B’s defense allows than a league average defense’.
- Adding these two terms together tells us what we should expect on team A’s offensive possessions above or below league average.
In particular, it is easy to verify the relation \begin{aligned} (o_A - L) + (d_B - L) + L = o_A + d_B - L \end{aligned} so that our proposed prediction is precisely the model we want to use.
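In code, the matchup prediction is a one-liner; the league-average constant below is the approximate 2018-19 figure quoted earlier.

```python
LEAGUE_AVG_PPP = 1.08  # approximate 2018-19 league average from the text

def predicted_ppp(off_rating_a, def_rating_b, league_avg=LEAGUE_AVG_PPP):
    # Expected points per possession on team A's offensive possessions
    return off_rating_a + def_rating_b - league_avg

# Warriors/Cavaliers example from above: 1.15 + 1.15 - 1.08 = 1.22
print(round(predicted_ppp(1.15, 1.15), 2))
```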
Now, how does one determine the values o_A and d_B for every team in the NBA? As I tend to do in most of my analyses, we pick the values of these parameters that best explain the scoring rates observed in already-played games. That is, we pick the parameters that best fit the available data. As usual, I solve a least squares minimization problem that minimizes the sum of squared differences between ‘predicted ppp’ and ‘observed ppp’ over all games already played.
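Here is a sketch of that least-squares fit, not the exact code behind the published ratings; the `games` input and the centering convention at the end are my assumptions.

```python
import numpy as np

def fit_ratings(games, n_teams, league_avg):
    """games: list of (off_team, def_team, observed_ppp), one per team-side per game."""
    X = np.zeros((len(games), 2 * n_teams))
    y = np.zeros(len(games))
    for i, (off_team, def_team, ppp) in enumerate(games):
        X[i, off_team] = 1.0            # offensive rating of the team with the ball
        X[i, n_teams + def_team] = 1.0  # defensive rating of the opponent
        y[i] = ppp + league_avg         # rearranging o_A + d_B - L = observed ppp
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    off, deff = theta[:n_teams], theta[n_teams:]
    # The fit is only identified up to a constant shift (add c to every offense,
    # subtract c from every defense, and predictions don't change), so pin the
    # average offensive rating to the league average.
    shift = league_avg - off.mean()
    return off + shift, deff - shift
```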
Results and Accuracy
For the entire 2018-2019 season, we computed the offensive rating and defensive rating for every team. Then, we measured how accurately our model predicts the offensive efficiency and defensive efficiency of every matchup. Over the course of the season, our model achieved an average prediction error of about 7.4%, measured as the average absolute deviation between predicted and observed points per possession.
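For reference, that error measurement amounts to something like the sketch below; whether the quoted figure is a raw deviation or a percentage of observed scoring is my reading of the text, so both variants are shown.

```python
import numpy as np

def mean_abs_deviation(predicted, observed):
    predicted, observed = np.asarray(predicted), np.asarray(observed)
    return np.mean(np.abs(predicted - observed))

def mean_abs_pct_deviation(predicted, observed):
    predicted, observed = np.asarray(predicted), np.asarray(observed)
    return np.mean(np.abs(predicted - observed) / observed)
```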
NOTE: We compared our ratings to the ESPN ratings and found that our average deviation was smaller. This will be presented in the coming article looking at our model’s performance. We point out that at the end of the season, the models seem to converge as everyone’s strength of schedule ‘evens out’. However, in the beginning and middle of the season, our model is significantly more accurate. Stay tuned for more.
Inferring Team Ratings
One of the suggested benefits of using this technique was that we can actually ‘compile’ a team rating from the component offensive rating and defensive rating. Many sources (again, KenPom) infer a team rating by using a ‘Pythagorean Expectation’ to combine offense and defense. I am willing to go on record saying that using the Pythagorean technique to combine offense and defense doesn’t make sense, at least in this setting.
I value ‘interpretable’ statistics very highly. The average sports fan doesn’t want to read arbitrary numbers; they want to know what the numbers mean. While I cannot give you a simple formula for where my statistics come from (because they are computed with matrix inverses), I can tell you precisely what my numbers mean. The Pythagorean combination of offense and defense generates a meaningless number that is just a ‘nice’ way to collapse separate statistics into a single one.
To use Pythagorean expectation in different sports, one first needs to determine the ‘optimal exponent’ for that sport. To me, this fact sets off alarm bells. If something makes sense, it shouldn’t have to be so fine tuned for one sport or another. There exists theoretical justification for the Pythagorean expectation, but the assumptions of that analysis are dubious at best.
For us, team rating can be very easily created and interpreted. Ready? Given the offensive rating and defensive rating of a team from the table below, you can compute their team rating by subtracting the defensive rating from the offensive rating. That’s it. What does this number mean? It is how many points better our team is than a league-average opponent over one offensive and one defensive possession.
A team with a large offensive rating and a large defensive rating has a good offense and a bad defense; their team rating may be near 0. A team with a large offensive rating and a small defensive rating will have a large positive team rating, which makes sense because that team has a good offense and a good defense. You can work through the other combinations in just the same way.
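In code, the team rating is exactly that subtraction; the example values are the Milwaukee Bucks row from the table below.

```python
# Team rating as described above: offensive rating minus defensive rating.

def team_rating(off_rating, def_rating):
    return off_rating - def_rating

print(round(team_rating(1.113, 1.032), 3))  # 0.081, matching the table
```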
2018-2019 NBA Offensive Rating, Defensive Rating, and Team Rating
The correct interpretation of these numbers is ‘expected points per possession scored/allowed against a league average opponent’. The team rating interpretation is ‘how many points better than league average is this team over the course of one offensive and defensive possession’.
| Team | Pace | Off Rtg | Def Rtg | Overall |
|---|---|---|---|---|
| Milwaukee Bucks | 108.6 | 1.113 | 1.032 | 0.081 |
| Golden State Warriors | 103.4 | 1.136 | 1.071 | 0.065 |
| Toronto Raptors | 101.0 | 1.106 | 1.043 | 0.063 |
| Houston Rockets | 98.3 | 1.128 | 1.075 | 0.053 |
| Utah Jazz | 102.6 | 1.078 | 1.031 | 0.047 |
| Denver Nuggets | 96.6 | 1.108 | 1.064 | 0.044 |
| Portland Trail Blazers | 101.5 | 1.120 | 1.078 | 0.043 |
| Boston Celtics | 101.5 | 1.091 | 1.052 | 0.039 |
| Oklahoma City Thunder | 108.7 | 1.073 | 1.042 | 0.031 |
| Philadelphia 76ers | 105.3 | 1.100 | 1.074 | 0.026 |
| Indiana Pacers | 97.1 | 1.069 | 1.044 | 0.025 |
| San Antonio Spurs | 97.9 | 1.107 | 1.088 | 0.019 |
| Los Angeles Clippers | 106.1 | 1.103 | 1.095 | 0.008 |
| Orlando Magic | 97.8 | 1.059 | 1.062 | -0.003 |
| Miami Heat | 98.0 | 1.049 | 1.052 | -0.004 |
| Brooklyn Nets | 106.6 | 1.068 | 1.076 | -0.007 |
| Sacramento Kings | 107.9 | 1.084 | 1.090 | -0.007 |
| Dallas Mavericks | 100.0 | 1.073 | 1.080 | -0.008 |
| New Orleans Pelicans | 108.9 | 1.090 | 1.100 | -0.010 |
| Minnesota Timberwolves | 103.6 | 1.089 | 1.099 | -0.010 |
| Detroit Pistons | 97.6 | 1.065 | 1.077 | -0.012 |
| Los Angeles Lakers | 108.4 | 1.056 | 1.069 | -0.012 |
| Charlotte Hornets | 100.2 | 1.089 | 1.102 | -0.013 |
| Memphis Grizzlies | 95.3 | 1.040 | 1.061 | -0.021 |
| Washington Wizards | 106.9 | 1.088 | 1.119 | -0.032 |
| Atlanta Hawks | 112.0 | 1.057 | 1.113 | -0.056 |
| Chicago Bulls | 100.9 | 1.029 | 1.111 | -0.082 |
| Phoenix Suns | 104.4 | 1.036 | 1.120 | -0.083 |
| New York Knicks | 101.3 | 1.023 | 1.111 | -0.088 |
| Cleveland Cavaliers | 94.7 | 1.058 | 1.154 | -0.096 |
Looking Forward
Coming soon, we’ll put all of this together to create and present our model for predicting NBA totals. In particular, part 3 of this series will measure accuracy in prediction against Vegas. Spoiler alert: we win more often than not.