Pace and Efficiency Part 1: Measuring and Predicting NBA Pace Stats

Different NBA teams play at different paces. Many different NBA pace stats exist, but how can we predict the pace of an upcoming game? Do faster teams dictate the pace? Do slower teams? Does the pace of the game meet somewhere in the middle? In this article we’ll investigate three different ways of estimating the pace of play in a particular matchup. We’ll then analyze the effectiveness and predictive power of these models to see which are better at predicting future pace of play.

The three different hypotheses we will look at are:

  • The faster team dictates the pace of play
  • The slower team dictates the pace of play
  • The pace of play ends up being the average of the two teams’ individual paces

This article is the first in a multi-part series studying pace, offensive efficiency, and defensive efficiency in the NBA. Check out the second part in our series on pace and efficiency in the NBA.

Calculating Possessions

Possession in the NBA is a natural concept. Possession changes back and forth between the teams as the game goes on, so conceivably we could just calculate possessions by counting them. However, because it is not always clear when a possession starts and stops, the NBA (and other outlets like ESPN) actually use a formula based on the box score to compute how many possessions a team had. Here are two rhetorical questions to help understand the complications of actually counting possessions:

  • If a team rebounds the ball with one second left and heaves a shot at the end of the quarter, does this – or, rather, should this – count as a possession?
  • If a team shoots, misses, gathers their own offensive rebound, and brings the ball back up top, should this be one possession or two?

To avoid these intricacies, the NBA uses the following formula for possessions:

\text{Possessions} = \frac{FGA + 0.44 \cdot FTA - ORB + TOV}{2}

where FGA, FTA, ORB, and TOV are the total field goal attempts, free throw attempts, offensive rebounds, and turnovers by both teams.

How can we understand this possession formula?

Possessions end with either shots, free throws, or turnovers. However, if there is a missed shot with an offensive rebound, a single possession can record two field goal attempts. Thus, adding together field goal attempts, turnovers, free throw line trips, and subtracting offensive rebounds can tell us how many possessions a team had. The one wrinkle: free throw line trips can’t be counted from the box score. Sometimes players get two shots, sometimes they get three. Sometimes they get an and-one. By running analysis on previous years, the NBA statisticians have found that multiplying free throw attempts by 0.44 is the best possible estimate of free throw line trips from the raw number of free throws taken.
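As a minimal sketch of the formula above, the function below takes the combined box-score totals of both teams and returns the estimated possessions per team. The function name, the parameter names, and the box-score numbers are made up for illustration; only the formula itself comes from the text.

```python
def estimate_possessions(fga, fta, orb, tov, ft_weight=0.44):
    """Estimate possessions per team from the combined totals of both teams.

    fga, fta, orb, tov are field goal attempts, free throw attempts,
    offensive rebounds, and turnovers summed over BOTH teams, which is
    why the result is divided by 2.
    """
    # ft_weight * fta approximates the number of trips to the free throw line.
    return (fga + ft_weight * fta - orb + tov) / 2


# Made-up combined box-score totals, for illustration only.
possessions = estimate_possessions(fga=180, fta=50, orb=20, tov=28)
print(possessions)
```

Offensive rebounds are subtracted because each one extends a possession that has already been counted through its missed field goal attempt.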

The calculation of possessions is actually an interesting analytics problem to study on its own. However, for our needs, simply knowing a good estimate – using the above formula – is enough.

Calculating and Predicting NBA Pace Stats

In the NBA, pace stats are usually quoted as ‘possessions per 48 minutes’. Thus, using the formula above for possessions, one can very easily compute a team’s pace after, perhaps, adjusting for overtime games. In fact, this is the number you are looking at when you see a published ‘PACE’ stat.
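One way to do the overtime adjustment is sketched below. It assumes that the adjustment is a simple rescaling by minutes played (an NBA overtime period lasts 5 minutes); the text does not spell out the exact adjustment, so treat this as one reasonable reading. All names and the sample games are hypothetical.

```python
REGULATION_MINUTES = 48.0
OVERTIME_MINUTES = 5.0  # length of one NBA overtime period


def pace_per_48(possessions, overtime_periods=0):
    """Rescale a game's possession count to a 48-minute game."""
    minutes_played = REGULATION_MINUTES + OVERTIME_MINUTES * overtime_periods
    return possessions * REGULATION_MINUTES / minutes_played


def team_average_pace(game_paces):
    """Average a team's per-game paces (each already in poss/48)."""
    return sum(game_paces) / len(game_paces)


# Made-up games for one team: (possessions, overtime periods).
games = [(101.0, 0), (110.0, 1), (97.5, 0)]
game_paces = [pace_per_48(poss, ot) for poss, ot in games]
print(round(team_average_pace(game_paces), 1))
```

Without the rescaling, the overtime game would look faster than it really was simply because it lasted longer.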

However, I want to go a step further than simply counting the number of possessions that occurred in a game and dividing by 2. I want to be able to predict the pace of a particular game. If the fastest team plays the slowest team, what happens?

Before discussing three models for predicting pace, let me add some context on why we might want to be able to predict pace. My eventual goal is to generate a model for predicted Over/Under lines for a game. Two major factors should go into predicting the total number of points scored in a game. First, we need to predict how many possessions there will be. Second, we would like to know the average points scored on each of those possessions. This article answers the first question. A later article will address offensive and defensive efficiency. A final article will hopefully describe our Over/Under model and measure its accuracy (provided I can locate some historical over/under line data).

Three Models for Predicting NBA Pace Stats

In a previous article, I talked quite a bit about model complexity and how, often, the best models are those that are either ‘explainable’ – meaning based on some hypothesis – or otherwise ‘simple’ in some way.

Here, we will consider three explainable models for predicting the pace of an NBA game, each based on a different hypothesis. The three models (or hypotheses) are:

  • That the faster team dictates the pace of the game
  • That the slower team dictates the pace of the game
  • That the pace is the average of the two teams’ individual paces.

We analyzed each of these models by seeing how accurate they were at predicting paces of games in the NBA. To do this, we scraped the box score of every game in 2019 from basketball-reference. The dataset we used can be found in .csv format on our Data Page.

Experiments

The first step in all our experiments was to compute the average pace of each team, measured in possessions per 48 minutes. Then, for each of the three models, we went through every game in the season and computed the difference between the predicted number of possessions and the actual number of possessions.

The three models I suggested above I will refer to as the ‘max model’, the ‘min model’, and the ‘naïve average model’. The ‘max model’ says the pace of play will be equal to the pace of the faster team. The ‘min model’ says the pace of play will be equal to the pace of the slower team. The ‘naïve average model’ says that the pace of play will be equal to the average of the paces of the two teams. The reason for the adjective naïve here will become clear in the coming sections.

Let me give a quick example to make these models clearer. Suppose Team A plays at 105 possessions per 48 minutes and Team B plays at 98 possessions per 48. Then, if these two teams played each other:

  • The max model would predict the game to be played at a pace of 105 possessions per 48.
  • The min model would predict the game to be played at a pace of 98 possessions per 48.
  • The naïve average model would predict the game to be played at a pace of 101.5 possessions per 48.
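The three predictions in this example can be written as one-line functions. This is only a sketch; the function names are my own labels for the models described above.

```python
def max_model(pace_a, pace_b):
    """The faster team dictates the pace."""
    return max(pace_a, pace_b)


def min_model(pace_a, pace_b):
    """The slower team dictates the pace."""
    return min(pace_a, pace_b)


def naive_average_model(pace_a, pace_b):
    """The game is played at the average of the two teams' paces."""
    return (pace_a + pace_b) / 2


# Team A at 105 poss/48 and Team B at 98 poss/48, as in the example.
team_a, team_b = 105.0, 98.0
for model in (max_model, min_model, naive_average_model):
    print(model.__name__, model(team_a, team_b))
```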

It turns out that the naïve average model is the most accurate of the three! How did we determine this? We used these three models to predict every game’s pace and measured the average error between the actual number of possessions and the predicted number of possessions. We measured these errors in both L1 (average absolute deviation) and L2 (root mean square error) to verify that mixing the two teams’ paces actually gives the best predictive power.
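The two error measures can be sketched as follows. The predicted and actual paces below are made-up numbers chosen to show the calculation; they are not taken from the dataset.

```python
import math


def mean_absolute_error(predicted, actual):
    """L1 error: the average absolute deviation between prediction and outcome."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_square_error(predicted, actual):
    """L2 error: the square root of the average squared deviation."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Made-up predicted and actual game paces (poss/48), for illustration only.
predicted = [100.0, 102.0, 98.0]
actual = [103.0, 102.0, 94.0]
print(round(mean_absolute_error(predicted, actual), 2))
print(round(root_mean_square_error(predicted, actual), 2))
```

RMSE penalizes large misses more heavily than the average absolute error does, so checking both guards against a model that is usually close but occasionally far off.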

We also included a baseline model for comparison. This ‘constant model’ predicts every game to have the same pace. The driving hypothesis here is that pace is fairly independent of who is playing. Clearly, this is incorrect, but the ‘constant model’ gives us a frame of reference. Here is what we found (in the following, poss = possessions).

  • The ‘constant model’ had an average error of 3.97 poss/48 and an RMSE of 5.47 poss/48
  • The ‘max model’ had an average error of 3.57 poss/48 and an RMSE of 4.94 poss/48
  • The ‘min model’ had an average error of 3.57 poss/48 and an RMSE of 5.03 poss/48
  • The ‘naïve average model’ had an average error of 3.34 poss/48 and an RMSE of 4.71 poss/48

Our first conclusion is: Using the average of the two teams’ paces to predict the game’s pace results in an increase in prediction accuracy of about 0.6 possessions/48 over the constant prediction model.

While it seems that the ‘naïve average model’ is the best way to predict a game’s pace, it turns out that we can do a little better with just a bit of calculus. Here is a thought experiment to motivate my point. In 2019, the Atlanta Hawks were the fastest team in the league. Therefore, in every game they played, the other team brought down the pace of play. So when we compute the Hawks’ pace of play as ‘106 possessions per 48 minutes’, this actually underrates how fast the Hawks play, because this 106 number is always half Hawks possessions and half possessions of another, slower team.

Computing Purified Pace Stats

To fix this ‘mixture’ issue, we solve an optimization problem that assigns to each team a pace number that is meant to reflect how fast that team plays independent of their opponents. In particular, if p_i is the pace of play for team i, then we want to find the values of the pace stats so that

\sum_{\text{Games}}\left(\frac{p_i+p_j}{2} - \text{Num. Poss.}\right)^2

is minimized, where p_i, p_j are the paces of the two teams involved in the game. In this way, the fact that the Hawks always played teams slower than them is accounted for because the predicted game pace takes into account both their pace and their opponent’s pace. This model will be called the ‘purified average model’ because we use each team’s true pace (which has been ‘purified’ from an observed mixture) and average them to predict the pace of a game.
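The sketch below shows one way to minimize this sum of squares, using coordinate descent in plain Python. This is my own choice of solver, not necessarily the one used for the results in this article. Setting the derivative with respect to one team's pace to zero, with all other teams held fixed, gives that team's best value as the average of (2 × game pace − opponent pace) over its games. The three-team schedule is made up so that the answer is easy to check by hand.

```python
def fit_purified_paces(games, n_teams, sweeps=500):
    """Fit one pace per team so that (p_i + p_j) / 2 matches each game's pace.

    games is a list of (team_i, team_j, game_pace) tuples with integer team
    ids in range(n_teams) and team_i != team_j.
    """
    # Start every team at the overall mean game pace.
    mean_pace = sum(pace for _, _, pace in games) / len(games)
    paces = [mean_pace] * n_teams
    for _ in range(sweeps):
        for team in range(n_teams):
            # With opponents held fixed, the least-squares optimum for this
            # team is the mean of (2 * game_pace - opponent_pace).
            targets = [2 * pace - paces[j if i == team else i]
                       for i, j, pace in games if team in (i, j)]
            if targets:
                paces[team] = sum(targets) / len(targets)
    return paces


# Made-up round robin between three hypothetical teams (ids 0, 1, 2).
games = [(0, 1, 105.0), (0, 2, 103.0), (1, 2, 98.0)]
purified = fit_purified_paces(games, n_teams=3)
print([round(p, 2) for p in purified])
```

In this toy schedule, team 0's traditional pace is the average of its two games, 104, while the fitted pace comes out at 110. This is the same effect described for the Hawks: the fastest team's observed pace is pulled down by its slower opponents. Because the problem is a linear least-squares problem, a standard solver such as `numpy.linalg.lstsq` could be used instead of the iteration above.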

Using the ‘purified average model’, the Hawks’ pace of play estimate increased from 106 to 111 possessions per 48 minutes. This 111 number more clearly represents how fast the Hawks play without the average being brought down by the effects of opponents playing slower. We will now verify the accuracy of this model by comparing it against two of the models from the previous section. For reference, I’ll include those two models’ accuracies again in the following bullet points. We find:

  • The ‘constant model’ had an average error of 3.97 poss/48 and an RMSE of 5.47 poss/48
  • The ‘naïve average model’ had an average error of 3.34 poss/48 and an RMSE of 4.71 poss/48
  • The ‘purified average model’ had an average error of 3.11 poss/48 and an RMSE of 4.4 poss/48

Notice that this more advanced model has garnered another significant increase in accuracy. To put this into perspective, the naïve average model reduced the prediction error relative to the constant model by 16%. The purified average model reduced the prediction error relative to the naïve average model by a further 7%.
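The percentages above follow from the average errors listed in the bullet points. A short check, assuming the reductions are computed on the average (L1) error rather than on the RMSE:

```python
def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Average errors (poss/48) as reported in the bullet points above.
constant_error = 3.97
naive_error = 3.34
purified_error = 3.11

print(f"{relative_reduction(constant_error, naive_error):.1%}")   # 15.9%
print(f"{relative_reduction(naive_error, purified_error):.1%}")   # 6.9%
```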

2019 NBA Pace Stats

Shown in the table below are the team pace statistics from the 2018-2019 season. The ‘Pace’ column is simply the team’s average number of possessions/48 computed in the traditional way. The ‘Purified Pace’ column shows the pace of every team computed using the method described above.

Team                     Pace    Purified Pace
Atlanta Hawks            106.0   110.8
New Orleans Pelicans     105.2   108.9
Sacramento Kings         104.9   108.1
Los Angeles Lakers       104.9   108.0
Milwaukee Bucks          104.4   108.0
Oklahoma City Thunder    104.6   107.7
Los Angeles Clippers     103.6   105.4
Philadelphia 76ers       103.2   105.0
Washington Wizards       103.2   105.0
Brooklyn Nets            103.1   104.4
Phoenix Suns             102.5   103.0
Golden State Warriors    102.1   102.8
Minnesota Timberwolves   102.3   102.7
Utah Jazz                102.0   102.5
Boston Celtics           101.5   101.2
New York Knicks          101.4   101.0
Dallas Mavericks         100.8    99.7
Toronto Raptors          100.9    99.7
Portland Trail Blazers   100.7    99.6
Chicago Bulls            100.6    99.4
Charlotte Hornets        100.6    99.3
Miami Heat               100.0    98.3
Orlando Magic             99.6    97.6
Indiana Pacers            99.7    97.6
San Antonio Spurs         99.5    97.4
Houston Rockets           99.7    97.3
Detroit Pistons           99.2    96.5
Denver Nuggets            98.9    96.2
Cleveland Cavaliers       98.3    94.8
Memphis Grizzlies         98.1    94.0

Takeaways

There are two main takeaways from this study. The first is that, in the NBA, neither team really dictates the pace on its own. The common (and often contradictory) claims that either the faster or the slower team controls the game flow are not borne out in the data. In fact, the best way to predict the pace of the game is by averaging the two teams’ paces.

Second, measuring a team’s pace by counting total possessions and normalizing to obtain ‘possessions per 48’ is flawed, because the data point one obtains in this way is actually a mixture of the team’s pace and the pace of all its opponents. In order to best represent a team’s pace and to best predict the pace of future games, we should ‘un-mix’ the observed game paces by solving an optimization problem.

So, that concludes our study of predicting pace in NBA games. In the coming follow-up articles, we will study offensive and defensive efficiency of teams in a pace-controlled setting.