NEW Data Jocks NBA Model: The Sparse Impacts Model

The Sparse Impacts Model (SIM) is the second iteration of The Data Jocks’ NBA model. Some NBA models assign ratings to teams. An entirely different type of NBA model assigns ratings to players and infers team ratings as the sum of the players’ skills.

The Sparse Impacts Model fits somewhere in between. It gives individual ratings to the best players, and it gives team ratings to capture how good everyone else is combined. In this way it sits comfortably on the spectrum between team rating systems and player rating systems. The point of all this is to build a more accurate NBA model with understandable results, one that addresses the bias-variance tradeoff described below.

In this article we’re going to go into depth explaining the sparse impacts NBA model and how it ended up as the second iteration of The Data Jocks’ NBA Model. Most importantly, we’ll talk about how this model compares to existing models (like Sagarin, box plus minus, and the now-defunct 538 model) and addresses some of the problems in each.



Two Types of NBA Models

All NBA models are alike in that they assign ratings to players or teams. We use these ratings to figure out who is likely to win a game. A team with a higher rating is better than its opponent. A player with a really high rating is better than most other guys on the court.

Some models like Sagarin ratings or the original Data Jocks Ensemble Ratings assign ratings to teams as a whole. They do this by looking at the outcomes of games and figuring out the ratings that best match the data. Then by comparing two teams’ ratings, we can predict which teams will win games.

Models that assign ratings to players – like box plus minus or the defunct 538 CARMELO model – can be used to predict which teams will win games too. Adding up the ratings of the individual players on a team gives you an overall team rating. A team with a +5, a +3, a 0, a -1, and a -2 player should combine to make a team that is a net +5.

I like to think of these as two flavors of the same thing. At the end of the day, the end result is a rating for a team.

While the endpoint is the same, there are some important differences between these two approaches. More specifically, there are two fundamental factors in tension when building a model with either method: explanatory power and sample size.

Explanatory Power

When we estimate individual player quality instead of overall team quality, our model has higher explanatory power. This means that our model is better at predicting outcomes of games. Let’s see why with an example below.

What happens if Kawhi Leonard is sitting out for a random home game in December? A model based on rating players and assembling team ratings as the sum of the individual contributions can handle this. It can predict that the Clippers should be worse than normal on that night because they’re missing their star. The model which only assigns team ratings has no way to change its predicted outcome based on the knowledge that Kawhi is sitting. This leads to less accurate results.
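To make this concrete, here is a minimal sketch in Python. All of the ratings and roster names are made up for illustration; they aren’t output from any real model.

```python
# A toy comparison of the two model types, using made-up ratings.
# In a player-rating model, team strength is the sum of whoever
# actually plays, so sitting a star changes the prediction.

clippers = {
    "Kawhi Leonard": 8.0,
    "Paul George": 4.0,
    "Role Player A": 0.5,
    "Role Player B": -0.5,
    "Role Player C": -1.0,
}

def team_rating(players, sitting=frozenset()):
    """Sum the ratings of everyone who actually plays tonight."""
    return sum(r for name, r in players.items() if name not in sitting)

print(team_rating(clippers))                             # 11.0 at full strength
print(team_rating(clippers, sitting={"Kawhi Leonard"}))  # 3.0 with Kawhi resting

# A pure team-rating model stores a single number (say, 11.0) and has
# no mechanism to adjust it on nights when a star sits.
```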

All else held equal, a model which takes into account the strengths of individual players will perform better. On paper this seems to indicate that a player-rating-based model should win out. However, there is one big problem.

To read more about this, take a look at articles explaining the bias-variance tradeoff.

Sample Size and Multicollinearity

When forming player ratings instead of team ratings, the statistics problem becomes much harder. The more ratings we have to assign, the more complicated things get. One way to think of this is in terms of sample size. As you may remember from high school stats, bigger sample sizes lead to better estimates of unknown quantities.

In our setting, where we estimate many ratings at once, a useful heuristic is to treat the number of games played divided by the number of ratings being estimated as a proxy for sample size. For example, using 10 samples to assign one rating is roughly equivalent to using 20 samples to assign 2 ratings. When we have to assign a rating to every individual player, the effective sample size shrinks and our estimates get noisy. The quick sketch below shows the heuristic in code; after that, let’s try an example.
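The heuristic is a one-liner, but seeing the numbers makes the point. The 82-game season and 15-man roster are just illustrative values.

```python
# A rough "effective sample size" heuristic: games observed divided by
# the number of ratings being estimated. Illustrative numbers only.

def effective_sample_size(games_observed: int, num_ratings: int) -> float:
    return games_observed / num_ratings

# One team rating learned from an 82-game season:
print(effective_sample_size(82, 1))   # 82.0 games of signal

# Individual ratings for a full 15-man roster from those same games:
print(effective_sample_size(82, 15))  # ~5.5 games of signal per rating
```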

If we want to estimate how good Kawhi Leonard is, one way is to look at how good the Clippers are when he plays versus how good they are when he doesn’t. If he plays 50 games and rests 30, our 80-game season is really a 50-game sample and a 30-game sample being compared against each other. Imagine how much worse this problem gets if we want to compute ratings for Paul George and the other Clippers as well.

Assigning a rating to each player often leads to some funky results because of these small sample sizes. Even worse, sometimes we can’t separate the effects of two players because they play nearly all of their minutes together. To read more about this, check out our last article on multicollinearity; the sketch below shows the problem in miniature.
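Here is a tiny simulated illustration of that multicollinearity problem. The stint data is made up on the spot; nothing here comes from real NBA logs.

```python
import numpy as np

# Toy demonstration of multicollinearity with simulated stints.
# Player B is on the court in almost exactly the same stints as
# player A, so their individual effects are nearly inseparable.

rng = np.random.default_rng(0)
n_stints = 200
on_a = rng.integers(0, 2, n_stints).astype(float)  # 1 if A is playing
on_b = on_a.copy()
on_b[:3] = 1 - on_b[:3]            # B differs from A in only 3 stints

X = np.column_stack([on_a, on_b])
y = 3.0 * on_a + 2.0 * on_b + rng.normal(0, 5, n_stints)  # noisy margins

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)
# The combined effect (~5) is well estimated, but how it splits
# between A's true +3 and B's true +2 is nearly arbitrary here.
```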

The Tradeoff Between Models

The above two points show how player-rating models and team-rating models differ. Given an infinite amount of data, the player rating system is undoubtedly better because it has more explanatory power. The model is more granular in scope.

However, because we don’t have an infinite amount of data, it can be very difficult to assign meaningful ratings to every player who ever plays. The team-rating model is much simpler and has far fewer parameters to estimate.

The result is that the team-rating model provides meaningful results in many fewer games. The model learns quicker. Because the NBA changes so fast, learning speed is really important. If you take 3 or 4 seasons to learn how good a player is, then your model is useless. In that same amount of time, the player may have aged into their prime or gone over the hump into the tail end of their career.

Because of these competing factors, our model proposes a middle ground which balances explanatory power and sample size. Before introducing our model a bit more in depth, we want to talk about one other statistical consideration.

Priors, Sample Size, and Real Plus Minus

Assigning ratings to every player runs into a sample size problem. However, some stats, like ESPN’s real plus minus (RPM), do assign ratings to each and every player! How do they pull it off? Two tricks make this possible, and you can read more about RPM here.

First, real plus minus uses more than just the game score. It uses the full box score stats to figure out who the best players are. It uses raw plus minus, points, rebounds, assists, shooting percentage, etc. More information feeding into the stat means a larger sample size!

A lot of these stats also use the idea of a prior. A prior is exactly what it sounds like: a guess about the value of someone’s rating prior to gathering any data. In the basketball world, that means having a guess about how good a player is before we see a single game. Typically this is done by using data from past years to get a head start on the current year.

For example, even before seeing a single game of the 2023-2024 season, we think it is fairly likely that Nikola Jokic is a top 5 player. If we then see a handful of games of top-level performance, we can be reasonably certain that this guy is actually good. With no prior, a handful of games could be written off as an outlier and wouldn’t mean as much.
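Here is a minimal sketch of the prior idea, blending a preseason guess with early-season results. The prior weight and ratings are illustrative assumptions, not our model’s actual parameters.

```python
# A minimal sketch of blending a prior with observed data. The prior
# acts like extra "virtual games" at the preseason rating.

def blended_rating(prior, prior_weight, observed, games_played):
    """Weighted average of the prior guess and the observed rating."""
    total = prior_weight + games_played
    return (prior * prior_weight + observed * games_played) / total

# Preseason, suppose we peg Jokic at +7. After 5 games he looks like a +12.
print(blended_rating(7.0, prior_weight=20, observed=12.0, games_played=5))
# 8.0 -- we move toward the hot start, but the prior keeps us from
# overreacting to a handful of games.
```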

Priors are very helpful to bootstrap your model and counteract the small sample size problem. Our model makes use of them, but we’ll save the description of that for a later date.

A New NBA Model: The Sparse Impacts Model

Our new NBA model is called the Sparse Impacts Model. The idea is to assign ratings to only a sparse subset of the players as well as to each team. We start by figuring out the ratings for only the best players as these are the guys who provide the most impact. Then, we will combine the rest of the players into an overall “rest of the team” score.

For example, maybe our model determines that Kawhi Leonard is a +8 but the rest of the Clippers combine to a -2. Then, when Kawhi is on the court, the Clippers are a net +6 (his +8 contribution combined with a -2 from the rest of the team). When Kawhi is out, the Clippers play at their remaining -2 level.
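Here is a minimal sketch of that calculation, with made-up ratings; in practice, the model estimates these numbers from game data.

```python
# A minimal sketch of the Sparse Impacts idea. Only a sparse set of
# stars get individual ratings; everyone else is absorbed into a
# per-team "rest of team" rating.

star_ratings = {"Kawhi Leonard": 8.0}   # the sparse set of rated players
rest_of_team = {"Clippers": -2.0}       # everyone else, combined

def lineup_rating(team, stars_playing):
    """Rest-of-team baseline plus whichever rated stars are on the floor."""
    return rest_of_team[team] + sum(star_ratings[s] for s in stars_playing)

print(lineup_rating("Clippers", ["Kawhi Leonard"]))  # +6.0 with Kawhi
print(lineup_rating("Clippers", []))                 # -2.0 when he sits
```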

How does this solve the problems discussed with the existing models above? Remember, the two competing factors were explanatory power and sample size. Let’s start with explanatory power.

Because our model gives ratings to the most important players, we’re able to take into account rest days and injuries to better predict games. We get a better picture of the state of the NBA.

But remember that we don’t go all the way down the rabbit hole and rate every single player. By restricting our ratings to 30, or 60, or however many players we determine to be optimal, we keep the sample size relatively large.

Ideally, this mix of rating only the best players and combining everyone else into “rest of team” ratings gives us the best of both worlds. We should get good predictive ability without needing a ton of data. We should get a model that reacts quickly to changes or injuries. Most importantly, we should get a model that predicts winners of games very accurately.

Next Steps

The purpose of this article was to explain the rationale behind our newest model. In the next one, we’ll look at how our model performs compared to the others we’ve talked about. Everything sounds nice in theory, but what happens when we apply it to real data?

To receive email updates when new articles are posted, use the subscription form below!