Implied Vegas Ratings

It feels like no matter how hard anyone tries, nobody can beat Vegas. No matter which model or which year you look at, Vegas lines are always near the top of lists of the most accurate predictors.

There are a lot of reasons for this. First, Vegas is incentivized to make accurate models so they make the most money. They routinely hire the best data scientists and mathematicians to ensure their models are accurate.

In addition to this, Vegas lines adjust to the public’s opinion as bets come in. Even if the average sports fan is not the brightest, if you average the opinions of a lot of sports fans you get pretty accurate results. This is the wisdom of the crowd at work, much like ensemble learning in machine learning.

What Vegas spreads don’t tell you, though, is overall team ratings or power rankings. They just tell you the spread: how much better one team is than another in a single game. We’re interested in using Vegas spreads to infer overall team rankings and ratings. We call these implied Vegas ratings.

This article looks at exactly this problem. Can we combine spread data to compute overall team ratings? What is the best way to do this? How accurate is this technique compared to other computerized sports rating systems?

Intuition for Implied Vegas Ratings

I’ll describe the process with an example. Hopefully this explains how everything works in an easy-to-understand way! Let’s pretend we’re talking about football for now.

Suppose in week 1, Vegas has team A 4 points better than team B and team C 3 points better than team D. This can be summarized with the equations A \approx B+4 and C \approx D+3.

Notice that this doesn’t tell us how good A and B are relative to C and D. We have no notion of whether A is way better than C, about the same, or much worse. Where do we go from here?

Well, luckily in week 2 the matchups switch! Now suppose the Vegas lines say that team A is 5 points better than team C and that team B is favored over team D by 4. We get two new equations: A \approx C+5 and B \approx D+4.

Now we have four equations and four unknown variables, which should be enough to solve for the ratings!

| Team | Rating |
| --- | --- |
| A | 4 |
| B | 0 |
| C | -1 |
| D | -4 |

You can check for yourself, but these ratings satisfy each of the four equations we wrote above!
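If you would rather let a computer do the checking, here is a minimal sketch in Python. The names `spreads` and `ratings` are just illustrative, and each line is stored as (favorite, underdog, points).

```python
# Vegas lines from the example, stored as (favorite, underdog, points).
spreads = [
    ("A", "B", 4),  # week 1
    ("C", "D", 3),  # week 1
    ("A", "C", 5),  # week 2
    ("B", "D", 4),  # week 2
]

# Candidate ratings from the table above.
ratings = {"A": 4, "B": 0, "C": -1, "D": -4}

# For each line, compare the rating difference with the posted spread.
# A residual of zero means the equation is satisfied exactly.
residuals = [ratings[fav] - ratings[dog] - pts for fav, dog, pts in spreads]
print(residuals)  # [0, 0, 0, 0]
```

Every residual is zero, so all four equations hold.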

This is really the whole idea, but the math is a bit more interesting than meets the eye. In the next few subsections we’re going to highlight a few things you may have missed.

Ratings are Relative

You can verify for yourself that the following ratings are also a solution to each of the four equations we wrote down using Vegas lines:

| Team | Rating |
| --- | --- |
| A | 7 |
| B | 3 |
| C | 2 |
| D | -1 |

So what gives? How can we have two different solutions to the same set of equations? The problem is that all of the data we get is in the form of comparisons. Only the difference between two teams’ ratings matters. Adding the same amount to every team’s rating doesn’t change any spread.

Notice that this second table is the first table with every team’s rating increased by three points. This changes the teams’ overall ratings, but it won’t change any of the spreads. The overall rating doesn’t matter, only differences do.

No matter what we do, our system of equations will always have infinitely many solutions because of this fact. But this isn’t a problem, because all we care about is comparisons between teams anyway.
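To make this concrete, the sketch below shifts every rating by three points and confirms that no spread changes. It also shows one common way to pick a single answer out of the infinitely many: center the ratings so the league average is zero. That convention is our assumption for illustration, and the helper names `spread` and `center` are made up.

```python
ratings = {"A": 4, "B": 0, "C": -1, "D": -4}
shifted = {team: value + 3 for team, value in ratings.items()}

def spread(r, favorite, underdog):
    """Implied spread: the difference between two teams' ratings."""
    return r[favorite] - r[underdog]

matchups = [("A", "B"), ("C", "D"), ("A", "C"), ("B", "D")]
same_spreads = all(
    spread(ratings, fav, dog) == spread(shifted, fav, dog)
    for fav, dog in matchups
)

def center(r):
    """Shift ratings so that they average to zero (an assumed convention)."""
    mean = sum(r.values()) / len(r)
    return {team: value - mean for team, value in r.items()}

print(same_spreads)                         # True
print(center(ratings) == center(shifted))   # True
```

Both tables collapse to the same centered ratings, which is another way of seeing that they describe the same league.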

How Many Weeks of Data Do We Need?

In the above example, we were able to figure out the teams’ rankings after just two weeks. How many weeks will it take in general?

Luckily we can use our high school algebra again. If a league has N teams, then we have N ratings to figure out, which would normally take N equations. But as the last section showed, ratings are only determined up to a common shift, so we can pin one team’s rating to any value we like and we only need N-1 independent equations.

If all the teams are paired up against each other, we get \frac{N}{2} games each week, each corresponding to an equation. So, in general, after two weeks we have N equations, which is enough by this count no matter how big the league is.

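There are caveats, though. Counting equations only works when the equations are independent, so teams have to play different opponents in consecutive weeks. In the NFL, this is no problem. In baseball, we actually have to wait for the next series if we want to use this approach, because each game of a series gives us the same equation again. The games played so far also have to link every team to every other team through some chain of opponents. In a large league, two weeks of games can leave groups of teams that are not yet connected to each other, in which case we need to wait a bit longer.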
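One way to test whether a set of games is enough is to check that the schedule connects every team to every other team through some chain of opponents. The sketch below does this with a breadth-first search. The function name `is_connected` and the tiny schedules are made up for illustration.

```python
from collections import defaultdict, deque

def is_connected(teams, games):
    """Return True if every team is linked to every other via a chain of games."""
    neighbors = defaultdict(set)
    for a, b in games:
        neighbors[a].add(b)
        neighbors[b].add(a)
    teams = list(teams)
    if not teams:
        return True
    # Breadth-first search starting from the first team.
    seen = {teams[0]}
    queue = deque([teams[0]])
    while queue:
        current = queue.popleft()
        for other in neighbors[current]:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return all(team in seen for team in teams)

teams = ["A", "B", "C", "D"]
week1 = [("A", "B"), ("C", "D")]
week2 = [("A", "C"), ("B", "D")]
rematch = [("A", "B"), ("C", "D")]  # same opponents again, like a baseball series

print(is_connected(teams, week1))            # False
print(is_connected(teams, week1 + week2))    # True
print(is_connected(teams, week1 + rematch))  # False
```

After week 1 alone, or after a week of rematches, the A/B pair never meets the C/D pair, so the two groups cannot be placed on the same scale.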

Home Field Advantage

The spreads that Vegas provides take into account home field advantage. In the NFL (and, coincidentally, the NBA too) home field advantage is worth about 3 points. That means that if the Vegas spread says team A is 4 points better than team B and team A is the home team, then A is only 1 point better than B on a neutral field. On team B’s home turf, they would even be favored by about 2 points.

When we solve our system of equations, we will take this into account. The easiest way to do this is to introduce a new variable “H” which represents home field advantage.

If we do this, our system of equations will change. For example, if team A is the home team and is favored by 4 points, then we can write A + H \approx B + 4. This equation says “team A plus the home field advantage is four points better than B”.

Then, we just solve for the home field advantage in addition to every team’s rating.
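Here is a rough sketch of what that looks like with NumPy. The lines below are hypothetical: we made them up from the earlier ratings plus a 3-point home edge, and each one is stored as (home team, away team, points by which the home team is favored). The extra column of ones in the matrix is the variable H.

```python
import numpy as np

teams = ["A", "B", "C", "D"]
index = {team: i for i, team in enumerate(teams)}

# Hypothetical lines: (home, away, points by which the home team is favored).
# A negative number means the away team is favored.
games = [
    ("A", "B", 7.0),
    ("D", "C", 0.0),
    ("C", "A", -2.0),
    ("B", "D", 7.0),
]

# One row per game, encoding: home + H - away = spread.
X = np.zeros((len(games), len(teams) + 1))
y = np.zeros(len(games))
for row, (home, away, pts) in enumerate(games):
    X[row, index[home]] = 1.0
    X[row, index[away]] = -1.0
    X[row, -1] = 1.0  # last column is the home field advantage H
    y[row] = pts

# lstsq returns the minimum-norm least-squares solution, which handles
# the fact that team ratings are only defined up to a common shift.
solution, *_ = np.linalg.lstsq(X, y, rcond=None)
ratings = dict(zip(teams, solution[:-1]))
home_edge = solution[-1]

print(round(float(home_edge), 2))
print(round(float(ratings["A"] - ratings["B"]), 2))
```

With these made-up lines the solver recovers a home edge of 3 points and a neutral-field gap of 4 points between A and B. Note that H can only be pinned down when the home and away roles are mixed up enough across the schedule.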

Lines aren’t Perfect

The last thing we have to take into account is that Vegas lines are imperfect. In the first example we gave, the lines matched up so that all of the equations were solved perfectly. But Vegas lines won’t always do that. It won’t always be possible to solve for every team’s rating perfectly.

The mathematical way to talk about this is to say that the system of equations we’re solving will be overdetermined: there are more equations than the ratings can satisfy at once. In general, there won’t be an exact solution.

However, what we typically do in this case is find the ratings that solve the equations as closely as possible, usually in the least-squares sense. This can be done using many methods, including the Moore-Penrose pseudoinverse.
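As a small illustration, take the four lines from the first example but pretend Vegas has B favored over D by 5 instead of 4. That change is hypothetical, and it makes the lines disagree by one point. The sketch below uses NumPy’s `np.linalg.pinv` to find the least-squares ratings.

```python
import numpy as np

teams = ["A", "B", "C", "D"]
index = {team: i for i, team in enumerate(teams)}

# (favorite, underdog, points); the last line is altered so the
# four equations can no longer all hold at once.
spreads = [
    ("A", "B", 4.0),
    ("C", "D", 3.0),
    ("A", "C", 5.0),
    ("B", "D", 5.0),  # hypothetical: was 4 in the original example
]

# One row per line, encoding: favorite - underdog = points.
X = np.zeros((len(spreads), len(teams)))
y = np.zeros(len(spreads))
for row, (fav, dog, pts) in enumerate(spreads):
    X[row, index[fav]] = 1.0
    X[row, index[dog]] = -1.0
    y[row] = pts

# The pseudoinverse gives the least-squares solution with the smallest
# norm, which here means ratings that average to zero.
ratings = np.linalg.pinv(X) @ y
residuals = X @ ratings - y

print(dict(zip(teams, np.round(ratings, 3))))
print(np.round(residuals, 3))
```

No set of ratings can match all four lines here, so the solver spreads the one-point disagreement evenly: each line is missed by a quarter of a point.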

Testing our Theory

We looked at the 2022 NFL season to see how this method does at generating rankings. To determine if this is a good way of figuring out how good teams are, we need to see how accurate these rankings are at predicting the outcomes of games.

The table below shows some of the results. We compare the accuracy of the implied Vegas ratings to that of the opening line. We also compare against a model that solves a similar system of equations but uses the actual margin of victory instead of the predicted margin of victory.

| Model | % of Games Predicted Correctly | RMSE of Predicted Margin of Victory (points) |
| --- | --- | --- |
| Opening Line | 66.0% | 11.5 |
| Implied Vegas | 63.4% | 12.2 |
| Simple Margin | 61.6% | 12.7 |
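For reference, the two columns can be computed along the following lines. This is only a sketch: the margins below are made-up numbers rather than 2022 NFL data, the function name `evaluate` is ours, and we assume a prediction counts as correct when it has the same sign as the actual margin, so a tie or a predicted pick’em counts as a miss.

```python
import math

def evaluate(predicted, actual):
    """Return (share of winners picked correctly, RMSE of the margin).

    Margins are from the same team's point of view in both lists.
    """
    correct = sum(1 for p, a in zip(predicted, actual) if p * a > 0)
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return correct / len(actual), math.sqrt(squared_error / len(actual))

# Made-up predicted and actual margins for four games.
predicted = [4.0, -3.0, 1.0, 6.0]
actual = [7.0, 3.0, 10.0, 3.0]

accuracy, rmse = evaluate(predicted, actual)
print(accuracy)        # 0.75
print(round(rmse, 2))  # 5.81
```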

First things first: the implied Vegas ratings do better than the model built on actual game scores. They pick more winners correctly (63.4% versus 61.6%) and have a lower error in predicting the actual margin of victory (12.2 versus 12.7).

But… they don’t do as well as they should. The Vegas opening line is the best of all these models, both at predicting winners and at predicting margins. Our implied Vegas ratings are built from Vegas data, yet they don’t do as well as the raw lines. Something is off.

Improving the Model

The problem is that data from the beginning of the season isn’t always relevant to the end of a season. Sometimes teams get way better from week 1 to week 16. Sometimes a quarterback gets hurt and the team’s overall quality decreases dramatically in the span of a few seconds.

The best models should emphasize more recent data more heavily. There are a bunch of different ways to do this.

The first is to weight more recent games more heavily. Maybe last week’s game is worth twice as much as the first game of the season. Maybe it is worth four times as much. It is more an art than a science, and we should try lots of different options to see what works best. This will be called the “Weighted Implied Vegas Ratings”.
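One way to do the weighting is weighted least squares: multiply each equation by the square root of its weight before solving. The sketch below is an assumption about how this could be set up, not necessarily the exact scheme behind the results in this article. The parameter `ratio` says how many times more the latest week counts than the first week, with weights growing geometrically in between.

```python
import numpy as np

def weighted_ratings(teams, lines, ratio=2.0):
    """Weighted least-squares ratings.

    lines: list of (week, favorite, underdog, points).
    ratio: weight of the latest week relative to the first week.
    """
    index = {team: i for i, team in enumerate(teams)}
    weeks = [line[0] for line in lines]
    first, last = min(weeks), max(weeks)
    span = max(last - first, 1)  # avoid dividing by zero with one week

    X = np.zeros((len(lines), len(teams)))
    y = np.zeros(len(lines))
    weights = np.zeros(len(lines))
    for row, (week, fav, dog, pts) in enumerate(lines):
        X[row, index[fav]] = 1.0
        X[row, index[dog]] = -1.0
        y[row] = pts
        weights[row] = ratio ** ((week - first) / span)

    # Scaling rows by sqrt(weight) turns ordinary least squares
    # into weighted least squares.
    root = np.sqrt(weights)
    solution, *_ = np.linalg.lstsq(X * root[:, None], y * root, rcond=None)
    return dict(zip(teams, solution))

# Hypothetical lines: A was favored by 2 early on and by 5 more recently.
lines = [(1, "A", "B", 2.0), (3, "A", "B", 5.0)]
even = weighted_ratings(["A", "B"], lines, ratio=1.0)
recent = weighted_ratings(["A", "B"], lines, ratio=2.0)
print(round(float(even["A"] - even["B"]), 2))      # 3.5
print(round(float(recent["A"] - recent["B"]), 2))  # 4.0
```

With equal weights the implied gap is the plain average of the two lines. Doubling the weight on the recent line pulls the gap toward 5.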

The second is to totally discard older games. Earlier in this article, we mentioned that full rankings could be computed in as little as two weeks. Discarding data from three weeks back or further will ensure that we only use the most recent data in computing our rankings. This will be called the “Windowed Implied Vegas Ratings”.
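A windowed version is mostly a filtering step in front of the same solver. In the sketch below, the helper names `recent_lines` and `fit_ratings`, the `window` parameter, and the lines themselves are all made up for illustration.

```python
import numpy as np

def recent_lines(lines, current_week, window=2):
    """Keep only lines from the `window` most recent weeks."""
    return [line for line in lines
            if current_week - window < line[0] <= current_week]

def fit_ratings(teams, lines):
    """Least-squares ratings from (week, favorite, underdog, points) lines."""
    index = {team: i for i, team in enumerate(teams)}
    X = np.zeros((len(lines), len(teams)))
    y = np.zeros(len(lines))
    for row, (_, fav, dog, pts) in enumerate(lines):
        X[row, index[fav]] = 1.0
        X[row, index[dog]] = -1.0
        y[row] = pts
    solution, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(teams, solution))

teams = ["A", "B", "C", "D"]
lines = [
    (1, "A", "B", 10.0), (1, "C", "D", 10.0),  # stale early-season lines
    (2, "A", "C", 5.0), (2, "B", "D", 4.0),
    (3, "A", "B", 4.0), (3, "C", "D", 3.0),
]

kept = recent_lines(lines, current_week=3, window=2)
windowed = fit_ratings(teams, kept)
print(len(kept))
print(round(float(windowed["A"] - windowed["B"]), 2))
```

Only the four lines from weeks 2 and 3 survive the filter, so the stale 10-point lines from week 1 have no effect on the ratings.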

We tested these theories to see if they improved our predictive accuracy.

| Model | % of Games Predicted Correctly | RMSE of Predicted Margin of Victory (points) |
| --- | --- | --- |
| Opening Line | 66.0% | 11.5 |
| Implied Vegas Ratings | 63.4% | 12.2 |
| Weighted Implied Vegas Ratings | 64.9% | 12.1 |
| Windowed Implied Vegas Ratings | 66.0% | 12.1 |

Notice that both changes improve the model, mostly in picking winners. The windowed implied Vegas ratings model is as accurate as the opening line at predicting the winners of games (66.0%). It is only a bit worse than the opening line at predicting the margin of victory (an RMSE of 12.1 versus 11.5).

Conclusions

The ideas presented in this article allow us to build power rankings that, at least for the 2022 NFL season, are about as accurate as the Vegas opening line at predicting the winners of games. We will continue to push this idea further in the future.