Building a Baseball Elo Model using Betting Data
We’ve discussed previously how to use Vegas data to build models for NFL games. Can we build a baseball Elo model using the same ideas? The simple idea is that Vegas point spreads give us information about how good teams are relative to each other. Aggregating this data over lots of games lets us rank all the teams relative to each other by solving a system of equations.
We are going to continue this trend but apply the ideas to a different sport. In this article we build a baseball Elo model to predict the outcome of MLB games. To predict outcomes on day X, we let ourselves use Vegas lines from day X-1, X-2, all the way back to the beginning of the season.
The tool we use is Elo ratings. We assign an Elo rating to each team and each starting pitcher. Then we use these ratings to predict the probability of one team winning. The secret sauce is in converting Vegas line data to Elo ratings. Because we use Vegas data to compute Elo ratings, we’re calling this Baseball Elo model the Implied Elo Model. The calculations are described at the end of this article. We begin with the results.
Quick Elo Refresher
Elo ratings are perhaps better known as the “chess rating system”. In an Elo rating system, each team gets a numeric score. Higher ratings refer to better players/teams/etc. The player/team/etc. with the higher rating is expected to win in a match. The larger the Elo rating difference, the bigger the gap in team quality.
Typically, a team’s Elo rating is determined by “stealing” points from the teams it beats and “losing” points to the teams that beat it. Heavy favorites steal only a very small amount when they win but lose a lot when they get beaten. The gains and losses in 50-50 matchups are more equal.
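To make the “stealing points” idea concrete, here is a minimal sketch of the classic Elo update. The K-factor of 20 and the example ratings are assumptions chosen for illustration, and this is the traditional update rule, not the Vegas-based fitting method used later in this article.

```python
def expected_score(rating_a, rating_b):
    """Classic Elo expected score (win probability) for side A."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update(rating_a, rating_b, a_won, k=20.0):
    """Return both ratings after one game; k=20 is an assumed K-factor."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses (and vice versa).
    return rating_a + delta, rating_b - delta


# Hypothetical heavy favorite (1700) against an underdog (1500).
gain_as_favorite = update(1700, 1500, a_won=True)[0] - 1700
loss_as_favorite = 1700 - update(1700, 1500, a_won=False)[0]
# Evenly matched teams swap exactly half of k.
gain_even_match = update(1500, 1500, a_won=True)[0] - 1500

print(round(gain_as_favorite, 2), round(loss_as_favorite, 2), round(gain_even_match, 2))
```

The favorite gains far less from a win than it loses from an upset, while the even matchup moves both teams by the same amount.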
In our model, we assign an Elo rating to both a team and its starting pitcher. Because pitchers influence outcomes so heavily and because they change every game, this is a necessary change. Before describing the math behind our Vegas-implied Baseball Elo Model, let’s look at the results.
Baseball Elo Model Results
First, let’s test how our Baseball Elo model does qualitatively. We begin by looking at which teams and pitchers our model identifies as “the best” and checking whether these results agree with conventional wisdom. We ran the numbers on the 2021 baseball season for this first analysis.
The table below contains the top 5 teams and pitchers from the 2021 season according to the Implied Elo Model. Note that in these rankings “team” refers to all aspects of a team excluding starting pitchers. That is, “Team” rankings are composite rankings of a team’s offense and bullpen.
Rank | Team (Elo Rating) | Pitcher (Elo Rating) |
---|---|---|
1 | Dodgers (+77) | Yu Darvish (+104) |
2 | Rays (+55) | Joe Musgrove (+94) |
3 | Braves (+55) | Shane Bieber (+88) |
4 | Astros (+49) | Gerrit Cole (+87) |
5 | White Sox (+44) | Blake Snell (+82) |
Let’s compare these ratings to other stats. What if we look at the offensive output of these 5 teams? Then the teams rank as follows: Dodgers (4th), Rays (2nd), Braves (8th), Astros (1st), and White Sox (7th). All five rank in the top eight offensively, so the team ratings line up reasonably well with on-field results!
What about the pitchers? For this stat we’ll rank the pitchers by cumulative win probability added. We get: Yu Darvish (238th), Joe Musgrove (78th), Shane Bieber (74th), Gerrit Cole (8th), and Blake Snell (165th).
What happened with the pitchers? To me, it seems like the players’ performance in the 2020 season influenced these ratings. In 2020, Darvish finished 3rd, Musgrove 205th, Bieber 1st (!!), Cole 42nd, and Snell 67th.
Pitching is very swingy, and players can have seemingly bad seasons out of nowhere. This means that Vegas, and every other model for that matter, can have a hard time gauging which pitchers are the most impactful in a given season. I think refining this Baseball Elo model to better predict pitcher impacts will be valuable, but for now let’s stick with this model.
Prediction Power
The ultimate test of this idea is how accurate the resulting Baseball Elo model is. To test the accuracy of the Implied Elo Model, we need something to compare against. We’ll use the following methods to predict baseball games:
- Implied Elo Model
- Pick the home team to win
- Pick the team with the better Pythagorean expectation
- Pick the Vegas favorite to win
Models 2-4 above are ordered in “increasing complexity” and accuracy. We expect Vegas to be the most accurate. Then, the Pythagorean expectation method should be the next most accurate. Finally, picking the home team to win should be more than 50% accurate, but not a very good model otherwise.
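For reference, the Pythagorean expectation baseline estimates a team’s winning percentage from its runs scored and runs allowed. The sketch below assumes the classic exponent of 2 (the article does not say which exponent was used), and the run totals are made up for illustration.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Estimated winning percentage from runs scored and allowed.

    exponent=2.0 is the classic choice and is an assumption here.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def pick_by_pythagorean(home_runs, away_runs):
    """Pick 'home' or 'away' by the better expectation.

    Each argument is a (runs_scored, runs_allowed) tuple for the season so far.
    Ties go to the home team, which is an arbitrary choice for this sketch.
    """
    home = pythagorean_expectation(*home_runs)
    away = pythagorean_expectation(*away_runs)
    return "home" if home >= away else "away"


# Hypothetical season-to-date run totals.
strong_team = (800, 700)
weak_team = (650, 720)
print(round(pythagorean_expectation(*strong_team), 3))
print(pick_by_pythagorean(weak_team, strong_team))
```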
Finding out where the implied Elo model ranks in this hierarchy is a good measure of how good this model is. To test the overall Baseball Elo model accuracy, we pulled the data from all games in the 2021 MLB season and tested how accurate each of these methods was.
The table below contains the accuracy of each of these models.
Model | Home Team | Pythagorean | Implied Elo | Vegas |
---|---|---|---|---|
Accuracy | 53% | 55% | 58.4% | 59.0% |
This is almost exactly what we expected to see! The implied Elo model does better than the simpler methods, including one “advanced stat”. However, the Implied Elo model does slightly worse than just using the Vegas opening line to pick the winners.
One other thing to look at is if there are differences between using the opening line or closing line.
Open/Closing Line Difference
Bettors constantly argue over which is more accurate: the Vegas opening line or the closing line. The opening line relies only on computerized models to predict the winner. The closing line has been shifted by public betting, so it reflects the “wisdom of the masses”. So which is more accurate, the computers or the public?
The table below contains data answering this question. First, we looked at how accurately the opening line picked the correct winners versus how often the closing line did. Then, we looked at the Implied Elo model’s accuracy using either opening line data or closing line data.
Accuracy of Models | Implied Elo | Vegas |
---|---|---|
Opening Line | 58.4% | 59.0% |
Closing Line | 58.9% | 59.6% |
In this sample, closing lines were more accurate than opening lines in baseball, though the gap is only about half a percentage point over a single season. Perhaps even more interestingly, our implied Elo model comes within about 0.6 percentage points of the Vegas opening line in picking the winners of baseball games. That makes this model a good starting point for anyone who wants to build a model to beat Vegas. If we can more accurately predict which pitchers have the highest impacts, this model could plausibly outperform the Vegas opening lines, giving opportunities for profitable betting!
The remainder of this article describes the mathematics of how to use Vegas data to infer Elo ratings for teams and players. General audiences can safely stop reading now, but those interested in continuing this line of research or developing their own models should read the description below.
Using Money Lines to Assign Elo Ratings
The data we rely on is the Vegas opening line. The opening line can be converted to an implied probability of the home team winning. Call this E_H . The line also gives an implied probability of the away team winning E_A . Because of Vegas’ vig these probabilities don’t add to 1 like they should. Therefore we can renormalize and set E_H’ = \frac{E_H}{E_H+E_A} . Do the same to define E_A’.
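As a rough sketch of this step, the code below converts a pair of lines into implied probabilities and then removes the vig by renormalizing. It assumes the lines are quoted as American moneylines (for example -150 / +130); the article does not state the odds format, and the function names and example odds are hypothetical.

```python
def implied_probability(moneyline):
    """Implied win probability from an American moneyline (assumed format)."""
    if moneyline < 0:
        # Favorite: risk |moneyline| to win 100.
        return -moneyline / (-moneyline + 100)
    # Underdog: risk 100 to win moneyline.
    return 100 / (moneyline + 100)


def remove_vig(e_home, e_away):
    """Renormalize so the two probabilities add to 1."""
    total = e_home + e_away
    return e_home / total, e_away / total


# Hypothetical opening line: home team -150, away team +130.
e_home = implied_probability(-150)
e_away = implied_probability(130)
e_home_norm, e_away_norm = remove_vig(e_home, e_away)

print(round(e_home + e_away, 4))            # greater than 1 because of the vig
print(round(e_home_norm, 4), round(e_away_norm, 4))
```

Dividing both sides by the same total is the simplest way to remove the vig; it matches the renormalization described above.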
The next step is to convert probabilities of winning to implied Elo advantages. Let T_H, P_H be the home team rating (T) and home pitcher rating (P). Define the away ratings T_A, P_A the same way. The home team’s combined Elo rating is T_H+P_H and the away team’s combined Elo rating is T_A+P_A .
The probability that the home team wins the game in the Elo system is based on the difference between the home rating T_H+P_H and the away rating T_A+P_A . The formula that describes this relationship is E_H = \frac{1}{1+10^{(T_A+P_A-T_H-P_H)/400}} .
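The Elo formula translates directly into a few lines of code. This is a minimal sketch; the function name and the example ratings are hypothetical and are not taken from the tables above.

```python
def home_win_probability(t_home, p_home, t_away, p_away):
    """Elo probability that the home side wins.

    Each side's strength is its team rating plus its starting pitcher rating.
    """
    gap = (t_away + p_away) - (t_home + p_home)
    return 1.0 / (1.0 + 10 ** (gap / 400))


# Hypothetical matchup: a strong home team and pitcher against a weaker away side.
prob_home = home_win_probability(t_home=50, p_home=40, t_away=10, p_away=-20)
# Swapping the two sides gives the complementary probability.
prob_swapped = home_win_probability(t_home=10, p_home=-20, t_away=50, p_away=40)

print(round(prob_home, 3), round(prob_swapped, 3))
```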
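We want to solve a system of equations to estimate the parameters T_A, P_A, T_H, and P_H . The Elo equation can be converted into a linear equation in these variables by solving for the rating difference: T_A+P_A-T_H-P_H = 400\cdot \log_{10} (1/E_H-1) . Each game, with the renormalized Vegas probability substituted for E_H , contributes one such equation.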
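Solving the Elo formula for the rating difference gives the away-minus-home gap 400·log10(1/E_H − 1), which is negative whenever the home side is favored. Here is a small sketch of that inversion; the function name is hypothetical.

```python
import math


def implied_elo_gap(e_home):
    """Away-minus-home rating gap implied by the home win probability.

    Inverts E_H = 1 / (1 + 10 ** (gap / 400)); requires 0 < e_home < 1.
    """
    return 400 * math.log10(1 / e_home - 1)


gap_even = implied_elo_gap(0.5)        # coin flip: no rating gap
gap_home_fav = implied_elo_gap(0.6)    # home favored: gap is negative
# Round trip: plugging the gap back into the Elo formula recovers 0.6.
recovered = 1 / (1 + 10 ** (gap_home_fav / 400))

print(round(gap_home_fav, 1), round(recovered, 3))
```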
One wrinkle exists. In baseball, the starting pitcher changes from game to game. Therefore, there are usually multiple pitching Elo ratings that need to be estimated. These could be called P_{A,1}, P_{A,2} \dots .
Therefore, the goal is to solve the system of equations for all the pitcher ratings and all the team ratings. How many games does it take to have enough data to learn all these parameters? Assume for simplicity that each team has four starting pitchers, so that each team has five variables to solve for (the team rating and each of the four pitcher ratings). Across 30 teams that is 150 unknowns, and each game contributes one equation, so roughly 150 games of data need to be observed. That is about 1.5-2 weeks’ worth of games.
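The article does not say which solver was used, so the following is only one possible sketch: a Kaczmarz-style iteration that repeatedly nudges the four ratings in each game until the implied gap is matched. The team and pitcher names, the “true” ratings, and the schedule are all made up so the example has consistent lines to recover.

```python
import math


def implied_gap(e_home):
    """Away-minus-home Elo gap implied by the home win probability."""
    return 400 * math.log10(1 / e_home - 1)


def fit_ratings(games, sweeps=200):
    """Fit ratings from (home_team, home_pitcher, away_team, away_pitcher, e_home).

    Keys must be unique across teams and pitchers (e.g. "T:A" vs "P:a1").
    """
    ratings = {}
    for home_t, home_p, away_t, away_p, _ in games:
        for name in (home_t, home_p, away_t, away_p):
            ratings.setdefault(name, 0.0)
    for _ in range(sweeps):
        for home_t, home_p, away_t, away_p, e_home in games:
            current = (ratings[away_t] + ratings[away_p]
                       - ratings[home_t] - ratings[home_p])
            # Kaczmarz projection: spread the residual evenly over 4 unknowns.
            step = (implied_gap(e_home) - current) / 4
            ratings[away_t] += step
            ratings[away_p] += step
            ratings[home_t] -= step
            ratings[home_p] -= step
    return ratings


# Hypothetical "true" ratings, used only to generate consistent example lines.
true = {"T:A": 30, "T:B": 0, "T:C": -30, "P:a1": 40, "P:a2": -10,
        "P:b1": 20, "P:b2": -20, "P:c1": 10, "P:c2": -40}
schedule = [("T:A", "P:a1", "T:B", "P:b1"), ("T:B", "P:b2", "T:A", "P:a2"),
            ("T:A", "P:a2", "T:C", "P:c1"), ("T:C", "P:c2", "T:A", "P:a1"),
            ("T:B", "P:b1", "T:C", "P:c2"), ("T:C", "P:c1", "T:B", "P:b2")]


def true_home_prob(ht, hp, at, ap):
    return 1 / (1 + 10 ** ((true[at] + true[ap] - true[ht] - true[hp]) / 400))


games = [(ht, hp, at, ap, true_home_prob(ht, hp, at, ap))
         for ht, hp, at, ap in schedule]
fitted = fit_ratings(games)
```

The equations only pin down rating differences, so the fitted numbers need not equal the “true” ones even though they reproduce every game’s implied gap. Real Vegas lines will not be perfectly consistent with any single set of ratings, so a least-squares solver may be a better fit for real data.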
Finally, home field advantage gives a bit of a boost in baseball. Home teams win roughly 53% of games across baseball history. Converting this to an Elo rating means home field advantage is worth about 20 points. This means that the home team gets a +20 Elo boost when predicting who wins.
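The conversion from a 53% home win rate to an Elo boost can be checked with a couple of lines. This is a sketch with hypothetical function names; it uses the same Elo formula as above.

```python
import math


def probability_to_elo(p):
    """Elo advantage that corresponds to winning with probability p."""
    return 400 * math.log10(p / (1 - p))


def home_win_probability(home_rating, away_rating, home_edge):
    """Elo win probability for the home side, including a home-field boost."""
    gap = away_rating - (home_rating + home_edge)
    return 1 / (1 + 10 ** (gap / 400))


exact_edge = probability_to_elo(0.53)   # a little under 21 points
# Two evenly rated sides, with the rounded +20 boost for the home team.
even_matchup = home_win_probability(0.0, 0.0, 20.0)

print(round(exact_edge, 1), round(even_matchup, 3))
```

Rounding the boost down to 20 points gives the home side slightly less than a 53% chance in an otherwise even matchup.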
This method of assigning Elo ratings to teams and starting pitchers is the one we used to generate all the results in this article.