Winning (and Losing!) Streaks in Baseball
Winning streaks in baseball can range from a few games to nearly a month. From the middle of August to early September in 2002, the Oakland Athletics went on a historic run, winning twenty games in a row. At the time, this was the longest winning streak in baseball since the Second World War. The streak was immortalized in the famous baseball analytics movie Moneyball. Billy Beane’s unconventional strategy finally paid off as the Athletics went on one of the longest winning streaks in baseball history.
On the other hand, this year’s White Sox have been historically terrible. Between July 10th and August 5th, a stretch that included the All-Star break, the Sox dropped 21 straight games. This surpassed their 14 game losing streak from earlier in the season.
In this article, we’re going to take a look at winning streaks in sports. On average, how long are teams’ longest winning streaks? How unlikely was it that the Athletics and White Sox went on their respective streaks? And finally, what do winning streaks teach us about momentum in sports?
Momentum in Sports
One of the most commonly debated topics in sports analytics is whether or not momentum actually exists. Athletes, coaches, and talking heads think there is no debate: YES momentum exists in sports. But sports analysts almost universally have found no evidence that momentum impacts outcomes in sports. The Data Jocks has studied this in the past when we looked at whether teams on hot streaks are more likely to win. That is, does momentum tend to carry over and make “wins beget wins”?
We found that the answer was no. Teams on streaks didn’t have higher chances of winning. The Data Jocks joined the ranks of analysts that have found momentum doesn’t impact sports. This matters for how we model winning streaks when judging how likely or unlikely the Athletics and White Sox streaks are.
How to Model Winning Streaks in Baseball
In order to compute how unlikely the streaks in question are, we need to know how long winning streaks are on average. The Athletics won 64% of their games. We can get a sense of the expected longest winning streak via simulation.
Think of it like this: flip a coin that comes up heads 64% of the time. This is easy to do on a computer with a random number generator. Flipping this coin 162 times is a pretty good simulation of an entire season. To see how long winning streaks can get, we find the longest streak of heads in our 162 flips. If we do this lots of times, we can compute the average longest streak as well as the standard deviation.
We’ll also do the same thing to simulate the losing streak for the White Sox. Instead of using a coin that comes up heads 64% of the time, we’ll use one that comes up tails 77% of the time and look for the longest streak of tails.
Where does momentum come in? Because momentum doesn’t really exist, we can flip the coin independently each time. The coin has no memory because it has no momentum.
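The coin-flip model described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for this article: the function names, the fixed seed, and the choice of Python's built-in `random` module are all assumptions made for the example.

```python
import random
import statistics


def longest_streak(outcomes):
    """Return the length of the longest run of True values."""
    best = current = 0
    for won in outcomes:
        current = current + 1 if won else 0
        best = max(best, current)
    return best


def simulate_longest_streaks(p, games=162, seasons=1000, seed=0):
    """Simulate `seasons` seasons of independent coin flips.

    Each game is "heads" with probability p, with no memory of earlier
    games. Returns the longest streak of heads from every season.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    results = []
    for _ in range(seasons):
        season = [rng.random() < p for _ in range(games)]
        results.append(longest_streak(season))
    return results


# 0.64 is the Athletics' winning rate; 0.77 is the White Sox losing rate.
for label, p in [("Athletics wins", 0.64), ("White Sox losses", 0.77)]:
    streaks = simulate_longest_streaks(p)
    print(label, statistics.mean(streaks), statistics.stdev(streaks))
```

Because every flip is drawn independently, the sketch has the "no momentum" assumption built in. Exact output will vary with the seed and the number of simulated seasons.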
Expected Longest Baseball Streaks
Our code simulated 1000 seasons for teams at all different winning percentages and computed their average longest winning streak. We also computed the standard deviation to know what level of variation is normal. The plot below contains the data we found.
Notice that as a team’s winning percentage increases, so too does their expected longest streak. Teams near .500 shouldn’t expect their longest winning streak to run much past 6 or 7 games, on average. But the league’s best teams, with a win rate near 70%, should expect their longest streak to reach about 12 games in a row on average.
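A sweep over winning percentages like the one described here could look roughly like the sketch below. It is an illustration under assumptions, not the article's actual code: the grid of percentages (30% to 70% in steps of 5), the seed, and the names `streak_summary` and `results` are all made up for the example.

```python
import random
import statistics


def longest_streak(outcomes):
    """Return the length of the longest run of True values."""
    best = current = 0
    for won in outcomes:
        current = current + 1 if won else 0
        best = max(best, current)
    return best


def streak_summary(p, rng, games=162, seasons=1000):
    """Mean and standard deviation of the longest streak at win rate p."""
    streaks = [
        longest_streak([rng.random() < p for _ in range(games)])
        for _ in range(seasons)
    ]
    return statistics.mean(streaks), statistics.stdev(streaks)


rng = random.Random(2024)  # hypothetical seed, chosen only for repeatability
results = {}
for pct in range(30, 75, 5):  # assumed grid: 30%, 35%, ..., 70%
    results[pct] = streak_summary(pct / 100, rng)
    mean, sd = results[pct]
    print(f"{pct}% win rate: mean longest streak {mean:.1f}, sd {sd:.1f}")
```

Plotting the means against the winning percentage, with the standard deviations as error bars, would give a chart of the kind discussed in this section.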
The same chart can tell us about losing streaks if we replace the x-axis with “Losing percentage” and the y-axis with “Longest losing streak”. The math is the same and the results are the same. This information is useful in analyzing the White Sox streak. The table below summarizes our findings.
Streak | Mean | Standard Deviation |
---|---|---|
Athletics Winning Streak | 9.6 | 2.7 |
White Sox Losing Streak | 15.8 | 4.8 |
The Likelihood of the Streaks
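Given a mean and a standard deviation, if we assume the data is normally distributed then calculating the probability of a streak is easy using z-scores. A z-score measures how many standard deviations an observation sits above the mean. Doing this for the Athletics’ 20 game streak yields a z-score of 3.85. The z-score for the White Sox’s 21 game streak is much more tame at 1.1.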
A z-score of 1.1 means that there was about a 13% chance that the White Sox were going to have a losing streak of at least 21 this year. Even though the Sox streak was longer than the expected 15-16 games, it is not that unlikely.
On the other hand, the Athletics streak was much less likely. A z-score of 3.85 corresponds to a probability of about .01%, which is roughly a 1 in 10,000 event. Under this approximation, the probability that the 2002 Athletics went on their 20 game winning streak (or longer!) was far less than 1 in 1000.
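The z-score arithmetic can be checked with a short Python sketch. The means and standard deviations come from the table above; the helper names are assumptions for illustration, and the tail probability uses the standard identity P(Z ≥ z) = ½·erfc(z/√2) for a normal distribution.

```python
import math


def z_score(observed, mean, sd):
    """Number of standard deviations the observation sits above the mean."""
    return (observed - mean) / sd


def normal_upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# (team, observed streak, simulated mean, simulated standard deviation)
streaks = [
    ("Athletics", 20, 9.6, 2.7),
    ("White Sox", 21, 15.8, 4.8),
]

for team, observed, mean, sd in streaks:
    z = z_score(observed, mean, sd)
    p = normal_upper_tail(z)
    print(f"{team}: z = {z:.2f}, normal tail probability = {p:.6f}")
```

Keep in mind that these probabilities are only as good as the normality assumption behind them.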
This is why the 2002 Athletics streak felt so magical while the White Sox streak was much less remarkable. The Athletics’ streak was far rarer, whereas the White Sox streak was much more in line with expectations given how bad they are. This also explains why the Sox have had other streaks of 14 and 12 this season. Teams as bad as the Sox are bound to go on long losing streaks due to normal statistical variance.
More Accurate Calculations
The previous section assumed that the data was normally distributed. This assumption makes it quick and easy to generate approximate probabilities. However, the data is certainly not normally distributed. For example, a longest streak can never be negative, and it can only take whole-number values.
A much more accurate approach is to use the Monte Carlo simulations themselves to estimate how often the observed events occur. That is, simulate 162 game seasons thousands of times and count the fraction of seasons in which a winning streak of at least 20 or a losing streak of at least 21 happens.
Doing this for the White Sox reveals a 13.7% chance of occurrence. However, for the Athletics the probability is much higher than we otherwise estimated. Instead of the roughly .01% chance given by the normal approximation, it is now about a 1 in 150 event. This makes much more sense because, over the roughly 150 year history of baseball, this streak happened exactly once.
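Counting streaks directly in simulated seasons might look like the sketch below. It is a hedged illustration rather than the article's own code: the function name `prob_streak_at_least`, the seed, and the choice of 10,000 seasons are assumptions, and estimates of rare events like the Athletics streak will wobble noticeably from run to run at that sample size.

```python
import random


def longest_streak(outcomes):
    """Return the length of the longest run of True values."""
    best = current = 0
    for won in outcomes:
        current = current + 1 if won else 0
        best = max(best, current)
    return best


def prob_streak_at_least(p, length, games=162, seasons=10_000, seed=0):
    """Fraction of simulated seasons whose longest streak is >= length."""
    rng = random.Random(seed)  # hypothetical seed for repeatable runs
    hits = 0
    for _ in range(seasons):
        season = [rng.random() < p for _ in range(games)]
        if longest_streak(season) >= length:
            hits += 1
    return hits / seasons


# Athletics: wins with probability 0.64, streak of at least 20.
athletics = prob_streak_at_least(0.64, 20)
# White Sox: losses with probability 0.77, streak of at least 21.
white_sox = prob_streak_at_least(0.77, 21)

print(f"Athletics 20+ game winning streak: {athletics:.4f}")
print(f"White Sox 21+ game losing streak: {white_sox:.4f}")
```

This approach makes no assumption about the shape of the distribution, which is why it handles the long right tail of streak lengths better than the z-score shortcut.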
Another way to make the calculations more accurate is to change the probability of each coin flip to respect different win probabilities in different matchups. The 2002 Athletics won 64% of their games. However, their win probability before each game was not 64%. If they play a good team, it might be closer to a 50-50 matchup. If they play a bad team, they might be favored more heavily than 64%. We won’t simulate this effect because it is quite a bit more difficult, but we would expect this effect to moderate streak length. That is, we would expect it to make streaks shorter as more games get pushed into “tossup” range.