Expected Goals: What is xG in Soccer?
One of the most important advanced metrics in Soccer is the stat called Expected Goals, abbreviated xG. xG in Soccer is used to measure the quality of the shots a team takes. The more high-quality shots a team takes, the more goals it will score on average.
But we want to dive deeper into this metric and answer not only the what, but also the how and why of the stat. We’ll start by answering “what is xG”. Then we’ll move on to how expected goals is calculated. And finally, we’ll explore why it is an interesting stat by looking at the types of conclusions one can make.
What does xG Measure?
Unlike some other advanced metrics, xG is extremely easy to interpret. Expected goals measures how many goals a team should have scored on average with the shots they took. By combining certain factors such as shot location, distance, and the offensive and defensive positions, we can estimate how easy or hard a shot was. Then, one can compute the probability of a shot being made.
By doing this for every shot a team took, we can figure out how many goals the team should have been expected to score. The stat is all about averages. A team that gets more fast breaks should be expected to score more goals. A team that gets more PKs should be expected to score more goals. A team with a bad goalie who is constantly out of position should be expected to give up more goals.
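To make the "add it up over every shot" idea concrete, here is a minimal sketch in Python. The per-shot probabilities are made-up numbers for illustration only, and the function name team_xg is hypothetical; in practice each probability would come from an xG model.

```python
def team_xg(shot_probabilities):
    """Sum per-shot goal probabilities to get a team's expected goals."""
    return sum(shot_probabilities)


# Hypothetical match: one penalty, one close-range chance, three long shots.
# These probabilities are illustrative, not taken from any real model.
shots = [0.76, 0.30, 0.05, 0.03, 0.02]
print(round(team_xg(shots), 2))
```

Five shots here add up to a little over one expected goal, with most of it coming from the single penalty.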
At the end of the day, the point of xG is to try to take luck and random variance out of the analysis of the game. A team could get lucky and score a handful of truly improbable goals throughout the game. But this isn’t necessarily indicative of how well they played, just how lucky they got. More importantly, these lucky events are not indicative of how well a team will play in the future.
Looking at xG can help give a better idea of who should have won the game instead of who actually won the game. For the baseball fans out there, the idea is similar to Pythagorean expectation. In baseball, Pythagorean expectation tells us how many games a team should have won based on their runs scored and runs allowed. In soccer, xG tells us how many goals a team should have scored based on the difficulty and quantity of their shots taken.
Next, we’ll look at how to compute xG in soccer.
How is xG Computed?
Expected goals in Soccer uses the ideas of logistic regression to predict the probability of a goal being scored. Logistic regression (and, therefore, xG) can be thought of as a two-step process.
First, the context surrounding the shot attempt is converted into a number which estimates how difficult the shot was to make. For example, a kick from 30 feet is easier than a kick from 100 feet. A penalty kick is easier than a live-ball shot with a defender on your hip. A shot using your dominant foot is easier than a header. Each of these factors, along with others, contributes to an overall rating of shot difficulty.
After this, the shot difficulty number is converted to a probability. The process through which this is done looks at thousands of previous shot attempts and their associated difficulties, and figures out the correct conversion from difficulty to probability of a goal. In logistic regression, that conversion is the S-shaped logistic function, which is exact when the underlying shot factors are normally distributed with the same spread for goals and for misses.
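The two steps can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration: the intercept and weights are invented, not fitted to real shots, and the function names are hypothetical. Note also that the score below is oriented so that a higher number means an easier shot, which is the mirror image of a "difficulty" rating.

```python
import math

# Hypothetical weights chosen only for illustration. A real model would
# learn them from thousands of historical shots.
INTERCEPT = 0.5
WEIGHT_DISTANCE_FT = -0.04  # each extra foot from goal lowers the score
WEIGHT_HEADER = -0.8        # headers are harder than shots with the foot


def shot_score(distance_ft, is_header):
    """Step 1: turn the shot context into a single number (higher = easier)."""
    return INTERCEPT + WEIGHT_DISTANCE_FT * distance_ft + WEIGHT_HEADER * int(is_header)


def goal_probability(score):
    """Step 2: the logistic function maps any score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


near = goal_probability(shot_score(30, is_header=False))
far = goal_probability(shot_score(100, is_header=False))
print(f"30 ft: {near:.2f}, 100 ft: {far:.2f}")
```

With these made-up weights the 30-foot kick comes out far more likely to score than the 100-foot kick, matching the intuition in the text. The fitting step, which is where the thousands of historical shots come in, is deliberately left out of this sketch.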
In general, the interesting part of logistic regression problems is the first half of the problem. In this case, the interesting part of the problem is measuring shot difficulty given the context of the shot. The factors which go into determining the shot difficulty include (and are dominated by, but are not necessarily limited to):
- The distance from the goal
  - The closer the shot, the easier it is to make
- The angle the shot is taken from
  - Shots taken from the center of the pitch are easier than shots taken from the corners
- The circumstances leading to the shot
  - Rebounds from missed shots and fast breaks are easier than indirect free kicks and corners
The following graphic from the introduction summarizes how these factors lead to the computation of xG in Soccer.
In the next section, we’ll look at why expected goals is a valuable metric to consider.
Why xG instead of Goals
The concept of xG is not unique to soccer. Hockey also has a statistic which talks about expected goals. The NFL has expected points added. I’ve suggested an NBA shooting metric which uses expected points per shot. Using expected values instead of actual outcomes fundamentally reduces variance. This leads to better predictive power in our statistics. It’s like having a filter through which we can tell the difference between how good a player is and how well they happened to play. These are subtly different ideas.
Consider the following scenario. An opposing goalie is feeling ill and is playing well below his standards. A player on our team takes a few low-probability shots that happen to go in. You look at the box score and see that they scored 3 goals. They must have had a great game, right? However, it is likely that the 3 goals are more attributable to the goalie playing badly than to our player playing well. The 3 goals in the box score might overstate how good a game our guy had.
On the other hand, if the goalie is playing well, our team as a whole might not score at all! Even if we have many good scoring opportunities, a shutout will tell the story that our team played very badly. Over the course of a season, there are likely to be just as many advantageous lucky events as there are disadvantageous ones. Be careful here, though, to steer clear of the gambler’s fallacy. Just because a goalie had a good game against us last week doesn’t mean they’re more likely to have a bad game against us this week.
The long and short of it is that sometimes the outcomes of certain plays are outside of your own control. Measuring what should have happened by looking at xG in Soccer can help us better understand how well each team played.
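To put a rough number on how much luck is involved, here is a small sketch. It assumes every shot is an independent event, which is a simplification of my own and not something real matches guarantee. Under that assumption, the number of goals has mean equal to the sum of the shot probabilities (the xG) and variance equal to the sum of p(1 - p) over the shots. The shot list and the function name are hypothetical.

```python
import math


def goals_mean_and_std(shot_probabilities):
    """Mean and standard deviation of goals scored, assuming independent shots."""
    mean = sum(shot_probabilities)
    variance = sum(p * (1 - p) for p in shot_probabilities)
    return mean, math.sqrt(variance)


# Hypothetical match: twelve shots, each with a 10% chance of going in.
shots = [0.10] * 12
mean, std = goals_mean_and_std(shots)
print(f"xG = {mean:.2f}, spread of actual goals = {std:.2f}")
```

In this made-up match the spread of actual goals (about 1.04) is nearly as large as the xG itself (about 1.2), so a single final score says little about how well the team created chances.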
While expected goals is, at its core, meant to reduce the variance in goals scored and allowed, some other useful conclusions can be drawn from stats like this.
How to Use Expected Goals
xG is meant to reduce variance which results from the inconsistency and rarity of goal scoring in soccer. However, there are some subtleties to using statistics like these which look at long-term averages.
Perhaps the most notable way to apply xG in Soccer is to look at the difference between actual goals scored and expected goals scored. Remember that xG looks at historical data and averages out what has happened for similar shots in the past. This means that the probability of a goal being scored is implicitly assuming that a player of average quality took the shot!
If a player consistently scores more goals than expected over a long period of time, then this means that the player is adding some shot probability via their own skill. If a player consistently scores fewer goals than expected, then this means that they are doing something wrong. Either way, as the sample size grows, we can use the difference between xG and actual goals to determine how good a player is at turning their shot opportunities into goals.
This is an extremely valuable data point when determining how good an offensive player is. The same technique in reverse can be used to determine how good goalies are. If a goalie consistently lets in fewer goals than expected, then they are adding value to their team. If they let in more goals than expected, then they aren’t a very good goalie.
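The comparison described above is just a subtraction, as this minimal sketch shows. The shot records are invented for illustration and the function name goals_minus_xg is hypothetical. Each record pairs a shot's xG with whether it went in.

```python
def goals_minus_xg(shots):
    """Return actual goals minus expected goals for a list of (xg, scored) pairs."""
    goals = sum(1 for _, scored in shots if scored)
    expected = sum(xg for xg, _ in shots)
    return goals - expected


# Hypothetical striker: four shots, two of which went in.
striker_shots = [(0.30, True), (0.10, False), (0.05, True), (0.40, False)]
print(round(goals_minus_xg(striker_shots), 2))
```

For a striker, a positive number means more goals than an average finisher would be expected to score from the same chances. Feeding in the shots a goalie faced turns the same number into goals conceded minus expected goals conceded, so for a goalie a negative value is the good one. As the text stresses, this only means something over a large sample; four shots is far too few.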
However, one needs to be careful in performing the above analyses. The expected goals formula explicitly takes into account the positioning of the goalie. Therefore, if a goalie is out of position, they will not be penalized much for letting a goal in, because being out of position made it a harder save. But being out of position was their fault to begin with, so they should be punished statistically.
Therefore, to really nail this analysis, I would actually support developing more expected goals metrics which are independent of a specific player’s impact. For example, in order to measure a goalie’s quality, we should compute the probability of a shot being made without taking into account the goalie’s positioning. This way, the goalie is rewarded for good positioning and punished for bad positioning.
A Caution When Using xG in Soccer
A second issue with using expected goals is that the way teams play can change based on the context of the game. If a team takes a quick lead in a game, the chances are that they’ll play more conservatively for the remainder of the game. Instead of pressing aggressively on offense, they’ll be more likely to hang back on defense.
This is not unlike American football, where teams are willing to allow more yards and easier plays to help run out the clock at the end of games. It is also like the NBA or college basketball, where a team pulls its starters when it has a big lead. We’re talking about garbage time.
Expected goals can be skewed by garbage time. One team can appear to have had a better game than it actually did if this happens. More importantly, certain players who are likely to see more chances in garbage time can have skewed xG values.
At the end of the day though, expected goals is a great statistic and a good place for advanced soccer analytics to start.