My First Foray into Golf Analytics: 2020 Masters Predictions

In honor of the upcoming event, I wanted to dip my toe into the world of golf analytics and deliver some 2020 Masters predictions. Golf is actually really well studied and already has a heavy dose of analytics baked in. Golfers themselves are very open to the advantage they may gain from analytics. Knowing which club to use, or whether to take an aggressive approach, can lead to real, demonstrable improvements in a player’s game.

I am definitely going to be exploring the golf world more in the coming days, but this is my first attempt at working with golf data. In this article, I will use data from the 2020 season to make my 2020 Masters predictions.

Course Effects and Golf

There are a few aspects of the golf world that make analytics a really interesting problem. Overwhelmingly, though, these intricacies come down to one thing: differences between holes and courses.

Golfers on tour have vastly different profiles. Generically, golf comes down to three phases: driving, midrange/approach, and short-game/putting. The thing with golf, though, is that a golfer’s profile has a significant impact on how he performs on various courses. Let me give an extreme example.

If I were a golfer and driving was my best attribute, I wouldn’t enter into a par 3 tournament because the impact of my best skill is diminished. Similarly, if I am a great putter and a lousy driver, I would perform best in tournaments that maximize the amount of time I spend with a putter in my hand.

If we want to use these ideas in our analysis (and believe me, we do) we need to be able to determine:

  • Which golfers are good at what phases of the game, and
  • On a given course, who (which type of golfer) may be expected to benefit the most.

This is certainly something we will be exploring and explaining in more detail in the days to come.

In this introductory article, I wanted to make two attempts at creating a set of world golf rankings by combining all data from the 2020 season. Then, I will comment on the Masters course profile and how to combine that with my rankings in order to make 2020 Masters predictions.

Methodology: Simple Model

The first, simplest attempt at ranking golfers is simply a way to translate all the results from this year into a single number that represents how good the golfer has been relative to the field this year on average. There are two main difficulties in ranking golfers. First, not every golfer plays in every tournament so we don’t always have as much comparative data as we would like. Second, because courses play at so many different difficulty levels, it can be really hard to get an apples-to-apples comparison between performances at different courses. Sometimes -2 wins the tournament, sometimes -2 misses the cut.

What did we do? We looked inside each tournament and turned each player’s raw score into a normalized version that measures how well they did relative to the rest of the field. Specifically, given the list of scores relative to par, we compute each golfer’s z-score. If \mu is the mean score in the tournament, \sigma is the standard deviation of the scores, and x is a golfer’s score relative to par, then that golfer’s z-score is given by the formula:

z = \frac{x - \mu}{\sigma}

A z-score tells you how many standard deviations above or below the mean tournament score your performance was. The more negative a number is, the better you did for the tournament.

Z-scores normalize your performance to the difficulty of the course so that we can, at least in the simplest sense, compare outings across tournaments. Most importantly, they measure your performance relative to the rest of the field so that disproportionately difficult or easy courses don’t skew our rankings. Finally, we’ll take every golfer’s average z-score over all the tournaments they played in to see how many standard deviations above or below the field they were on average. In the next section we’ll present our golf rankings according to this technique in order to make our 2020 Masters predictions.
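To make the procedure concrete, here is a minimal sketch of the per-tournament normalization and the averaging step. The function names and the toy scores are made up for illustration, and the choice of the population standard deviation (rather than the sample version) is an assumption; the idea is the same either way.

```python
from collections import defaultdict
from statistics import mean, pstdev


def tournament_z_scores(scores):
    """Turn {golfer: score relative to par} into {golfer: z-score} for one event."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation of the field
    if sigma == 0:
        # Everyone shot the same score, so nobody beat the field.
        return {golfer: 0.0 for golfer in scores}
    return {golfer: (x - mu) / sigma for golfer, x in scores.items()}


def average_z_scores(tournaments):
    """Average each golfer's z-score over the events they actually played."""
    per_golfer = defaultdict(list)
    for scores in tournaments:
        for golfer, z in tournament_z_scores(scores).items():
            per_golfer[golfer].append(z)
    return {golfer: mean(zs) for golfer, zs in per_golfer.items()}


# Toy data (hypothetical golfers A, B, C); golfer B skips the second event.
tournaments = [
    {"A": -8, "B": -6, "C": -1},
    {"A": -2, "C": 2},
]
ratings = average_z_scores(tournaments)
```

As in the text, more negative is better, and a golfer who skips an event is simply averaged over fewer tournaments.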

2020 Golf Rankings: Lite Model

The following table shows how our model would present the 2020 world golf rankings. Remember: the numbers in the table below are ‘number of standard deviations above/below the mean tournament score averaged over all tournaments played in 2020’.

Golfer Rating
Jon Rahm -1.10
Xander Schauffele -1.09
Justin Thomas -1.06
Rory McIlroy -0.93
Dustin Johnson -0.90
Patrick Cantlay -0.89
Bryson DeChambeau -0.86
Webb Simpson -0.83
Daniel Berger -0.83
Patrick Reed -0.83
Harris English -0.80
Wesley Bryan -0.72
Tony Finau -0.71
Tyrrell Hatton -0.60
Adam Scott -0.58
Viktor Hovland -0.56
Collin Morikawa -0.56
Russell Henley -0.56
Hideki Matsuyama -0.54
Lee Westwood -0.53
Cameron Tringale -0.53
Brian Harman -0.53
Sungjae Im -0.50
Matthew Fitzpatrick -0.48
Scottie Scheffler -0.48
Doc Redman -0.48
Stewart Cink -0.45
Abraham Ancer -0.43
Ian Poulter -0.43
Lanto Griffin -0.42
Louis Oosthuizen -0.40
Joel Dahmen -0.40
Denny McCarthy -0.39
Bud Cauley -0.38
Sebastian Munoz -0.37
Harold Varner III -0.37
Matthew Wolff -0.36
Zach Johnson -0.33
Adam Hadwin -0.32
Mark Hubbard -0.32
Talor Gooch -0.31
Bubba Watson -0.31
Alex Noren -0.31
Sergio Garcia -0.30
Adam Schenk -0.30
Ryan Palmer -0.29
Patrick Rodgers -0.29
Sam Burns -0.27
Will Gordon -0.26
Maverick McNealy -0.26
Joaquin Niemann -0.26
John Huh -0.25
Scott Stallings -0.25
Cameron Davis -0.25
Chez Reavie -0.25
J.T. Poston -0.24
Paul Casey -0.23
Si Woo Kim -0.23
Chesson Hadley -0.22
Gary Woodland -0.22
Charley Hoffman -0.21
Brandon Wu -0.20
Matt Kuchar -0.19
Martin Laird -0.19
Billy Horschel -0.18
Richy Werenski -0.18
Dylan Frittelli -0.17
Cameron Champ -0.16
Henrik Norlander -0.16
Scott Piercy -0.16
Brian Stuard -0.15
Tom Hoge -0.15
Adam Long -0.15
Christiaan Bezuidenhout -0.15
Brendon Todd -0.14
Michael Gligic -0.13
Brooks Koepka -0.12
Kevin Kisner -0.11
Shane Lowry -0.11
Carlos Ortiz -0.10
Rory Sabbatini -0.09
Jason Kokrak -0.09
Luke List -0.08
Harry Higgs -0.08
Matthew NeSmith -0.08
Alex Cejka -0.07
Kevin Na -0.07
Jordan Spieth -0.07
Brendan Steele -0.06
Rhein Gibson -0.06
Cameron Smith -0.06
Austin Cook -0.06
Rickie Fowler -0.06
J.B. Holmes -0.06
Kevin Chappell -0.05
Erik Van Rooyen -0.05
Troy Merritt -0.05
Corey Conners -0.04
Keegan Bradley -0.04
Jhonattan Vegas -0.03
Byeong Hun An -0.03
Brice Garnett -0.03
Mackenzie Hughes -0.03
Brandt Snedeker -0.03
Cameron Percy -0.03
Jason Dufner -0.03
Peter Uihlein -0.03
Ryan Moore -0.03
Jim Furyk -0.03
Beau Hossler -0.02
Charles Howell III -0.01
Xinjun Zhang -0.01
Tommy Fleetwood -0.01
James Hahn -0.00
Matt Jones -0.00
Matt Wallace 0.00
Kevin Streelman 0.01
Robby Shelton 0.01
Tyler Duncan 0.02
Max Homa 0.02
Bronson Burgoon 0.03
Hank Lebioda 0.04
Nick Watney 0.04
Lucas Glover 0.05
Ricky Barnes 0.05
Vincent Whaley 0.05
Pat Perez 0.06
Charl Schwartzel 0.06
Jason Day 0.08
Tyler McCumber 0.08
Chris Kirk 0.08
Seamus Power 0.08
Sepp Straka 0.08
Chase Seiffert 0.09
Kyoung-Hoon Lee 0.10
Brandon Hagy 0.10
Justin Rose 0.11
Joseph Bramlett 0.12
Mark Anderson 0.12
George McNeill 0.12
Kristoffer Ventura 0.12
Tiger Woods 0.13
Rob Oppenheim 0.13
Scott Harrington 0.13
Tom Lewis 0.14
Emiliano Grillo 0.14
Tim Wilkinson 0.16
Chris Baker 0.16
Phil Mickelson 0.18
Ryan Brehm 0.19
Fabian Gomez 0.19
Nick Taylor 0.19
Luke Donald 0.20
Andrew Landry 0.20
Peter Malnati 0.20
Vaughn Taylor 0.20
Kurt Kitayama 0.20
Bernd Wiesberger 0.20
Danny Lee 0.20
Wyndham Clark 0.20
Bo Hoag 0.21
David Hearn 0.21
Zac Blair 0.21
Keith Mitchell 0.21
Patton Kizzire 0.23
Tommy Gainey 0.23
J.J. Spaun 0.24
Steve Stricker 0.24
Bill Haas 0.24
Ryan Armour 0.25
Rafael Campos 0.25
Shawn Stefani 0.25
Anirban Lahiri 0.26
Grayson Murray 0.26
D.J. Trahan 0.26
Zack Sucher 0.26
Hudson Swafford 0.26
Russell Knox 0.27
Sung Kang 0.27
Aaron Baddeley 0.27
Roberto Castro 0.28
Kyle Stanley 0.29
Sam Ryder 0.29
Wes Roach 0.30
Ben Martin 0.31
Jonathan Byrd 0.32
Kramer Hickok 0.32
Ben Crane 0.33
Justin Suh 0.33
Brian Gay 0.33
Nate Lashley 0.34
Camilo Villegas 0.34
Dominic Bozzelli 0.34
Scott Brown 0.34
Matthias Schwab 0.34
Doug Ghim 0.35
Jamie Lovemark 0.35
C.T. Pan 0.35
Michael Gellerman 0.36
Andrew Putnam 0.37
Aaron Wise 0.37
Roger Sloan 0.38
Marc Leishman 0.38
Sebastian Cappelen 0.40
Boo Weekley 0.41
Michael Thompson 0.41
Jimmy Walker 0.41
Ben Taylor 0.42
Seung-Yul Noh 0.42
Chris Stroud 0.42
Josh Teater 0.43
Robert Garrigus 0.43
Danny Willett 0.45
Matt Every 0.45
Jim Herman 0.46
Morgan Hoffmann 0.46
Lucas Herbert 0.46
Branden Grace 0.49
Johnson Wagner 0.51
Sean O'Hair 0.52
Arjun Atwal 0.52
John Merrick 0.53
Chad Campbell 0.55
Robert Streb 0.58
Kevin Tway 0.58
Davis Love III 0.60
Francesco Molinari 0.61
Tim Herron 0.61
Graham DeLaet 0.62
Graeme McDowell 0.64
K.J. Choi 0.66
David Lingmerth 0.69
Victor Perez 0.69
J.J. Henry 0.71
Ryan Blaum 0.72
Nelson Ledesma 0.73
Satoshi Kodaira 0.73
Vince Covello 0.73
Ted Potter Jr. 0.74
Lucas Bjerregaard 0.74
Hunter Mahan 0.75
Akshay Bhatia 0.75
Vijay Singh 0.79
John Senden 0.79
Greg Chalmers 0.81
Henrik Stenson 0.82
Shaun Norris 0.85
Sahith Theegala 0.87
Peter Kuest 0.88
Parker McLachlin 0.88
Michael Kim 0.90
Sang-Moon Bae 0.90
D.A. Points 0.90
Kiradech Aphibarnrat 0.91
Kevin Stadler 0.92
Bo Van Pelt 0.93
Andy Ogletree 0.93
Derek Ernst 0.93
Martin Trainer 0.95
Rafa Cabrera Bello 0.95
Jazz Janewattananond 0.98
Ryo Ishikawa 1.05

Now, as discussed above, because different courses favor different players, these rankings are not necessarily my 2020 Masters predictions. Rather, they are a tool for you all to use in making your own predictions. Making your own predictions includes favoring certain players over others because the course at Augusta National may suit their play style more than someone else’s.

The correct interpretation of these rankings is ‘which golfers have performed the best so far in 2020 on an average PGA course’.

Methodology: Advanced Model

One of the biggest drawbacks of my previous model is that every tournament is weighted equally. If a star has a good outing in a tournament with a low quality field, his performance will look better ‘relative to the average’ than if all the big names had entered that tournament.

The second 2020 world golf rankings model I have developed here is based heavily on ideas I have used in the past (for NBA team rankings, for fantasy defense rankings, and for college football rankings). The only necessary perspective we need is to treat a golf tournament as a collection of ‘games’.

Let me explain what I mean.

Usually golf results are presented as a leaderboard. However, if Bryson DeChambeau finishes -8, Rory McIlroy finishes -6, and Jon Rahm finishes -3, we could interpret these outcomes as ‘game scores’. Bryson finished 2 strokes better than Rory, so he beat Rory by 2. In the same tournament, Bryson beat Rahm by 5 and Rory beat Rahm by 3. We can interpret each of these intra-tournament matchups as an individual game, which lets us use the tools of differential least squares to rank the players.
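The leaderboard-to-games idea can be sketched in a few lines. This is only an illustration: the function name and the tuple layout (winner-side golfer, other golfer, margin) are my own choices, and the scores are the ones from the example above.

```python
from itertools import combinations


def pairwise_games(leaderboard):
    """Expand {golfer: score relative to par} into one 'game' per pair of golfers.

    Each game is (golfer_1, golfer_2, margin), where a positive margin means
    golfer_1 finished that many strokes better (lower) than golfer_2.
    """
    games = []
    for (p1, s1), (p2, s2) in combinations(leaderboard.items(), 2):
        games.append((p1, p2, s2 - s1))  # lower score wins, so margin = s2 - s1
    return games


leaderboard = {"DeChambeau": -8, "McIlroy": -6, "Rahm": -3}
games = pairwise_games(leaderboard)
# games contains DeChambeau over McIlroy by 2, DeChambeau over Rahm by 5,
# and McIlroy over Rahm by 3.
```

Note that a field of n golfers produces n(n-1)/2 games, so a full-field event generates thousands of pairwise results.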

Differential least squares works like this. Each player on the tour is assigned a rating. On their own these ratings are meaningless. However, when we compare two players’ ratings we get a measure of how much better one golfer is than another. For instance, suppose Jon Rahm is an 8 and Rory is a 7. That means that, on a league-average course and over a long period of time, Rahm is about 1 stroke better than Rory. The 8 and the 7 don’t mean anything in particular, but their difference predicts a stroke difference. Hence ‘differential’.

How do we compute these ratings? Well, we measure performance over previously played tournaments. The ratings are assigned to best match the data we have previously observed (where best match is in a least squares sense, hence differential least squares). If Rahm is an 8 and Rory is a 7, that means that over the entire data set we have taken in, Rahm has been about 1 stroke better than Rory on average. It is reasonable to expect that past performance is a good barometer of future performance.
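Here is a rough sketch of how such a fit could be set up, assuming NumPy is available. It is not my exact implementation: the function name is hypothetical, and I am assuming each game is stored as (golfer_1, golfer_2, margin) with a positive margin meaning golfer_1 was better. Each game contributes one equation, rating_1 - rating_2 ≈ margin, and the ratings are chosen to minimize the sum of squared errors over all games.

```python
import numpy as np


def fit_ratings(games):
    """Least-squares ratings from a list of (golfer_1, golfer_2, margin) games."""
    players = sorted({p for game in games for p in game[:2]})
    index = {p: i for i, p in enumerate(players)}

    # One row per game: +1 for golfer_1, -1 for golfer_2, target = margin.
    A = np.zeros((len(games), len(players)))
    b = np.zeros(len(games))
    for row, (p1, p2, margin) in enumerate(games):
        A[row, index[p1]] = 1.0
        A[row, index[p2]] = -1.0
        b[row] = margin

    # Ratings are only defined up to an additive constant; lstsq returns the
    # minimum-norm solution, which centers the ratings around zero.
    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    return {p: float(r) for p, r in zip(players, solution)}


games = [
    ("DeChambeau", "McIlroy", 2),
    ("DeChambeau", "Rahm", 5),
    ("McIlroy", "Rahm", 3),
]
ratings = fit_ratings(games)
```

With a single tournament the fit reproduces the margins exactly. Once many tournaments are stacked together the margins will conflict with each other, and the least-squares solution becomes the best compromise. Because only differences matter, adding any constant to every rating gives an equally good fit.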

One subtlety we need to mention: if a player doesn’t make the cut, they are essentially considered a DNF and are not awarded a final-round score. However, missing the cut is a definitively negative event. Therefore, any player who misses the cut in an event is given the equivalent of 1 + the worst round score of the event over the four rounds of the tournament.
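As a sketch of how this penalty might be applied, here is one possible reading of the rule, stated as an assumption: a golfer who misses the cut is scored as if each of the four rounds were one stroke worse than the worst single round anyone shot in the event. The function name, the data layout, and the toy scores are all hypothetical.

```python
def impute_missed_cut(round_scores):
    """Return {golfer: tournament total relative to par} with a missed-cut penalty.

    round_scores maps each golfer to a list of round scores relative to par.
    Golfers with four rounds made the cut; anyone with fewer rounds is
    treated as having missed it.
    """
    worst_round = max(s for rounds in round_scores.values() for s in rounds)
    penalty_round = worst_round + 1  # assumption: 1 stroke worse than the worst round

    totals = {}
    for golfer, rounds in round_scores.items():
        if len(rounds) == 4:
            totals[golfer] = sum(rounds)
        else:
            # Assumed reading: the penalty round is applied to all four rounds.
            totals[golfer] = 4 * penalty_round
    return totals


# Toy data: golfer C shoots +4 and +5 and misses the cut.
round_scores = {
    "A": [-2, -1, 0, -3],
    "B": [1, 2, 3, 0],
    "C": [4, 5],
}
totals = impute_missed_cut(round_scores)
```

Under this reading, a missed cut is guaranteed to rank behind every golfer who completed four rounds, which matches the intent that missing the cut is strictly a negative result.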

2020 Golf Rankings: Advanced Model

I used the above technique to rank the golfers in the world based only on their 2020 performances so far. That is, the input to this model is all tournaments played in the 2020 calendar year. Here are my 2020 world golf rankings which we will use to make our 2020 Masters predictions.

Golfer Rating
Jon Rahm 13.35
Webb Simpson 13.15
Bryson DeChambeau 12.52
Patrick Cantlay 12.42
Rory McIlroy 12.41
Xander Schauffele 12.37
Justin Thomas 12.09
Daniel Berger 11.95
Dustin Johnson 10.71
Tony Finau 10.51
Tyrrell Hatton 10.38
Viktor Hovland 10.29
Harris English 10.22
Patrick Reed 10.08
Wesley Bryan 9.88
Ian Poulter 9.31
Adam Scott 9.19
Collin Morikawa 8.69
Sungjae Im 8.38
Abraham Ancer 7.93
Adam Hadwin 7.52
Hideki Matsuyama 7.44
Brian Harman 7.29
Alex Noren 7.26
Lee Westwood 7.12
Matthew Fitzpatrick 7.03
Matthew Wolff 6.79
Sergio Garcia 6.73
Doc Redman 6.63
Louis Oosthuizen 6.58
Cameron Tringale 6.42
Russell Henley 6.01
Lanto Griffin 5.96
Scottie Scheffler 5.87
Joel Dahmen 5.85
Ryan Palmer 5.81
Paul Casey 5.78
Kevin Na 5.76
Matt Kuchar 5.66
Denny McCarthy 5.59
Gary Woodland 5.58
Mark Hubbard 5.39
Stewart Cink 5.09
Joaquin Niemann 5.09
Tiger Woods 4.95
Talor Gooch 4.89
Cameron Champ 4.75
Bernd Wiesberger 4.70
Adam Schenk 4.52
Sebastian Munoz 4.48
Harold Varner III 4.46
J.T. Poston 4.45
Billy Horschel 4.42
Adam Long 4.40
Bud Cauley 4.39
Zach Johnson 4.38
Maverick McNealy 4.33
Scott Stallings 4.28
Sam Burns 4.05
Si Woo Kim 3.92
John Huh 3.83
Richy Werenski 3.77
Jordan Spieth 3.70
Dylan Frittelli 3.68
Brendan Steele 3.64
Chez Reavie 3.60
Christiaan Bezuidenhout 3.59
Tom Hoge 3.52
Kevin Kisner 3.45
Jason Day 3.38
Brandon Wu 3.34
Brendon Todd 3.30
Shane Lowry 3.28
Henrik Norlander 3.16
Charley Hoffman 3.10
Rickie Fowler 3.04
Cameron Smith 3.03
Bubba Watson 3.01
Corey Conners 3.00
Cameron Davis 2.98
Keegan Bradley 2.98
Tommy Fleetwood 2.97
Matt Wallace 2.89
Troy Merritt 2.87
Will Gordon 2.86
Chesson Hadley 2.85
Brian Stuard 2.78
Rory Sabbatini 2.66
Patrick Rodgers 2.63
Max Homa 2.54
Scott Piercy 2.49
Ryan Moore 2.42
Brandt Snedeker 2.34
Matt Jones 2.24
Martin Laird 2.14
Carlos Ortiz 2.12
Matthew NeSmith 2.11
J.B. Holmes 2.03
Marc Leishman 2.03
Tyler Duncan 1.98
Charles Howell III 1.67
Kevin Streelman 1.56
Jason Kokrak 1.53
Michael Gligic 1.53
Mackenzie Hughes 1.32
Phil Mickelson 1.32
Jason Dufner 1.24
Luke List 1.15
Kevin Chappell 1.04
Nick Taylor 1.01
Lucas Glover 1.01
Alex Cejka 0.97
Jim Furyk 0.90
Harry Higgs 0.88
Byeong Hun An 0.83
Charl Schwartzel 0.81
Pat Perez 0.76
Brooks Koepka 0.66
Justin Rose 0.61
Bronson Burgoon 0.44
Robby Shelton 0.42
Sepp Straka 0.41
Hank Lebioda 0.40
Austin Cook 0.39
Andrew Landry 0.30
Erik Van Rooyen 0.29
Jhonattan Vegas 0.23
Emiliano Grillo 0.16
Sung Kang 0.13
Peter Uihlein 0.12
Lucas Herbert 0.09
Chris Kirk 0.06
Xinjun Zhang -0.02
Beau Hossler -0.08
Danny Lee -0.17
Kyoung-Hoon Lee -0.21
Matthias Schwab -0.25
Brice Garnett -0.33
Kurt Kitayama -0.50
Tyler McCumber -0.50
Chase Seiffert -0.53
Vincent Whaley -0.69
Tom Lewis -0.77
Nick Watney -0.82
Seamus Power -0.96
Andrew Putnam -0.97
Tim Wilkinson -1.11
Michael Thompson -1.25
Danny Willett -1.30
Steve Stricker -1.33
Rhein Gibson -1.40
James Hahn -1.45
Joseph Bramlett -1.56
C.T. Pan -1.58
Keith Mitchell -1.67
Chris Baker -1.69
Patton Kizzire -1.71
George McNeill -1.73
Kristoffer Ventura -1.76
Brandon Hagy -1.76
Sam Ryder -1.83
Peter Malnati -1.83
Zac Blair -1.85
Vaughn Taylor -1.88
Wyndham Clark -2.02
Scott Harrington -2.06
Ryan Armour -2.07
Cameron Percy -2.07
Camilo Villegas -2.12
Fabian Gomez -2.21
Luke Donald -2.22
Russell Knox -2.38
Victor Perez -2.38
Rob Oppenheim -2.39
Mark Anderson -2.48
Jim Herman -2.58
Ryan Brehm -2.59
Bo Hoag -2.60
Aaron Baddeley -2.62
Kyle Stanley -2.68
Brian Gay -2.68
Ben Martin -2.91
Grayson Murray -2.93
Nate Lashley -2.97
Ricky Barnes -3.12
Shaun Norris -3.13
Bill Haas -3.22
J.J. Spaun -3.41
David Hearn -3.42
Anirban Lahiri -3.44
Wes Roach -3.46
D.J. Trahan -3.51
Graeme McDowell -3.58
Tommy Gainey -3.66
Jimmy Walker -3.71
Roger Sloan -3.75
Josh Teater -3.81
Zack Sucher -3.82
Doug Ghim -3.87
Matt Every -3.93
Rafael Campos -3.96
Chris Stroud -3.98
Seung-Yul Noh -4.01
Scott Brown -4.08
Justin Suh -4.08
Hudson Swafford -4.10
Aaron Wise -4.12
Michael Gellerman -4.15
Shawn Stefani -4.23
Kramer Hickok -4.30
Sebastian Cappelen -4.57
Roberto Castro -4.64
Jonathan Byrd -4.66
Sean O'Hair -4.73
Jamie Lovemark -4.82
Kevin Tway -4.92
Ryo Ishikawa -4.95
Dominic Bozzelli -5.08
Francesco Molinari -5.11
Branden Grace -5.30
Robert Garrigus -5.31
Davis Love III -5.34
Ben Crane -5.35
Boo Weekley -5.45
Jazz Janewattananond -5.71
Ben Taylor -5.72
Arjun Atwal -5.73
John Merrick -5.91
Morgan Hoffmann -6.02
Andy Ogletree -6.19
Johnson Wagner -6.35
K.J. Choi -6.56
Chad Campbell -6.72
Robert Streb -7.32
Vijay Singh -7.64
Graham DeLaet -7.64
Henrik Stenson -7.91
J.J. Henry -8.08
Ted Potter Jr. -8.08
David Lingmerth -8.28
Satoshi Kodaira -8.29
Vince Covello -8.57
Tim Herron -8.59
Rafa Cabrera Bello -8.78
Lucas Bjerregaard -8.82
Kiradech Aphibarnrat -9.17
John Senden -9.17
Hunter Mahan -9.46
Peter Kuest -9.51
Ryan Blaum -9.52
Nelson Ledesma -9.73
Greg Chalmers -9.77
Akshay Bhatia -9.91
Martin Trainer -9.92
D.A. Points -10.51
Sang-Moon Bae -10.52
Michael Kim -10.57
Kevin Stadler -10.73
Bo Van Pelt -10.79
Parker McLachlin -10.92
Sahith Theegala -11.52
Derek Ernst -11.82

Commentary and 2020 Masters Predictions

There are two things to remember about the above table. First, the data only takes into account tournaments played in 2020. Because 2020 has been such a weird year, many of the best players have held back and not played their normal slate of tournaments. This may cause some golfers to be ranked lower than they otherwise might be.

Second, the above rankings are best interpreted as ‘on a neutral course’. As we discussed at length above, some course designs, layouts, and profiles will benefit different styles of golfers. So, while golfer A may be better than golfer B in general, a particular course may favor golfer B enough so that they are actually favored to beat golfer A.

We need to combine these two ideas in order to make our 2020 Masters predictions. The Masters tends to favor long-ball hitters. So, we would like to identify the best long ball hitters, combine that with our rankings for the best overall players, and use that to identify contenders.

The idea of ‘strokes gained’ (which we will certainly be returning to in the future) measures how much of an advantage one golfer gains relative to tour average on a particular shot. A nice resource is found on the PGA website explaining these ideas. In particular, we would like to identify those golfers who are strongest off the tee.

By this measure, the three best drivers are DeChambeau, Rahm, and McIlroy. If you go further down the list (not even that far!) you find other recent Masters winners, such as Sergio Garcia and Bubba Watson. It seems pretty clear that there is a strong correlation between crushing the ball and winning the Masters. Finding the long-ball hitters should certainly be a consideration when making 2020 Masters predictions.
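One simple way to combine the two lists is to shortlist golfers who appear near the top of both. The sketch below is just an illustration of that idea, not a formula I fit to data: the function name, the cutoff k, and the placeholder golfers P1 through P5 with their numbers are all made up.

```python
def shortlist(overall_rating, driving_sg, k):
    """Golfers in the top k by overall rating AND the top k by strokes gained off the tee.

    Both inputs map golfer -> value, where higher is better. The result is
    ordered by overall rating, best first.
    """
    top_overall = set(sorted(overall_rating, key=overall_rating.get, reverse=True)[:k])
    top_driving = set(sorted(driving_sg, key=driving_sg.get, reverse=True)[:k])
    return sorted(top_overall & top_driving, key=overall_rating.get, reverse=True)


# Placeholder inputs (hypothetical golfers and values, for illustration only).
overall_rating = {"P1": 13.0, "P2": 12.5, "P3": 11.0, "P4": 9.0, "P5": 4.0}
driving_sg = {"P1": 0.9, "P2": 0.1, "P3": 1.1, "P4": 0.8, "P5": 1.0}
contenders = shortlist(overall_rating, driving_sg, k=3)
```

A hard cutoff like this is crude. A weighted blend of the two numbers would be a natural refinement, but choosing the weights would require modeling how much Augusta actually rewards distance.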

My Masters Team

If I had to pick 5 guys for my Masters team, I would start with DeChambeau, Rahm, and McIlroy. They are top 5 in the world overall and the top 3 in strokes gained off the tee. By virtue of this being the Masters, I think these guys are virtual locks.

With my fourth pick, I am going to have to go with the consensus guy: Dustin Johnson. He is the 11th best long-ball hitter so far this year as well as the 9th best guy on the tour according to my ratings.

For my last pick, I am going with Webb Simpson. My rankings have him as the 2nd best guy this year, and I can’t ignore that. He does OK with the long ball, about 0.2 strokes better than league average, so the Masters should at least be a reasonably strong course for him.

So there are my 2020 Masters predictions: DeChambeau, Rahm, McIlroy, Dustin Johnson, and Webb Simpson.

Acknowledgments

Thanks to my good friend, Nate Tenpas, for teaching me what I need to know about golf to put this together. In addition to being a fellow mathematician, Nate was a collegiate golfer, so he has been a great help to me.
