The Secretary Problem and Sports Contracts
How could something called “the secretary problem” be at all useful in sports analytics?
Pretend you’re Aaron Judge this offseason. You’re pretty sure that all 30 teams are going to make you an offer to come play for them. One-by-one you listen to their offers and at the end of each meeting give the team a yes-or-no answer. Judge’s goal is to get the largest contract he can.
What strategy should Aaron Judge employ? How can he ensure he doesn’t say yes right before he hears an even better offer?
In this article we’ll teach you about the secretary problem and show how it can be used to solve Aaron’s quandary.
To receive email updates when new articles are posted, use the subscription form below!
What is the Secretary Problem and Why is it Difficult?
The secretary problem is also sometimes known as the Sultan’s dowry problem or the marriage problem. No matter the name, the secretary problem is all about picking the best option when we’re only allowed to evaluate the choices one at a time sequentially.
The traditional formulation – and the reason for the name the secretary problem – comes from the following hypothetical. A corporate executive needs to hire a secretary and receives a number of resumes. He plans to interview them one at a time and make a decision whether or not to hire them immediately following the interview.
After a candidate is rejected, they cannot be called back and offered the job. This means that it is possible to accept a suboptimal candidate because the best candidate was later in the process. It is also possible to accidentally decline the best candidate while holding out hope for someone better later down the line.
The difficulty is the incomplete information available at decision time. We know how good the current candidate is and how good all the previous candidates were. However, we don’t know how good the yet-to-be-interviewed candidates are. Therefore if we want to pick a candidate, we are giving up some information.
To put that another way, in the moment we have no way of knowing if a given candidate is the best or not.
In the Sultan’s dowry version of the problem, a potential spouse gets to marry one of the Sultan’s 100 daughters. The suitor must look at the daughters one-by-one and decide if she is the one to marry.
If you look closely enough, the Sultan’s dowry formulation and the secretary problem formulation are mathematically identical. That is, if we strip away the fluff and the storytelling, at the end of the day the underlying mathematical question is the same. The same is true for Aaron Judge’s problem.
The secretary problem simplifies to this: we observe (or flip over) a sequence of numbers one by one and stop when we think the most recently flipped number is the largest in the list.
Some Terminology for the Secretary Problem
Before going further, we need some terminology. Mathematicians often define certain objects or phenomena to streamline later discussion.
Borrowing the terminology introduced here, we refer to a “candidate” as a number that is larger than all the previously flipped numbers. Why is this an important thing to define? The goal is to find the largest number in the list. One property we know that the largest number satisfies is that it will be larger than all previously flipped numbers.
Examples make things clearer. Suppose the sequence of numbers we will be flipping contain the digits 1-10 in a random order. Remember that the person flipping the numbers doesn’t know they contain 1-10! Consider the following two orders.
- Order 1: 5, 8, 3, 4, 2, 1, 7, 10, 9, 6
- Order 2: 3, 6, 1, 4, 7, 8, 2, 9, 10, 5
In the first order, the candidates are 5, 8, and 10 because they are the only numbers in the order larger than everything previously. In the second, the candidates are 3, 6, 7, 8, 9, and 10.
Notice that if we force ourselves to stop flipping only when we reach a candidate, we have a much higher chance of finding the right number. Also notice that the person flipping the numbers has all the information available to determine whether or not a number is a candidate in the moment.
Some Strategies for the Secretary Problem
It is pretty easy to come up with strategies to solve this problem using the language of candidates. Most strategies revolve around selecting a candidate some amount of time into the process of flipping numbers. The goal of these strategies is to tell us how to prune down the list of candidates even further. Two types of strategies we might consider are:
- Pick the r^{th} candidate we encounter
- Pick the first candidate after the first r numbers are flipped over.
It turns out that the optimal strategy is to follow the second strategy! This is a mathematical fact you can prove. But what should we pick for r ? How long should we wait?
The Optimal Solution
It is not too difficult to show – though we will not do so here – that the optimal strategy for the secretary problem is as follows. You are given a list of N numbers to be flipped over one-by-one and the goal to stop when the highest card is flipped over. The best way to play this game is to flip over the first r numbers, record the largest value from among these r , and select the first number you come across that exceeds this value.
The only remaining question is “how should we select the value r ?” Wikipedia has a pretty good derivation showing how to solve this problem. (For those out there perhaps surprised by this, it is actually a common observation by professionals in a technical field that Wikipedia is a superb resource. This stands starkly contrasted to what your 10th grade English teacher told you.)
Fix the amount N of numbers to be flipped in our game. For each r (the length of time to wait before potentially sticking with a number), you can compute the probability that you win the game by selecting the highest number in the list! This probability is given by \frac{r-1}{N}\sum_{i=r}^N \frac{1}{i-1} .
Now, a little optimization experience will let you see that this formula is enough to come up with the optimal value of r!. This is simply a discrete optimization problem which can be optimized by, for example, brute force. That is, you can plug in each value r=2,\dots, N-1 and see which value gives you the highest probability of winning!
This brute force approach is fairly reasonable for small N . If you analyze the speed of this algorithm, it behaves like O(N^2) asymptotically. Personally, I wouldn’t be terribly surprised if there were a clever O(NlogN) algorithm to solve this problem.
However, what happens when N gets really large? How will you solve this problem? It turns out that the limiting behavior of this problem is fascinating.
A Fascinating Limit
What happens as the number N of possible choices grows arbitrarily large? I, for one, expected the probability of winning this game to approach 0. But, in an extremely fascinating twist, this is not the case.
First, interpret the sum \frac{r-1}{N}\sum_{i=r}^N \frac{1}{i-1} as a Riemann sum. Then, if you take the limit as N goes to infinity, the fraction satisfies \frac{r-1}{N}\sum_{i=r}^N \frac{1}{i-1} \approx \rho \int_\rho^1 \frac{1}{x}dx . Here, \rho is now the proportion r/N .
This integral is actually trivial to evaluate and using Fermat’s theorem to optimize the choice of \rho leads to a solution.
In fact, picking \rho=1/e is the optimal solution in the limit. Moreover, the probability of winning the game by picking the highest number in the list is also 1/e . This is perhaps my favorite example where the number e shows up in a completely unexpected place. (My other favorite is in studying derangements – a way of shuffling a list so nothing remains in the same place. In fact, both the derangement problem and the secretary problem deal with lists in random orders, perhaps there is a connection.)
This means that as your list gets really long, you should look at the first 35% and record the highest value. Then, the number you decide to stick with should be the first number you flip higher than anything in the first 35%.
Returning to Aaron Judge’s Problem
How should Aaron Judge choose the best contract? If all 30 teams offer him a contract in a random order and he has to say yes/no immediately, how should he proceed?
Aaron’s best bet is to listen to the first 11 offers and note the best offer from among them. He should decline the first 11 offers no matter what! Call the best offer among the first 11 the best initial offer. Then, the first offer from the remaining 19 offers that exceeds the best initial offer is what Aaron Judge should accept. Somehow, I doubt this is what Aaron will do.
To receive email updates when new articles are posted, use the subscription form below!