How Unpredictable was the 2015/16 Premier League season?
Posted 20th May 2016
With 5,000/1 shots Leicester City winning the 2015/16 Premiership, and previous Champions Chelsea finishing 10th, many have commented that this season has been one of the most unpredictable. But just how unpredictable has it been in comparison to previous seasons, and could we expect to see this sort of thing again?
One method to assess predictability (or the lack of it) uses what is called the Brier score, which measures the accuracy of probabilistic predictions. In this case, these are defined by the betting odds. With the bookmaker's margin removed, odds of 2.00, for example, imply an outcome probability of 50% (or 0.5). As explained by Dominic Cortis for Pinnaclesports.com, the Brier score per match is simply the sum of the squared differences between the predicted probabilities and the actual results, where each outcome is assigned a score of either 1 (it happened) or 0 (it didn't happen). For example, the estimated probabilities for the game between Manchester United and Bournemouth on Tuesday 17th May 2016 were 0.592 (home win), 0.236 (draw) and 0.172 (away win). With Manchester United winning the game, the Brier score is calculated as follows:
(0.592 - 1)² + (0.236 - 0)² + (0.172 - 0)² = 0.251
Evidently, for 3-way propositions like home-draw-away, the Brier score can range from 0, where an outcome we believe has a 100% probability does occur, to 2, where the same prediction fails to occur. For random propositions where home, draw and away are assumed to be equally likely, the Brier score will be 0.667 irrespective of the outcome. Dominic Cortis interprets a score of over 0.667 as representing a relative surprise.
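As a minimal sketch of the calculation described above, the Brier score for a single match can be written in a few lines of Python. The function name brier_score and the convention of indexing outcomes as 0 (home), 1 (draw) and 2 (away) are my own assumptions for illustration, not something taken from Dominic Cortis's article.

```python
def brier_score(probs, outcome_index):
    """Brier score for one match.

    probs: probabilities for (home, draw, away), summing to roughly 1.
    outcome_index: position of the result that occurred (assumed
    convention: 0 = home win, 1 = draw, 2 = away win).
    """
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(probs)
    )


# Manchester United v Bournemouth, which ended in a home win (index 0)
man_utd = brier_score([0.592, 0.236, 0.172], 0)

# Equal probabilities give the same score whatever the result
uniform = [brier_score([1 / 3, 1 / 3, 1 / 3], k) for k in range(3)]

# The two extremes: a "certain" prediction that comes off, and one that fails
best = brier_score([1.0, 0.0, 0.0], 0)
worst = brier_score([1.0, 0.0, 0.0], 2)

print(round(man_utd, 3), [round(u, 3) for u in uniform], best, worst)
```

With the three-decimal probabilities quoted for the Manchester United game, this gives a score of about 0.25. The third decimal place can differ slightly from a figure computed with unrounded probabilities.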
The table below shows the average Brier scores for teams that played in the 2015/16 Premier League. Performances which appear to have been more predictable include Manchester City and Arsenal at home and Aston Villa and Norwich away. Performances which appear to have been more of a surprise include Chelsea at home and West Ham and Leicester away.
On closer inspection, however, can we really trust this method to be measuring exactly what we have in mind with respect to unpredictability? Leicester City's average Brier score, for example, is 0.625, broadly the same as Crystal Palace's. The former were shock 5,000/1 winners, the latter finished 15th, more or less where most pundits would probably have expected them to finish. Furthermore, the average Brier score across the division is 0.620, but hypothetically setting every result to a home win lowers this to 0.512. Of course, 380 home wins would be about as surprising as anything could possibly be. Indeed, based on the fair odds of each match, the probability of this happening is in the region of 3 x 10⁻¹⁴⁷.
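The probability of every match ending in a home win is the product of the 380 individual home-win probabilities, treating the matches as independent. A product of that many small numbers is easiest to handle by adding logarithms. The sketch below shows the idea. The helper names are my own, and the flat 0.45 home-win probability is a made-up placeholder rather than the real fair odds, so the output is not meant to reproduce the figure quoted above.

```python
import math


def log10_joint_probability(probs):
    """log10 of the product of independent probabilities."""
    return sum(math.log10(p) for p in probs)


def as_scientific(log10_p):
    """Split a log10 probability into (mantissa, exponent) with 1 <= mantissa < 10."""
    exponent = math.floor(log10_p)
    mantissa = 10 ** (log10_p - exponent)
    return mantissa, exponent


# Placeholder input: 380 matches, each with an assumed home-win probability of 0.45
home_win_probs = [0.45] * 380

mantissa, exponent = as_scientific(log10_joint_probability(home_win_probs))
print(f"{mantissa:.1f} x 10^{exponent}")
```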
Another way of measuring the unpredictability of the season is to compare actual points collected by each team to their expected points total. This is something Scoreboard Journalism has been doing for the past three seasons. The method involves calculating the standard deviation in the differences between actual and predicted points across the 20 Premiership teams. If all predictions were perfect, the score would be zero. Alternatively, if predictions are made randomly then the score typically ranges from about 12 to 19, depending on the spread of the actual points. Lower scores, therefore, are a sign of greater predictability. Let's call this score the 'Unpredictability Score'.
Scoreboard Journalism asks pundits, modellers and fans to submit their predicted points totals before the start of each season. Another way we can estimate expected points totals is to use the actual match betting odds for the 380 games played. Obviously, this then becomes a retrospective exercise, since we don't know what the match odds are for all 380 games until they have been played. Furthermore, the expected points calculated by this method will represent Bayesian estimates, given that the probabilities implied by the match betting odds reflect the changing performances of teams as the season evolves. Nevertheless, this is arguably still a useful exercise.
Calculating expected points for a team in a single game is straightforward: simply multiply the probability of each outcome by the points awarded for that outcome, and sum across all possible outcomes. For example, for Manchester United v Bournemouth, where implied probabilities for home, draw and away were 0.592, 0.236 and 0.172 respectively, the expected points total for Manchester United would be (0.592 x 3) + (0.236 x 1) + (0.172 x 0) = 2.012. Similarly for Bournemouth, their expected points total for this game would be (0.172 x 3) + (0.236 x 1) + (0.592 x 0) = 0.752. We then sum expected points totals for each team's 38 games to arrive at an expected final table points total. Finally, we calculate the difference between actual and expected season points totals for each team, and then the standard deviation of those differences across the 20 teams. The figures for the last 10 seasons are shown below.
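The steps just described can be sketched in Python as follows. This is an illustration rather than the exact spreadsheet behind the figures: the function names are my own, the four-team league totals are invented purely to show the call, and I have assumed the sample standard deviation (n - 1 denominator), since the text does not say which form was used.

```python
import statistics


def expected_points(p_home, p_draw, p_away):
    """Expected points for (home team, away team): 3 for a win, 1 for a draw."""
    return 3 * p_home + p_draw, 3 * p_away + p_draw


def unpredictability_score(actual_points, expected_points_totals):
    """Standard deviation of (actual - expected) season points across teams.

    Assumption: sample standard deviation; swap in statistics.pstdev
    for the population form.
    """
    differences = [a - e for a, e in zip(actual_points, expected_points_totals)]
    return statistics.stdev(differences)


# Manchester United v Bournemouth
home_xp, away_xp = expected_points(0.592, 0.236, 0.172)
print(round(home_xp, 3), round(away_xp, 3))

# Invented season totals for a four-team league, for illustration only
actual = [80, 65, 50, 35]
expected = [72.5, 68.0, 47.5, 41.0]
print(round(unpredictability_score(actual, expected), 2))
```

In practice the expected season total for each team would be the sum of its 38 single-match values from expected_points, and the two lists would each hold 20 entries.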
Evidently by this measure, the 2015/16 season was more unpredictable than the 9 seasons which preceded it. Only 2011/12 gets close. But just how unpredictable was 2015/16? One way to find out is to consider the range of actual points minus expected points figures for all teams across the 10-season period. With 20 teams we have 200 individual values for the difference between actual and expected season points totals. These ranged from -18 to +16 with one exceptional outlier at +28, which unsurprisingly was Leicester City in 2015/16. The mean was -0.16 and the standard deviation 6.56.
In fact, these values are roughly normally distributed (71% of values are within +1 or -1 standard deviations of the mean, whilst 95% are within +2 or -2). Knowing this, we can use a Monte Carlo simulation to create a large number of random samples of 20 team scores, each one simulating a single season for which an unpredictability score can be calculated. This approach is broadly similar to the bootstrapping technique which is used to make inferences about the nature of a population of data when the sample from which it comes is small.
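A simulation along these lines can be sketched in Python. It assumes each team's difference between actual and expected points is an independent draw from a normal distribution with the mean (-0.16) and standard deviation (6.56) quoted earlier, and it again assumes the sample form of the standard deviation. The function names, the seed and the reduced number of runs are my own choices, so the output should only be expected to resemble a full simulation's figures, not match them.

```python
import math
import random


def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def simulate_unpredictability_scores(n_runs, mean=-0.16, sd=6.56,
                                     n_teams=20, seed=None):
    """Simulate n_runs seasons; return one unpredictability score per season.

    Each season draws n_teams values of (actual - expected points)
    from a normal distribution with the given mean and sd.
    """
    rng = random.Random(seed)
    return [
        sample_sd([rng.gauss(mean, sd) for _ in range(n_teams)])
        for _ in range(n_runs)
    ]


def share_above(scores, threshold):
    """Fraction of simulated scores strictly higher than threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# 20,000 runs keeps this quick; raise n_runs for a fuller simulation
scores = simulate_unpredictability_scores(n_runs=20_000, seed=2016)
print(f"mean {sum(scores) / len(scores):.2f}, "
      f"min {min(scores):.2f}, max {max(scores):.2f}")
```

Passing a season's observed unpredictability score to share_above(scores, ...) then gives the proportion of simulated seasons that were more unpredictable.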
My Monte Carlo simulation consisted of 100,000 runs, giving 100,000 unpredictability scores. These ranged from a low of 2.58 to a high of 13.32, with a mean of 6.39 and a standard deviation of 1.26. Just 0.44% of them were higher than the unpredictability score of 2015/16, telling us that this season was arguably a 1-in-200-year event. Of course, much of the surprise factor has arisen because of the performances of Leicester City (+28), Chelsea (-18) and West Ham (+15). Remove them, and the season's unpredictability score drops to 6.33. With Leicester City winning the Premiership, Chelsea finishing 10th and West Ham repeatedly beating superior teams and losing to inferior ones, we probably already knew this anyway. Nevertheless, this method has at least attempted to quantify just how surprising the season has actually been. As a final note, it's worth observing that Leicester City's outlier value of +28 (for actual minus expected points) has a return expectation of about 1-in-33,000.