Every year, three Premier League teams face the biggest punishment there is in football: relegation. It often brings huge losses in revenue and an exodus of the club's best players, who seemingly prefer Champions League football to playing away at Bolton or Brentford on a Wednesday night (not sure why).
At the beginning of the season, experts try to predict who will face relegation and who will stay in the Premier League. Often, their predictions are based on little more than opinions (which are not always well informed). But can Bayes' theorem help predict the outcome of the season before it even happens?
Bayes' theorem describes the probability that an event occurs given what is known about related factors, which are treated here as independent. It gives the conditional probability of the event, which is more informative than the initial (prior) probability. For instance, without looking at any factors, the probability that a given Premier League team is relegated is 15% (3/20). However, this is not a realistic assessment, since some teams are much more likely to be relegated than others. By analyzing various factors, we can come up with a more accurate prediction. To learn more about the theorem, see the link at the end.
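As a minimal sketch of the update described above, the Python snippet below applies Bayes' theorem to a single piece of evidence. The 3/20 prior comes from the text; the two likelihood values are made-up numbers chosen purely for illustration, and the function name `posterior` is hypothetical.

```python
def posterior(prior, p_evidence_given_event, p_evidence_given_no_event):
    """Return P(event | evidence) using Bayes' theorem."""
    # Total probability of observing the evidence at all.
    p_evidence = (p_evidence_given_event * prior
                  + p_evidence_given_no_event * (1 - prior))
    return p_evidence_given_event * prior / p_evidence


# Prior from the text: 3 relegated teams out of 20.
prior_relegation = 3 / 20

# HYPOTHETICAL likelihoods (not from the data): suppose 60% of relegated
# teams were newly promoted, versus 10% of surviving teams.
p_promoted_given_relegated = 0.60
p_promoted_given_survived = 0.10

updated = posterior(prior_relegation,
                    p_promoted_given_relegated,
                    p_promoted_given_survived)
print(f"prior: {prior_relegation:.2f}, posterior: {updated:.2f}")
```

With these invented inputs, the 15% prior rises to roughly 51%, which shows how a single informative factor can move the estimate well away from the league-wide base rate.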
In our case, three factors were chosen: the transfer expenses (M$) during the summer, the position in the table in the previous season, and the number of British players in the team. For each factor, the data was grouped into categories, since this approach works with categorical data.
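The grouping step might look something like the sketch below. The article does not give the actual cut-offs, so every threshold and category label here is a hypothetical placeholder, as are the function names.

```python
def spend_category(spend_millions):
    """Map summer transfer spend (in millions) to a coarse label."""
    # HYPOTHETICAL cut-offs, chosen only for illustration.
    if spend_millions < 20:
        return "low"
    if spend_millions < 50:
        return "medium"
    return "high"


def position_category(previous_position):
    """Map last season's finish to a label; None means newly promoted."""
    if previous_position is None:
        return "promoted"
    # HYPOTHETICAL split of the 11th-17th range analyzed in the article.
    return "11-14" if previous_position <= 14 else "15-17"


def british_category(n_british):
    """Map the number of British players in the squad to a label."""
    # HYPOTHETICAL cut-offs.
    if n_british < 8:
        return "few"
    if n_british < 13:
        return "some"
    return "many"


# A made-up team: 35M spent, finished 16th, 14 British players.
example = (spend_category(35), position_category(16), british_category(14))
print(example)
```

Turning each number into a label means every team can be described by three categories, which is what makes it possible to count relegations and survivals per category later on.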
The transfer expenses factor was chosen because it represents how much investment has been made by a team to improve the squad.
The position in the standings in the previous season was chosen because, in theory, teams that were bad the year before should be bad again. For this factor, only the bottom-half teams of the table (11th to 17th and the three newly promoted teams) were analyzed, given that they are the ones actually threatened by relegation (the chances of a top-six team being relegated are practically nil).
The number of British players was chosen because, in the Premier League, having too many British players suggests a lack of investment overseas, which is rarely good for results. Also, British players tend to be overvalued (or I should say, always are), likely for marketing reasons.
From the 2014-2015 season to the 2017-2018 season, the three newly promoted teams along with the bottom-half teams of the previous year (10 teams per year) were analyzed, for a total of 40 data points (12 relegations, 28 survivals).
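One consequence of restricting the sample this way is worth spelling out: within these 40 observations the base rate of relegation is 12/40, not the league-wide 3/20. The short Python check below works through that arithmetic using only the counts quoted above. Whether the article's own calculation used this sample prior is an assumption, not something the text states.

```python
seasons = 4             # 2014-2015 through 2017-2018
teams_per_season = 10   # teams finishing 11th-17th plus 3 promoted teams
relegated = 12          # relegations counted in the article

observations = seasons * teams_per_season
survived = observations - relegated

sample_prior = relegated / observations   # base rate inside the sample
league_prior = 3 / 20                     # base rate across all 20 teams

print(f"observations: {observations}, survived: {survived}")
print(f"sample prior: {sample_prior:.0%}, league prior: {league_prior:.0%}")
```

In other words, a team in this group starts with a 30% chance of relegation before any factor is considered, twice the 15% figure that applies to the league as a whole.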
We see that spending more money has not been a guarantee of success. Teams that spend too much might be investing poorly and bringing in too many new players, which disrupts team chemistry. We also see that teams that barely survived relegation the year before, as well as newly promoted teams, tend to find themselves in the heart of the battle the following year. Finally, having more British players in the squad at the start of the season seems to go with a higher chance of relegation: the teams with the most British players were relegated twice as often, for the same number of observations.
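The article does not spell out exactly how the three factors are combined, but one plausible reading is a naive Bayes calculation, which multiplies the per-factor likelihoods as if the factors were independent. The sketch below illustrates that idea in Python. The class totals (12 and 28) come from the text; all per-category counts, category labels, and the function name `relegation_probability` are hypothetical placeholders, not the article's data.

```python
# Class totals from the article: 12 relegations, 28 survivals.
N_RELEGATED, N_SURVIVED = 12, 28

# HYPOTHETICAL counts per category as (relegated, survived); each factor's
# counts sum to (12, 28). These are placeholders, not the article's data.
COUNTS = {
    "spend":    {"low": (4, 10), "medium": (3, 10), "high": (5, 8)},
    "position": {"11-14": (2, 14), "15-17": (4, 8), "promoted": (6, 6)},
    "british":  {"few": (2, 10), "some": (4, 11), "many": (6, 7)},
}


def relegation_probability(team):
    """Naive Bayes posterior P(relegated | three factor categories)."""
    # Start from the class priors within the 40-team sample.
    score_rel = N_RELEGATED / (N_RELEGATED + N_SURVIVED)
    score_sur = N_SURVIVED / (N_RELEGATED + N_SURVIVED)
    for factor, category in team.items():
        rel, sur = COUNTS[factor][category]
        # Multiply by P(category | class), treating factors as independent.
        score_rel *= rel / N_RELEGATED
        score_sur *= sur / N_SURVIVED
    # Normalize so the two outcomes sum to 1.
    return score_rel / (score_rel + score_sur)


risky = {"spend": "high", "position": "promoted", "british": "many"}
safe = {"spend": "medium", "position": "11-14", "british": "few"}
print(f"risky profile: {relegation_probability(risky):.2f}")
print(f"safe profile:  {relegation_probability(safe):.2f}")
```

With only 40 observations, some categories can end up with very few relegations, so estimates like these are sensitive to small changes in the counts and should be read as rough indications.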
Here are, for each team, the chances of relegation that Bayes' theorem would have predicted at the beginning of the season, based on the three factors used.
As of April 23rd, Huddersfield and Fulham have already been relegated, while Cardiff seem destined to head back to the Championship with them. These three teams were among the five likeliest to be relegated according to our factors and Bayes' theorem. The biggest surprise is Wolves, who are fighting for a place in the top half of the Premier League and clearly exceeding expectations!
Link to the Bayes Theorem explanation