A Python data project using H2O and decision trees to provide insights into the English Premier League and to predict the 2024 Premier League table from the previous 30 years of data, using the standard league table variables.
The Premier League is the top level of the English football league system. Contested by 20 clubs, it operates on a system of promotion and relegation with the English Football League. Seasons typically run from August to May, with each team playing 38 matches. Fifty clubs have competed since the inception of the Premier League in 1992: forty-eight English and two Welsh clubs. Seven of them have won the title: Manchester United (13), Manchester City (6), Chelsea (5), Arsenal (3), Blackburn Rovers (1), Leicester City (1) and Liverpool (1).
The primary dataset used for this project is "premier-league-tables.csv", which contains the final tables for the past 30 English Premier League seasons. As well as the usual statistics found in football league tables, the dataset includes a notes column covering topics such as relegation, tournament wins and extra information on certain teams in given years.
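A minimal sketch of loading and sanity-checking the file follows. The column names (`season`, `team`, `w`, `d`, `l`, `pts`, `notes`) are assumptions and may not match the real headers in "premier-league-tables.csv", and the two rows are invented placeholders rather than real results.

```python
import csv
import io

# Placeholder rows standing in for premier-league-tables.csv.
# Column names are assumed; adjust them to the real file's headers.
SAMPLE_CSV = """season,team,w,d,l,pts,notes
2001,Team A,20,10,8,70,
2001,Team B,10,8,20,38,Relegated
"""

def load_table(handle):
    """Read league-table rows, converting the numeric columns to int."""
    numeric = ("season", "w", "d", "l", "pts")
    rows = []
    for row in csv.DictReader(handle):
        for col in numeric:
            row[col] = int(row[col])
        rows.append(row)
    return rows

def inconsistent_rows(rows):
    """Return rows whose points differ from 3 per win plus 1 per draw."""
    return [r for r in rows if r["pts"] != 3 * r["w"] + r["d"]]

rows = load_table(io.StringIO(SAMPLE_CSV))
flagged = inconsistent_rows(rows)
```

With the real file, `open("premier-league-tables.csv", newline="")` would replace the `io.StringIO` wrapper. A flagged row is not necessarily an error: a points deduction also produces a mismatch, which is the kind of detail the notes column may record.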
Exploratory data analysis (EDA) involved exploring the dataset to answer the following questions and provide insights into the English Premier League:
- What will the 2024 final standings be for the English Premier League based on the previous 30 seasons?
- How competitive is the EPL, and how hard is it to win?
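One way to put numbers on the competitiveness question is to count how many distinct clubs have won the title and to measure the points gap between first and second place in each season. The sketch below assumes the data has been reduced to `(season, team, points)` tuples; the rows are invented placeholders, not real results.

```python
from collections import Counter

# Placeholder (season, team, points) rows; not real results.
ROWS = [
    (2001, "Team A", 80), (2001, "Team B", 78), (2001, "Team C", 60),
    (2002, "Team B", 90), (2002, "Team A", 75), (2002, "Team C", 70),
    (2003, "Team A", 85), (2003, "Team C", 84), (2003, "Team B", 50),
]

def season_summary(rows):
    """Map each season to (champion, points gap between first and second).

    Ranks on points only, so ties on points are not resolved by goal
    difference here. Each season needs at least two teams.
    """
    by_season = {}
    for season, team, pts in rows:
        by_season.setdefault(season, []).append((pts, team))
    summary = {}
    for season, entries in by_season.items():
        ranked = sorted(entries, reverse=True)
        (top_pts, champion), (second_pts, _) = ranked[0], ranked[1]
        summary[season] = (champion, top_pts - second_pts)
    return summary

summary = season_summary(ROWS)
# Number of titles per club across all seasons in the data.
titles = Counter(champion for champion, _ in summary.values())
```

A small number of distinct champions and large winning margins would both suggest a less competitive league; many champions and narrow margins would suggest the opposite.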
Include some interesting code and functions here
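One function that could belong here turns per-team predictions into a final table. This is a sketch only, not the project's actual code: the keys `pts`, `gd` and `gf` are hypothetical names for predicted points, goal difference and goals scored, and the values are placeholders. Teams are ordered by points, then goal difference, then goals scored.

```python
def rank_table(predictions):
    """Order teams by points, then goal difference, then goals scored.

    All three keys sort in descending order. Returns new dicts with a
    1-based 'position' field added; the input is left unmodified.
    """
    ordered = sorted(
        predictions,
        key=lambda t: (t["pts"], t["gd"], t["gf"]),
        reverse=True,
    )
    return [dict(team, position=i) for i, team in enumerate(ordered, start=1)]

# Placeholder predictions; key names and values are illustrative only.
predicted = [
    {"team": "Team A", "pts": 80, "gd": 40, "gf": 85},
    {"team": "Team B", "pts": 80, "gd": 42, "gf": 80},
    {"team": "Team C", "pts": 60, "gd": 0, "gf": 50},
]
table = rank_table(predicted)
```

Here Team B finishes above Team A despite equal points, because its goal difference is higher.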
The analysis results are summarized as follows: 1. 2. 3.
Based on the analysis, I recommend the following for improving the prediction model for future use:
Kaggle, for the dataset.