Publication Date
5-2025
School
School of Business
Major
Business Administration
Keywords
Regression Analysis, Sports Analytics, Predictive Modelling, Data Analysis, Random Forest Regression, Multiple Linear Regression, RStudio, MLS, Soccer
Disciplines
Business Analytics | Databases and Information Systems | Data Science | Sports Management | Sports Studies
Recommended Citation
Madeti, Joshua Clement, "Comparative Analysis of Regression and Random Forest Models for Player Performance Prediction in the MLS" (2025). Senior Honors Theses. 1529.
https://digitalcommons.liberty.edu/honors/1529
Abstract
Advanced technology and analytics have transformed many industries, the sports industry among them. Sporting events constantly generate data, and post-game and post-season analysis of that data is crucial to team and player success. This paper focuses on the impact of analytics on soccer and soccer players. With over three billion active fans, soccer is the most popular sport in the world, yet it lags when it comes to analytics. The thesis presents a comparative study of multiple linear regression and random forest regression to explore whether these models can accurately predict player goals and assists and to evaluate which factors are significant predictors. The results indicate that, for this particular dataset, random forest regression is the more accurate technique for predicting player performance.