Customer churn prediction is a key business challenge, since retaining existing customers is vital for sustained growth. This study applies machine learning techniques to a dataset of 7,043 customer records, with preprocessing that includes quartile-based outlier detection and class balancing via SMOTE and random oversampling. Logistic Regression, Decision Tree, Support Vector Machine (SVM), Random Forest, and Artificial Neural Network (ANN) models were compared. Logistic Regression initially performed best; after methodological adjustments, however, the Random Forest model achieved the highest predictive accuracy, at 89%. These findings underline how strongly data preprocessing affects model effectiveness. Future research will focus on deep learning methods, advanced feature selection, and optimization strategies to further improve churn prediction accuracy.
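The two simpler preprocessing steps named above, quartile-based outlier detection and random oversampling, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the study's actual code: the column names, the toy records, and the conventional 1.5 × IQR fence multiplier are all assumptions. SMOTE is not shown; unlike random oversampling, it synthesizes new minority-class points by interpolating between neighbours rather than duplicating rows, and is usually taken from a library such as imbalanced-learn.

```python
import random
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences computed from the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, column, k=1.5):
    """Keep only rows whose value in `column` lies inside the IQR fences."""
    lower, upper = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


def random_oversample(rows, label, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap to the largest class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Toy records; the column names are hypothetical, not taken from the study's dataset.
records = [
    {"monthly_charges": 20.0, "churn": 0},
    {"monthly_charges": 25.0, "churn": 0},
    {"monthly_charges": 30.0, "churn": 0},
    {"monthly_charges": 35.0, "churn": 0},
    {"monthly_charges": 40.0, "churn": 1},
    {"monthly_charges": 45.0, "churn": 1},
    {"monthly_charges": 500.0, "churn": 0},  # extreme value, expected to be dropped
]

cleaned = drop_outliers(records, "monthly_charges")
balanced = random_oversample(cleaned, "churn")
```

In practice, oversampling should be applied only to the training split, so that duplicated or synthetic rows do not leak into the evaluation data.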
Techniques: Logistic Regression, Decision Tree, Support Vector Machine (SVM), Random Forest, SMOTE (Synthetic Minority Over-sampling Technique), ANN, Quartile Calculation
Tools: Google Colab, Excel
Department: Department of Mathematics
Project Poster