[This article belongs to Volume - 57, Issue - 11]
Gongcheng Kexue Yu Jishu/Advanced Engineering Science
Journal ID : AES-04-12-2025-891

Title : Hybrid Churn Prediction and Retention Model for E-Commerce Industry Using Hidden Markov Model
Arti Ranjan, Arvind Kumar, Santosh Kumar

Abstract : Customer churn is a major problem in the e-commerce industry, where retaining existing customers is frequently more cost-effective than acquiring new ones. To improve the accuracy and interpretability of churn prediction, this paper presents a hybrid framework that combines probabilistic modeling, deep learning, and stacked ensemble learning. The SMOTE and ADASYN oversampling techniques are used to correct class imbalance in a real-world e-commerce dataset that includes demographic, transactional, and behavioral data. Traditional models such as Random Forest and Logistic Regression achieved baseline accuracies of only about 79%, whereas deep learning models (CNN, RNN, and FCNN) showed better recall and validation accuracy, with the CNN reaching up to 83% after Keras-Tuner optimization. The proposed stacked model, which uses Gradient Boosting, CatBoost, XGBoost, and SVM as base learners with Logistic Regression as the meta-classifier, reached a test accuracy of 96% after ADASYN, exceeding the individual models by 2.2% to 5.3%. Hidden Markov Models (HMMs) with 2, 4, and 6 latent states were then used to examine churn probabilities over time, capturing the temporal dynamics of consumer behavior. The HMM framework's ability to track dynamic states and identify churn-prone pathways supports individualized risk assessment and targeted retention efforts. The proposed framework provides a scalable and interpretable solution for real-world churn management, with excellent predictive performance and useful insights.
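
To illustrate how an HMM can track a customer's latent state over time, the following is a minimal sketch of forward filtering for a two-state model. It is not the authors' implementation: the state names ("engaged", "at-risk"), the binary observation coding (0 = inactive period, 1 = purchase), and every probability value are hypothetical assumptions chosen only for illustration.

```python
def forward_filter(start, trans, emit, observations):
    """Return P(state_t | obs_1..t) for each time step t (normalised forward pass)."""
    n_states = len(start)
    # Initialise: prior weighted by the likelihood of the first observation.
    belief = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    total = sum(belief)
    belief = [b / total for b in belief]
    history = [belief]
    for obs in observations[1:]:
        # Predict: push the current belief through the transition matrix.
        predicted = [
            sum(belief[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        # Update: weight by the emission likelihood, then renormalise.
        belief = [predicted[j] * emit[j][obs] for j in range(n_states)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Hypothetical parameters (assumptions, not values from the paper).
# State 0 = "engaged", state 1 = "at-risk".
START = [0.8, 0.2]
TRANS = [[0.9, 0.1],   # engaged -> engaged / at-risk
         [0.3, 0.7]]   # at-risk -> engaged / at-risk
EMIT = [[0.3, 0.7],    # engaged: P(inactive), P(purchase)
        [0.8, 0.2]]    # at-risk: P(inactive), P(purchase)

if __name__ == "__main__":
    sequence = [1, 1, 0, 0, 0]  # two purchases followed by three inactive periods
    for t, b in enumerate(forward_filter(START, TRANS, EMIT, sequence), start=1):
        print(f"t={t}  P(at-risk)={b[1]:.3f}")
```

With these assumed parameters, the filtered at-risk probability rises as inactive periods accumulate, which is the kind of per-customer signal that could feed a retention trigger. A model with 4 or 6 latent states follows the same recursion with larger matrices.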