Latent Dirichlet Allocation for Predicting User Churn Reasons Based on APP Negative Comment Text

Journal: Modern Economics & Management Forum
DOI: 10.32629/memf.v6i6.4649

Xuan Ding

Renmin University of China, Beijing 100872, China

Abstract

This research proposes a framework that integrates Latent Dirichlet Allocation (LDA) with duration modeling, using the negative review text of mobile applications to predict not only when users churn but also why they leave. It addresses a gap in the existing literature, which often focuses solely on churn probability or timing, by incorporating thematic analysis of user-generated content from app stores. Methodologically, a duration model estimates the time until churn and is extended into a competing risks model that accounts for multiple churn reasons, categorized as controllable, uncontrollable, and unknown risks. LDA extracts latent topics from negative reviews, transforming unstructured text into interpretable variables for predictive modeling. These variables enter the model alongside duration data, with the aim of improving prediction accuracy. Evaluation will use perplexity for the LDA topic model and log-likelihood, AIC, and BIC for comparing model specifications. The research is expected to contribute a novel, text-enhanced approach to churn prediction and to offer internet companies practical insights for understanding churn drivers, improving user experience, and designing targeted retention strategies. A potential limitation is LDA's weaker performance on short text.
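The topic-extraction step can be illustrated with a minimal sketch. It is not the author's implementation: it swaps in a small collapsed Gibbs sampler written in plain Python, and the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the choice of two topics, and the four toy reviews are all assumptions made for illustration. The per-review topic proportions it returns are the kind of interpretable variables that could enter a duration or competing risks model as covariates.

```python
import random


def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not optimized).

    docs: list of token lists. Returns (theta, topic_word, vocab), where
    theta[d][k] is the smoothed share of topic k in document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), n_topics

    doc_topic = [[0] * K for _ in docs]       # tokens of doc d assigned to topic k
    topic_word = [[0] * V for _ in range(K)]  # times word v is assigned to topic k
    topic_total = [0] * K                     # total tokens assigned to topic k
    assign = []                               # current topic of every token

    # Random initial topic assignment for each token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(K)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(z_d)

    # Resample each token's topic conditional on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
        for d, doc in enumerate(docs)
    ]
    return theta, topic_word, vocab


# Hypothetical negative reviews, invented for this example.
reviews = [
    "app crashes after update crashes again",
    "login fails and app crashes",
    "too many ads and subscription price too high",
    "price increase and ads everywhere",
]
docs = [r.split() for r in reviews]
theta, topic_word, vocab = lda_gibbs(docs, n_topics=2)

for review, row in zip(reviews, theta):
    print([round(p, 2) for p in row], review)
```

On real app-store data, the reviews would first need tokenization and stop-word removal, and the number of topics would be chosen by comparing held-out perplexity across candidate values, in line with the evaluation plan described in the abstract.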

Keywords

Latent Dirichlet Allocation (LDA), user churn, negative comment

Copyright © 2025 Xuan Ding

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License