Journal of Chaohu University, 2023, Vol. 25, Issue (3): 79-85. doi: 10.12152/j.issn.1672-2868.2023.03.010

• Information Science •

Application of XGBoost Model Based on Bayesian Optimization in Telecom User Churn

WANG Ya-ge, JIANG Jia-bao, WANG Hong-hai: School of Computing and Artificial Intelligence, Chaohu University

  1. School of Computing and Artificial Intelligence, Chaohu University, Chaohu, Anhui 238024
  • Received: 2022-10-31 Online: 2023-05-25 Published: 2023-10-24
  • About the author: WANG Ya-ge (born 1993), female, from Biyang, Henan; teaching assistant at the School of Computing and Artificial Intelligence, Chaohu University; research interests: data mining and machine learning.
  • Funding:
    Anhui Province "Four New" Research and Reform Practice Project (No. 2021sx102); Anhui Provincial University Scientific Research Project (No. KJ2020A0681)

Abstract: To raise profits and cut operating costs, enterprises need to predict user churn so that users who are likely to leave can be targeted with precision marketing in advance and retained. An XGBoost model is first trained on the user churn data to rank the input features by importance, and the Top-K features are selected to form a new training set. On the one hand, an XGBoost model is built on this training set and its optimal parameters are searched for with Bayesian optimization; on the other hand, 8 models are selected for modeling and validation, and the models are evaluated by precision, accuracy, recall, and F1 score. Experimental results show that, for telecom user churn prediction, the XGBoost model based on Bayesian optimization gives better predictions and higher efficiency than the other models.

Key words: Bayesian optimization, XGBoost, user churn

CLC number: 

  • TP181
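
The Top-K feature selection and the four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `select_top_k` and `classification_metrics`, the feature names, the importance scores, and the labels are all hypothetical and chosen only for illustration. In the paper the importance ranking comes from a trained XGBoost model, and the Bayesian optimization step is not reproduced here.

```python
from typing import Dict, List, Sequence


def select_top_k(importances: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # stayers correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # stayers wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical importance scores (NOT values from the paper); in practice they
# would be read from a trained model's feature-importance output.
importances = {
    "tenure": 0.31,
    "monthly_charges": 0.24,
    "contract_type": 0.22,
    "gender": 0.03,
}
top_features = select_top_k(importances, k=3)

# Hypothetical ground-truth labels and predictions for eight users.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)

print(top_features)
print(metrics)
```

Returning 0.0 when a denominator is zero is a design choice of this sketch; other conventions (for example raising an error or returning NaN) are equally defensible when a model predicts no positives.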