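As online e-commerce shows new trends of diversified consumption channels and of strong demand driving the consumption of goods online, attending to repurchase behavior on e-commerce platforms and understanding the regularity of users' purchase behavior is of great importance to e-commerce operations. A good grasp of same-category repurchase timing also gives platforms more room for fine-grained operations. Existing research on repurchase prediction focuses on whether a user will repurchase on the platform or which category of goods the user will repurchase, and pays comparatively little attention to the timing of repurchase; moreover, past research on repurchase time prediction has mainly sought better results through model structure and lacks a systematic characterization and application of the multifaceted regularity of purchase behavior. This study therefore defines repurchase time prediction as a classification problem of whether a user will repurchase goods of the same category within a target prediction window, and starts from the characterization of purchase behavior regularity to predict same-category repurchase time and improve model predictive capability. First, drawing on prior literature, the study designs an indicator system of purchase behavior regularity features: it summarizes and supplements purchase time regularity features comprising repeatability, periodicity, trend and clumpiness indicators; proposes pre-purchase behavior pattern features comprising indicators of the volume, trend patterns and frequent patterns of same-category pre-purchase behavior; and aggregates these regularity features along the user and category dimensions separately, then crosses them to obtain user-category features. To address the sparsity of purchase data for a large number of users, which leaves purchase behavior regularity features biased or missing, the study builds on default feature imputation by stratifying users according to historical activity and exploring three methods for supplementing user features: a K-Means clustering method based on users' category purchase frequency vectors, a k-nearest-neighbor method that considers the similarity of purchase interval time series, and a fuzzy clustering method that considers the fuzziness of users' consumption habits. Finally, the study compares the predictive capability of different baseline models and tests how much the new feature indicator system improves the best baseline model. Experimental results show that the proposed indicator system of purchase behavior regularity features effectively improves the predictive capability of repurchase time prediction models; the improvement is more pronounced in scenarios with a rich set of categories and large differences across users and categories; it is also more pronounced for inactive users and new users; and it is robust across different models, different train-test sets and different target prediction windows. On the one hand, the study summarizes and supplements methods for characterizing purchase behavior regularity indicators and explores three schemes for optimizing user features, supplementing the literature on quantifying purchase behavior regularity and on repurchase time prediction, which gives it theoretical significance. On the other hand, the summarized feature indicators enrich the understanding of users and categories from the perspective of purchase behavior regularity and stably improve models' ability to predict same-category repurchase time, thereby supporting operational optimization, which gives it practical value.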
With the development of e-commerce, online consumption channels are diversifying and demand-driven consumption is shifting from offline to online platforms, which makes it increasingly important to understand the regularity of users' online consumption and repurchase behavior. Although better information on repurchase time would give e-commerce operators more room for refined operation, studies on repurchase prediction have mainly focused on whether users will buy again on the platform or what they will buy next, and have paid much less attention to when they will repurchase. The studies that do address repurchase time prediction have generally pursued better results by using or ensembling different kinds of models, and few have improved the feature structure based on an understanding of purchase regularity. Therefore, in this study, we define the repurchase time prediction problem as a classification problem of whether a user will repurchase an item of the same category within a target prediction time window, and we attempt to identify regularity features of purchase behavior that improve models' predictive capability. We first summarize regularity features into two types, namely purchasing time regularity and pre-purchase behavior patterns, covering characteristics such as repeatability, periodicity, trend, clumpiness, pre-purchase behavioral-trend patterns and pre-purchase frequent patterns, each with corresponding quantitative indicators. The regularity features are aggregated separately on the user and category dimensions, then crossed to obtain user-category features. To alleviate the bias in user-level features caused by the sparsity of purchase behavior, we explore a K-Means clustering method based on purchase frequency, a KNN method based on the similarity of purchase interval sequences, and a fuzzy clustering method.
We then test the effectiveness of adding the proposed features to the best baseline model from previous studies and further verify its robustness across different models, different train-test datasets and different target prediction time windows. Our experimental results suggest that the proposed regularity feature framework improves the predictive capability of repurchase time prediction models, and that the improvement is larger in scenarios with a rich set of categories and large differences across users and categories, and for inactive users and new users. This study has theoretical significance in that it summarizes a feature framework for identifying purchase regularity and proposes three methods for complementing users' regularity features, supplementing research on repurchase time prediction. It also has practical value: the purchase regularity features enrich the profiles of users and categories and improve the prediction of repurchase time for items of the same category, thereby supporting more refined operation.
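
To make the problem definition concrete, the following is a minimal sketch of how the classification label and a few simple purchasing-time regularity indicators could be computed for one user-category pair. It is an illustration under stated assumptions, not the feature definitions used in this study: the function names `repurchase_label` and `interval_features`, the cutoff-date convention, and the use of the coefficient of variation of purchase intervals as a rough periodicity proxy are all hypothetical choices.

```python
from datetime import date, timedelta
from statistics import mean, pstdev


def repurchase_label(purchase_dates, cutoff, window_days):
    """Return 1 if any purchase falls in (cutoff, cutoff + window_days], else 0.

    Assumption: purchase_dates holds the dates on which one user bought
    items of one category; the window convention is illustrative.
    """
    end = cutoff + timedelta(days=window_days)
    return int(any(cutoff < d <= end for d in purchase_dates))


def interval_features(purchase_dates, cutoff):
    """Compute simple regularity indicators from purchases up to the cutoff.

    These are hypothetical stand-ins for repeatability and periodicity,
    not the indicators defined in the study.
    """
    history = sorted(d for d in purchase_dates if d <= cutoff)
    # Gaps in days between consecutive historical purchases.
    gaps = [(b - a).days for a, b in zip(history, history[1:])]
    mean_gap = mean(gaps) if gaps else None
    # Coefficient of variation: lower values suggest more regular intervals.
    gap_cv = pstdev(gaps) / mean_gap if gaps and mean_gap > 0 else None
    days_since_last = (cutoff - history[-1]).days if history else None
    return {
        "n_purchases": len(history),
        "mean_gap": mean_gap,
        "gap_cv": gap_cv,
        "days_since_last": days_since_last,
    }


if __name__ == "__main__":
    dates = [date(2023, 1, 1), date(2023, 1, 31), date(2023, 3, 2), date(2023, 4, 1)]
    cutoff = date(2023, 3, 15)
    print(interval_features(dates, cutoff))
    print(repurchase_label(dates, cutoff, window_days=30))
```

Sparse histories return `None` for the interval-based indicators, which is the situation the user-level feature supplementation methods described above are meant to address.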