Data-driven Demand Forecasting on E-commerce Platforms (数据驱动的电商平台商品需求预测研究)

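Author: 雷达洲
  • Student ID
    2019******
  • Degree
    Doctoral
  • Email
    lei******com
  • Defense date
    2023.12.20
  • Advisor
    祁炜
  • Discipline
    Management Science and Engineering
  • Pages
    109
  • Confidentiality level
    Public
  • Department
    016 Department of Industrial Engineering
  • Chinese keywords
    需求预测,数据驱动,函数型数据分析,贝叶斯回归,迁移学习
  • English keywords
    Demand forecast, Data-driven, Functional data analysis, Bayesian regression, Transfer learning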

Abstract

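With the rapid growth of online e-commerce platforms, online retail accounts for an ever larger share of residents' daily consumption. In the day-to-day operation of an e-commerce platform, accurate product sales forecasting is essential: it affects product selection, inventory planning, logistics planning, and a series of other steps. This thesis focuses on data-driven product demand forecasting for e-commerce platforms, with emphasis on new product forecasting, forecasting for large-scale promotions, and forecasting methods based on transfer learning.

Among product demand forecasting tasks, forecasting demand for newly released products is undoubtedly the hardest. A new product has no historical sales data and is inherently highly uncertain, yet every product must pass through a new product stage. This makes new product forecasting an unavoidable and highly challenging problem in e-commerce demand forecasting. To address it, the first study in this thesis uses functional data analysis tools to model new product life-cycle curves, proposes a Bayesian functional regression forecasting framework, and embeds clustering methods and classifiers into the framework, achieving accurate pre-launch forecasts and post-launch forecast updates for new products. Numerical experiments on a real dataset demonstrate the framework's strong predictive performance, and the findings provide a useful reference for further research.

Large-scale promotions, a signature sales event of e-commerce platforms, bring platforms a great deal of attention and a boost in revenue, but they also markedly increase the difficulty of demand forecasting. The second study builds a product demand model for the promotion window. It uses wavelet-based functional data analysis to decompose complex sales curves and maps the resulting components to corresponding real-world features. The model separates a product's static attributes from high-frequency covariate information, better explains the shape of sales curves during large-scale promotions, and lays a solid foundation for accurate demand forecasting. Based on this model, the study designs a corresponding Bayesian forecasting framework and conducts numerical experiments on a real dataset; the results show the effectiveness of the model and the forecasting framework.

Multi-level data is becoming increasingly accessible, and how to make better use of data at different levels to aid forecasting in a big-data setting has become an active question. From a transfer learning perspective, the third study designs a pooling and boosting framework that uses category-level sales data to assist product-level forecasting. The framework combines the joint estimator from linear models with the regularization process of gradient boosted trees: the result of training on a proxy dataset derived from category-level data serves as the regularization term when training the product-level model, which reduces out-of-sample prediction error. The study proves error bounds for this forecasting framework theoretically and demonstrates its superior predictive performance on several real datasets.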

The rapid development of e-commerce platforms has made online retail an increasingly large part of residents' daily consumption. Accurate product demand forecasting is crucial to the day-to-day operation of e-commerce platforms, as it affects product selection, inventory planning, logistics planning, and other processes. This thesis focuses on data-driven demand forecasting for e-commerce platform products, emphasizing new product forecasting, forecasting for big event promotions, and prediction methods based on transfer learning.

Predicting product demand is hardest for newly released products, which have no historical sales data and are highly uncertain. Because every product passes through a new product stage, forecasting demand for new products is an unavoidable yet challenging problem in e-commerce demand forecasting. The first study in this thesis addresses this problem by using functional data analysis tools to model the new product life-cycle curve. The proposed Bayesian functional regression framework integrates clustering methods and classifiers to achieve accurate pre-launch forecasts and post-launch forecast updates for new products. Numerical experiments on real datasets demonstrate the framework's strong predictive performance and provide a useful reference for further research.

Big event promotions are a signature feature of e-commerce platforms. They bring these platforms more attention and turnover, but they also make demand forecasting considerably harder. The second study in this thesis establishes a product demand model for the big event promotion window. It uses functional data analysis based on the wavelet transform to decompose complex sales curves and matches the components obtained from the decomposition with corresponding real-world features. This model separates the static attributes of products from high-frequency covariate information. It therefore better explains the shape of the product sales curve during big event promotions and lays a good foundation for product demand forecasting. Based on this model, a corresponding Bayesian prediction framework is designed, and numerical experiments on real datasets demonstrate the effectiveness of the model and the prediction framework.

In the context of big data, multi-level data is increasingly available, which raises the question of how to better use data at different levels to improve predictions. The third study in this thesis takes a transfer learning perspective and proposes a pooling and boosting framework that leverages category-level sales data to assist product-level predictions. This framework combines a joint estimator from the linear model with the regularization process of gradient boosting trees. It uses the training results on a proxy dataset, obtained by processing category-level data, as the regularization term for product-level model training, which reduces the out-of-sample prediction error. The study proves an error bound for this prediction framework theoretically and demonstrates its superior forecast performance on several real datasets.
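
The wavelet decomposition mentioned for the second study can be illustrated with a small sketch. It is not the model from the thesis: the abstract does not name a wavelet family, so the Haar wavelet is assumed here for simplicity, and the hourly sales figures and function names are hypothetical. The sketch only shows how a sales curve splits into a coarse trend plus detail coefficients that capture fast fluctuations, and that the split loses no information.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length list."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return (coarse approximation, list of detail coefficients, finest level first)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose, rebuilding from the coarsest level outward."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.extend([(a + d) / SQRT2, (a - d) / SQRT2])
        current = rebuilt
    return current

# Hypothetical hourly sales around a promotion spike (illustrative numbers only).
sales = [12, 14, 13, 15, 40, 95, 60, 30]
trend, details = haar_decompose(sales, levels=2)
rebuilt = haar_reconstruct(trend, details)
```

With two levels, `trend` holds two coefficients summarizing the first and second half of the curve, while `details` holds the within-half fluctuations. In a setting like the one the abstract describes, slow components might plausibly be associated with static product attributes and fast components with high-frequency covariates, but that mapping is an assumption of this illustration.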