登录 EN

添加临时用户

基于样本的全局分布鲁棒优化模型及其应用

Globalized distributionally robust optimization based on samples: models and application

作者:李玥瑶
  • 学号
    2018******
  • 学位
    博士
  • 电子邮箱
    liy******.cn
  • 答辩日期
    2024.05.13
  • 导师
    邢文训
  • 学科名
    数学
  • 页码
    93
  • 保密级别
    公开
  • 培养单位
    042 数学系
  • 中文关键词
    分布鲁棒优化;样本空间;核心集;半定规划
  • 英文关键词
    distributionally robust optimization; sample space; core set; semi-definite program

摘要

分布鲁棒优化是一种处理含有不确定参数的优化问题的方法,在投资组合、供应链管理、能源调度、机器学习等领域有广泛的应用。它将不确定参数视为分布未知的随机向量,通过建立概率分布的集合,即模糊集,来处理不确定性。假定决策者只能获取不确定数据的一组样本,而没有其他信息。在传统的基于矩和支撑集信息的分布鲁棒优化中,矩约束往往通过一些统计方法给出,因此具有一定的概率保证,而对于支撑集的构建并没有较好的方法,常常只是将之假定为一个紧凸集。为了更好地刻画数据分布的范围和特点,本文提出的全局分布鲁棒优化模型基于样本数据建立一个充分大、能够包含支撑集的样本空间,以及一个或若干个包含于样本空间的核心集。样本空间包含真实概率分布的支撑集,提供了鲁棒性的保证。一个核心集可以刻画单峰分布中概率密度较大的区域,多个核心集可以刻画不确定数据的多峰结构特点。模型通过控制样本空间中随机向量到核心集的期望距离来削弱核心集之外的区域对目标函数的影响,从而可以更好地反映数据的分布特点,同时可以控制模型的保守程度。数值实验分别展示了样本空间、核心集和惩罚系数对模型的最优值、保守程度和鲁棒性的影响。 在一定假设下,对于多种常见类型的集合和距离函数,全局分布鲁棒优化模型均可以等价地转化为多项式时间可计算的半定规划模型。鉴于随着问题维数的增长,求解半定规划模型所耗费的时间会大幅增加,本文对于大规模的全局分布鲁棒优化给出了基于主成分分析的近似求解模型,在保持了半定规划模型结构的同时,降低了问题维数,减少了计算时间。关于近似效果,本文给出了最优值绝对误差的理论上界,也通过数值实验证明了实际的最优值绝对误差远低于其理论上界。在合适的近似规模下,近似模型可以在大幅降低的计算时间内得到精度可接受的最优解。因此,对于问题规模较大的情形,全局分布鲁棒优化模型仍具有应用价值。 此外,本文将全局分布鲁棒优化模型应用到了投资组合问题和多产品报童问题上,分别对于实际历史数据和模拟数据进行了数值实验。数值结果验证了全局分布鲁棒优化模型在这两个问题上均优于传统的分布鲁棒优化模型和随机规划模型,证明了其对于具有厚尾分布和多峰分布特点的数据均有着良好的表现和应用价值。

Distributionally robust optimization is a methodology used for optimization problems affected by uncertain parameters, widely used in portfolio, supply chain management, energy scheduling, machine learning and other fileds. It regards the uncertain parameter as a random vector with unknown distribution, and handles the uncertainty via constructing a set of probability distributions, called the ambiguity set. Assume that there is only a set of samples available, and there is no additional information of the uncertain data at all. In the traditional distributionally robust optimization based on the information of the moments and support set, the moment constraints are offen derived by statistical methods and therefore provide probabilistic guarantee. However, there is no appropriate method for estimating the support set. It is supposed to be a compact convex set in lots of related works. In order to better describe the scope and characteristics of the data distribution, the globalized distributionally robust optimization proposed in this dissertation constructs a sufficiently large sample space that can contain the support set, and one or more core sets contained in the sample space, based on samples. The sample space contains the support set of the true probability distribution, thereby provides the robustness guarantee. One core set can capture the region with high probability density of a unimodal distribution, while multiple core sets can capture the multimodal characteristics of the uncertain data. It weakens the impact of the region outside the core sets on the objective function via controlling the expected distance of random parameters from the core sets. Hence the distribution characteristics of the data are better captured, and the degree of conservatism of the model is controlled simultaneously. The numerical experiments show the influence of the sample space, core set and penalty coefficient on the optimal value, degree of conservatism and robustness of the model.Under some assumptions, the globalized distributionally robust optimization models can be equivalently reformulated as semi-definite programs for many common types of sets and distance functions, which are computationally tractable in polynomial time. Considering that the calculation time of semi-definite programs increases greatly with the increase of the dimension of the problem, an approximation based on principal component analysis for large-scale globalized distributionally robust optimization model is provided, which keeps the structure of the semi-definite program and reduces the dimension of the problem, thereby reduces the calculation time. In addition, the theoretical upper bound of the absolute error of the optimal value is provided to measure the quality of the approximation. The numerical experiments show that the actual absolute error of the optimal value is considerably lower than the theoretical upper bound. Furthermore, the approximation can obtain the optimal solution with acceptable accuracy within the significantly reduced calculation time, with appropriate approximation scale. Therefore, the globalized distributionally robust optimization models still have application value even the dimension of the problem is large.In addition, we apply the proposed globalized distributionally robust optimization models to the portfolio and multi-item newsvendor problem. The numerical experiments based on the real historical data and simulated data are provided respectively. The numerical results show that the globalized distributionally robust optimization models are superior to the traditional distributionally robust optimization models and stochastic programming model, and verify their good performances and application value for the data with the characteristics of positive-kurtosis and multimodal distribution.