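Deep reinforcement learning algorithms offer a new technical approach to real-time control of urban drainage systems. However, while deep reinforcement learning-based control strategies are being constructed and applied, uncertainties from several different sources may affect their performance. At present there is no systematic framework for evaluating how strongly such control strategies are affected by uncertainty; how to simulate the various uncertainties and how to quantify their impact on control performance are the two core problems to be solved. To address them, this study designs an uncertainty impact evaluation method for deep reinforcement learning-based real-time control of urban drainage systems. The method simulates uncertainty at three levels (the control input signals, the deep reinforcement learning algorithm, and the environment interaction model), measures the control performance of the strategies on a simulation model through large-scale mathematical experiments, and quantifies the impact of each type of uncertainty by statistical analysis. The drainage system of the Chengxi District of Suzhou is taken as the case study; control strategies are built with the Dueling Double Deep Q-Network algorithm, and the proposed method is used to evaluate how uncertainty affects them. The evaluation of control input signal uncertainty shows that whether the rainfall trend and rainfall pattern can be predicted accurately is an important factor in the performance of deep reinforcement learning strategies and deserves particular attention in engineering practice. If the predicted rainfall pattern is completely inaccurate, the control strategy may lose up to 60.41% of its control performance, making environmental pollution and economic losses more likely; if the predicted pattern matches the actual rainfall pattern and the rainfall forecast signal contains random uncertainty uniformly distributed within ±25%, the maximum fluctuation of the performance indicator falls to 38.01%; if only systematic uncertainty within ±25% is present, the maximum fluctuation falls to 7.14%. For the monitored water level signals of pumping station collection wells, the impact of random uncertainty is smaller, and the impact of systematic uncertainty larger, than that of the same type of uncertainty in the rainfall forecast signal. The intrinsic randomness of the deep reinforcement learning algorithm causes significant performance fluctuation among control strategies trained in parallel: across multiple test rainfall events, the ratio of the range of the performance indicator to its mean averages 35.67%. Combining the deep reinforcement learning algorithm with ensemble learning improves its stability and lowers this ratio to 8.22%. In engineering practice, the effect of algorithmic randomness can also be overcome by investing more computing resources, training multiple strategies, and selecting the best one. Uncertainty in the subcatchment area parameter and the pipe Manning roughness coefficient of the SWMM model has a negative effect on the performance of the trained control strategies; the average reduction in control performance is within 10%, which is smaller than the impact of the first two types of uncertainty. This indicates that, provided the SWMM model has been calibrated and validated, the uncertainty remaining in its parameters has a comparatively minor effect on deep reinforcement learning-based control strategies and need not be given priority attention.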
Deep reinforcement learning provides a new solution for real-time control of urban drainage systems. However, deep reinforcement learning-based control strategies may be affected by uncertainties from various sources throughout their construction and application. Currently, no systematic framework is available for evaluating the impacts of different uncertainties on such control strategies. How to simulate the various uncertainties and how to quantify their impacts on control performance are two core issues that need to be solved. To address these issues, this study designed an uncertainty impact evaluation framework for deep reinforcement learning-based real-time control of urban drainage systems. The framework provides methods to simulate three kinds of uncertainty: that of the control input signals, that of the deep reinforcement learning algorithm, and that of the environment interaction model used to train the deep reinforcement learning model. The performance of the control strategies is measured through large-scale mathematical experiments conducted on a simulation model of the studied drainage system, and the impacts of the uncertainties are quantified by statistical analysis. The drainage system of the Chengxi District of Suzhou was chosen as the research case. Control strategies based on the Dueling Double Deep Q-Network algorithm were constructed, and the proposed framework was used to evaluate the impact of uncertainty on them. The results for control input signal uncertainty show that whether the rainfall trend and rainfall pattern can be accurately predicted is an important factor affecting the performance of deep reinforcement learning-based control strategies.
Taking the overflow risk of the drainage system as the control performance indicator, if the predicted rainfall pattern is completely inaccurate, then even when the predicted total rainfall depth is exact, the maximum relative change of the performance indicator compared to the baseline scenario without any uncertainty can reach 60.41%. If the predicted rainfall pattern is consistent with that of the actual rainfall event, but the rainfall intensity forecast signals contain random uncertainty uniformly distributed within ±25%, the maximum relative change of the performance indicator is reduced to 38.01%. If the random uncertainty is replaced by systematic uncertainty within ±25%, the maximum relative change of the performance indicator is reduced to 7.14%. For the monitored water depth signals of the collection wells of pumping stations, the impact of random uncertainty is smaller, and the impact of systematic uncertainty is larger, than that of the same type of uncertainty in the rainfall intensity forecast signals. The intrinsic randomness of the deep reinforcement learning algorithm causes significant performance fluctuation among control strategies trained in parallel. The ratio of the range of the control strategies’ performance indicators to their mean is 35.67% on average across multiple test rainfall events. Combining the deep reinforcement learning algorithm with ensemble learning methods improves its stability and reduces this ratio to 8.22%. In practice, the impact of this randomness can also be overcome by investing more computing resources, training multiple control strategies, and then selecting the best one for application. During the training of the deep reinforcement learning model, the SWMM model of the urban drainage system is used as the interactive environment.
Uncertainties in the subcatchment area parameter and in the Manning’s roughness coefficient of the pipes may cause the learned control strategy to be suitable only for the biased model, so that it performs worse than a control strategy trained with an SWMM model whose parameters are accurate. The average performance decrease is a few percent and stays below 10%. The impact of model parameter uncertainty on deep reinforcement learning-based control strategies is less significant than that of the uncertainty in control input signals and that of the intrinsic randomness of the deep reinforcement learning algorithm. This indicates that, provided the SWMM model has been calibrated and validated, the uncertainty remaining in its parameters has a comparatively minor impact on deep reinforcement learning strategies and may not require priority attention.
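The signal perturbations and summary statistics described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study. It assumes that both kinds of uncertainty act as relative (multiplicative) errors on the signal, that random uncertainty draws an independent error for every sample, and that systematic uncertainty applies one shared error to the whole signal. All function names and the example numbers are hypothetical.

```python
import random


def perturb_random(signal, bound=0.25, rng=None):
    """Random uncertainty (assumed form): every sample gets its own
    relative error drawn from a uniform distribution on [-bound, +bound]."""
    rng = rng or random.Random()
    return [x * (1.0 + rng.uniform(-bound, bound)) for x in signal]


def perturb_systematic(signal, bias):
    """Systematic uncertainty (assumed form): one relative error `bias`
    is shared by all samples of the signal."""
    return [x * (1.0 + bias) for x in signal]


def max_relative_change(perturbed_scores, baseline_score):
    """Largest relative deviation of the performance indicator from the
    baseline scenario without uncertainty."""
    return max(abs(s - baseline_score) / baseline_score for s in perturbed_scores)


def range_to_mean_ratio(scores):
    """Range of the performance indicator over parallel-trained strategies,
    divided by its mean."""
    return (max(scores) - min(scores)) / (sum(scores) / len(scores))


if __name__ == "__main__":
    # Hypothetical rainfall intensity forecast (arbitrary units).
    forecast = [0.0, 2.0, 6.0, 3.0, 1.0]
    print(perturb_random(forecast, bound=0.25, rng=random.Random(42)))
    print(perturb_systematic(forecast, bias=-0.25))
    # Hypothetical performance indicators of five strategies on one event.
    print(range_to_mean_ratio([9.0, 10.0, 11.0, 10.5, 9.5]))
```

In an evaluation of this kind, each perturbed signal would be fed to the trained strategy on the simulation model, and the resulting indicators would be passed to `max_relative_change`; that simulation step is omitted here because it depends on the specific drainage model.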