As conventional energy supplies tighten, the need to generate power from new energy sources is becoming increasingly pressing. Wind and photovoltaic (PV) generation are intermittent and fluctuate with meteorological conditions; when integrated into the grid at large scale, they can seriously affect grid operation and challenge the stability of the power system. Accurate forecasting of new energy power output not only improves the stability and balance of the power system and reduces the risk caused by uncertainty, but also helps grid dispatchers plan sensibly and raises the level of renewable energy accommodation. However, wind and PV power data are very large in volume, highly volatile, and rich in features. On such data, models built with traditional physical and statistical methods are slow to solve and predict poorly, and although applying machine learning and deep learning methods directly gives better results, prediction accuracy and goodness of fit still leave room for improvement.
This paper analyzes the characteristics of wind and PV power time series in depth and designs an improved LightGBM model and a combined MultiCNN-BiLSTM model. LightGBM is improved through four measures: feature screening, feature derivation, K-fold cross-validation, and grid search. First, Spearman correlation analysis is performed on the factors that influence wind and PV power generation, and the qualitative and quantitative results are combined to select the more valuable features as input variables. Then, new features are derived from the distribution characteristics of the time-series features to obtain a richer representation of the data. Finally, a grid search is used to tune the hyperparameters in search of further gains. Every improvement experiment uses K-fold cross-validation to guard against overfitting. The combined MultiCNN-BiLSTM model is designed around the characteristics of wind and PV power time series: it pairs a CNN, which extracts local features of the data, with a BiLSTM, which takes into account long-range sequence features in both the forward and backward directions, and thereby strengthens the capture of periodic and non-periodic features. Time series of different periods are first sliced and then concatenated again; a sliding window combined with the CNN then extracts effective features from the series, and the BiLSTM extracts further features.
The experimental results show that the prediction accuracy and goodness of fit of the LightGBM model improve after each improvement measure is applied, and that the combined MultiCNN-BiLSTM model outperforms the CNN- and LSTM-based methods it is compared with. Compared with power forecasts from the traditional statistical method ARIMA, the machine learning method RF (random forest), and the deep learning method BP neural network, the improved models proposed in this paper achieve clearly better MAE, RMSE, and R² values. They also perform well on both the wind and the PV datasets, which indicates a degree of generalization ability.
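
For readers less familiar with the three evaluation indexes named above, the following is a minimal sketch of their standard textbook definitions in plain Python. It is not the paper's evaluation code: the function names and the sample power values are hypothetical and chosen only for illustration, and no figures here come from the paper's datasets.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the forecast errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and forecast power values (illustrative only).
measured = [3.0, 5.0, 2.0, 7.0]
forecast = [2.5, 5.0, 3.0, 8.0]

print("MAE :", mae(measured, forecast))
print("RMSE:", rmse(measured, forecast))
print("R^2 :", r2(measured, forecast))
```

Lower MAE and RMSE indicate smaller forecast errors, while an R² closer to 1 indicates a better fit, which is the sense in which the abstract describes the proposed models as better on all three indexes.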