Research on Water Quality Prediction of River Network Based on Data-Driven Models

Original title: 基于数据驱动模型的河网水质预测研究

Author: Wen Simin (文思敏)
  • Student ID
    2018******
  • Degree
    Master's
  • Email
    272******com
  • Defense date
    2021.05.26
  • Advisor
    Zeng Siyu (曾思育)
  • Discipline
    Environmental Science and Engineering
  • Pages
    101
  • Confidentiality level
    Public
  • Department
    005 School of Environment
  • Keywords
    urban river network, water quality prediction, data-driven models, scheme comparison and selection

Abstract

Water quality simulation and prediction are key components of water environment planning and management and of the integrated prevention and control of water pollution. Previous studies have usually relied on mechanistic models for this purpose, but such models are complex and highly uncertain in practical application. To predict urban river network water quality accurately, simply, and efficiently, this study develops a prediction method based on data-driven models, taking the Shishan and Hengtang sub-districts of the High-tech District, Suzhou, as the case area. Through the design of model schemes and the analysis of their results, the optimal models and application schemes are compared and selected, and methodological guidance is proposed for applying data-driven models to river network water quality prediction.

Ammonia nitrogen at monitoring point 16 is the predicted (output) variable; ammonia nitrogen, CODMn, conductivity, dissolved oxygen, water temperature, water level, precipitation, and other variables at different points serve as input variables. By comparing and consolidating existing methods, the study proposes a basic procedure for building data-driven models for river network water quality prediction. Data acquisition and preparation and data inspection and cleaning are carried out first, and then three models are built: an MLP neural network, an LSTM neural network, and ARIMA. For the neural network models, the modeling process involves constructing input-output subsequences, splitting the data set, normalizing the data, and configuring the models with the Keras deep learning framework.
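
The preprocessing steps named above for the neural network models (input-output subsequences, data set split, normalization) can be sketched as follows. This is a minimal illustration in plain Python, not the thesis's own code: the function names, the single-variable toy series, the 75% training fraction, and min-max scaling fitted on the training portion only are all assumptions made for the example.

```python
def make_windows(series, n_lag=1, horizon=1):
    """Build input-output subsequences from one time series.

    Each input row holds n_lag consecutive values; the matching output
    is the value `horizon` steps after the end of that window.
    """
    inputs, outputs = [], []
    for start in range(len(series) - n_lag - horizon + 1):
        end = start + n_lag
        inputs.append(series[start:end])
        outputs.append(series[end + horizon - 1])
    return inputs, outputs


def chronological_split(inputs, outputs, train_frac=0.8):
    """Split without shuffling so the test samples follow the training samples in time."""
    cut = int(len(inputs) * train_frac)
    return (inputs[:cut], outputs[:cut]), (inputs[cut:], outputs[cut:])


def fit_min_max(values):
    """Return the (min, max) pair used for scaling; assumed to be fitted on training data only."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return lo, hi


def scale(values, lo, hi):
    """Map values to [0, 1] relative to the fitted bounds."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical ammonia nitrogen readings, for illustration only.
series = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]

X, y = make_windows(series, n_lag=2, horizon=1)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y, train_frac=0.75)

# Fit the scaling bounds on the training portion, then reuse them for the test portion.
lo, hi = fit_min_max([v for row in train_X for v in row] + train_y)
train_X_scaled = [scale(row, lo, hi) for row in train_X]
train_y_scaled = scale(train_y, lo, hi)
test_X_scaled = [scale(row, lo, hi) for row in test_X]
```

With an input time lag of 1 step, as in the optimal scheme reported here, `n_lag=1` would make each input row a single value. Scaled arrays of this kind would then be passed to a Keras model; that step is omitted because the network architecture is not given in the abstract.
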
For the ARIMA model, the modeling process involves splitting the data set, testing for stationarity, testing for autocorrelation, identifying the form and order of the model, and building the model with the statsmodels and pmdarima libraries.

The neural network model schemes are designed along two dimensions: hyperparameters and combinations of input and output variables. The results show that the optimal scheme uses, among other settings, a batch size of 8, ReLU activation in every layer, and an input time lag of 1 step, and that both neural network models are capable of long-term prediction. The neural network models can be used to: ① predict ammonia nitrogen values when ammonia nitrogen is not monitored over the long term, for example to save equipment costs or because monitoring is difficult to manage; ② check ammonia nitrogen monitoring data; ③ predict future ammonia nitrogen values. The MLP neural network can additionally predict ammonia nitrogen values at different points; the LSTM neural network is not suitable for that purpose, but it checks monitoring data and predicts future ammonia nitrogen values more accurately. Of the four ARIMA models built manually and automatically, the automatically built rolling prediction model is the best; ARIMA performs well only in rolling prediction and is not suitable for continuous long-term prediction.

For the optimal schemes of the three models, the denormalized ammonia nitrogen RMSE ranges from 0.063 to 0.132 and R² from 0.910 to 0.995. The application results show that data-driven models can predict river network water quality with a simple structure, efficient computation, and accurate results.
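
The evaluation metrics reported above can be illustrated with a short sketch. It assumes that predictions are first mapped back from a min-max scale to original units and that R² means the coefficient of determination; the abstract confirms neither detail, and the function names and sample values are hypothetical.

```python
import math


def denormalize(scaled, lo, hi):
    """Invert min-max scaling (assumed scaling method) back to original units."""
    return [v * (hi - lo) + lo for v in scaled]


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant observed series")
    return 1.0 - ss_res / ss_tot


# Hypothetical scaled values and scaling bounds, for illustration only.
lo, hi = 0.0, 2.0
observed = denormalize([0.5, 0.25, 0.75, 1.0], lo, hi)
predicted = denormalize([0.5, 0.5, 0.75, 0.75], lo, hi)

print("RMSE:", rmse(observed, predicted))
print("R^2:", r_squared(observed, predicted))
```

Computing the error after denormalization keeps RMSE in the same units as the monitored ammonia nitrogen values, which makes results comparable across models that use different scaling.
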