Optimal control towards sustainable wastewater treatment plants based on multi-agent reinforcement learning

Wastewater treatment plants (WWTPs) are designed to eliminate pollutants and alleviate environmental pollution. However, the construction and operation of WWTPs consume resources, emit greenhouse gases (GHGs) and produce residual sludge, and thus require further optimization. WWTPs are difficult to control and optimize because of their high nonlinearity and variability. This study used a novel technique, multi-agent deep reinforcement learning, to simultaneously optimize dissolved oxygen and chemical dosage in a WWTP. The reward function was specially designed from a life cycle perspective to achieve sustainable optimization. Five scenarios were considered: a baseline, three scenarios with different effluent quality, and a cost-oriented scenario. The results show that optimization based on life cycle assessment (LCA) has lower environmental impacts than the baseline scenario, with cost, energy consumption and greenhouse gas emissions reduced to 0.890 CNY/m3-ww, 0.530 kWh/m3-ww and 2.491 kg CO2-eq/m3-ww, respectively. The cost-oriented control strategy exhibits overall performance comparable to the LCA-driven strategy: it sacrifices environmental benefits but achieves a lower cost of 0.873 CNY/m3-ww. It is worth mentioning that the retrofitting of WWTPs based on resources should be implemented with consideration of impact transfer. Specifically, the LCA SW scenario decreases eutrophication potential by 10 kg PO4-eq compared to the baseline within 10 days, while significantly increasing other indicators. The major contributors to each indicator are identified for future study and improvement. Finally, the study notes that novel dynamic control strategies require advanced sensors or large amounts of data, so the selection of control strategies should also take economic and ecological conditions into account.
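To make the idea of a life-cycle-oriented reward more concrete, the following is a minimal Python sketch of how a single reward shared by two cooperating agents (one setting dissolved oxygen, one setting chemical dosage) might combine cost, energy and GHG emissions. It is not the authors' reward function: the abstract does not give its form, so the weighted-sum structure, the normalisation by a reference scenario, the weights, the effluent penalty and all names (`StepImpacts`, `shared_reward`, `WEIGHTS`) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepImpacts:
    """Impacts per cubic metre of wastewater for one control step (hypothetical container)."""
    cost_cny: float       # CNY/m3-ww
    energy_kwh: float     # kWh/m3-ww
    ghg_kg_co2eq: float   # kg CO2-eq/m3-ww


# Hypothetical weights; the source does not report how indicators are weighted.
WEIGHTS = {"cost": 1.0, "energy": 1.0, "ghg": 1.0}


def shared_reward(impacts: StepImpacts,
                  reference: StepImpacts,
                  effluent_violation: float = 0.0,
                  penalty: float = 10.0) -> float:
    """Return one scalar reward given to both agents (assumed cooperative setting).

    Each indicator is divided by its value in a reference scenario so that the
    three terms are dimensionless, then combined as a negative weighted sum.
    `effluent_violation` is an assumed non-negative measure of how far the
    effluent exceeds its limit; it is penalised linearly.
    """
    score = (
        WEIGHTS["cost"] * impacts.cost_cny / reference.cost_cny
        + WEIGHTS["energy"] * impacts.energy_kwh / reference.energy_kwh
        + WEIGHTS["ghg"] * impacts.ghg_kg_co2eq / reference.ghg_kg_co2eq
    )
    return -score - penalty * max(effluent_violation, 0.0)


if __name__ == "__main__":
    # Values reported in the abstract for the LCA-based optimization,
    # used here only as an illustrative reference point.
    reference = StepImpacts(cost_cny=0.890, energy_kwh=0.530, ghg_kg_co2eq=2.491)
    cheaper = StepImpacts(cost_cny=0.873, energy_kwh=0.530, ghg_kg_co2eq=2.491)
    print(shared_reward(reference, reference))
    print(shared_reward(cheaper, reference))
```

Under these assumptions a lower cost at equal energy use and emissions gives a higher reward, and any effluent violation lowers it. Changing the weights shifts the trade-off between a cost-oriented and an LCA-oriented strategy, which is the kind of scenario comparison the abstract describes.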
