In the context of modern environmental and societal concerns, there is an increasing demand for methods that can identify management strategies for civil engineering systems, minimizing structural failure risks while optimally planning inspection and maintenance (I&M) processes. Most available methods simplify the I&M decision problem to the component level because of the computational complexity of global optimization under joint system-level state descriptions. In this paper, we propose an efficient algorithmic framework for inference and decision-making under uncertainty for engineering systems exposed to deteriorating environments, providing optimal management strategies directly at the system level. In our approach, the decision problem is formulated as a factored partially observable Markov decision process, whose dynamics are encoded in Bayesian network conditional structures. The methodology can handle environments in which deterioration is equally correlated among components, as well as the general case of unequal correlations, through Gaussian hierarchical structures and dynamic Bayesian networks. For policy optimization, we adopt a deep decentralized multi-agent actor-critic (DDMAC) reinforcement learning approach, in which the policies are approximated by actor neural networks guided by a critic network. Because deterioration dependence is included in the simulated environment and the cost model is formulated at the system level, DDMAC policies intrinsically account for the underlying system effects. This is demonstrated through numerical experiments on both a 9-out-of-10 system and a steel frame under fatigue deterioration. The results show that DDMAC policies offer substantial benefits over state-of-the-art heuristic approaches. The inherent consideration of system effects by DDMAC strategies is also interpreted on the basis of the learned policies.
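As a rough illustration of two ingredients named above, the following sketch shows a single belief update for one component of a partially observable Markov decision process, and the failure probability of a k-out-of-n system. It is not the implementation from the paper: the function names, the two-state damage model, and all numerical values are hypothetical, and the system-level calculation assumes independent components, whereas the framework described here explicitly models correlated deterioration.

```python
import math

import numpy as np


def belief_update(belief, transition, obs_likelihood):
    """One belief step: predict with the transition model, then apply Bayes' rule.

    belief          -- probabilities over damage states at time t
    transition      -- row-stochastic matrix, transition[i, j] = P(s'=j | s=i)
    obs_likelihood  -- P(observation | s') for the observation actually received
    """
    predicted = belief @ transition          # prior over the next state
    posterior = predicted * obs_likelihood   # unnormalized posterior
    return posterior / posterior.sum()


def k_out_of_n_failure_prob(p_fail, k):
    """P(fewer than k of n components survive), ASSUMING independent components."""
    n = len(p_fail)
    # dist[j] = probability that exactly j of the components seen so far survive
    dist = np.zeros(n + 1)
    dist[0] = 1.0
    for p in p_fail:
        new = dist * p                   # this component fails: count unchanged
        new[1:] += dist[:-1] * (1 - p)   # this component survives: count + 1
        dist = new
    return dist[:k].sum()


# Hypothetical two-state component (intact, damaged) and an inspection
# that reports "no detection".
belief = np.array([1.0, 0.0])
transition = np.array([[0.9, 0.1],
                       [0.0, 1.0]])
no_detection = np.array([0.8, 0.3])  # P(no detection | intact), P(no detection | damaged)
updated = belief_update(belief, transition, no_detection)

# Hypothetical 9-out-of-10 system with identical component failure probabilities.
system_pf = k_out_of_n_failure_prob([0.1] * 10, k=9)
```

In a factored formulation, an update of this kind would be applied per component, with dependence among components entering through shared parent variables; that coupling is deliberately omitted from this sketch.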