We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models. Specifically, we leverage bound propagation techniques and backward recursion to compute lower bounds on the probability that trajectories of the BNN model reach a given set of states while avoiding a set of unsafe states. We use these lower bounds in the context of control and reinforcement learning to provide safety certification for given control policies, as well as to synthesize control policies that improve the certification bounds. On a set of benchmarks, we demonstrate that our framework can certify policies over BNN predictions for problems of more than $10$ dimensions, and can synthesize policies that significantly increase the lower bound on the satisfaction probability.
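The backward recursion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the state space has already been partitioned into finitely many cells, and that a matrix `p_low` of lower bounds on the cell-to-cell transition probabilities under a fixed policy is available (in the paper's setting such bounds would come from bound propagation through the BNN, which is not shown here). The function name, the argument names, and the numbers in the example are all hypothetical.

```python
def reach_avoid_lower_bound(p_low, goal, unsafe, horizon):
    """Lower-bound the reach-avoid probability for each cell of a partition.

    p_low[i][j] is an assumed lower bound on the probability of moving from
    any state in cell i to cell j in one step (rows may sum to less than 1).
    goal and unsafe are sets of cell indices; horizon is the number of steps.
    """
    n = len(p_low)
    # At the final time step, only cells already in the goal count as success.
    value = [1.0 if i in goal else 0.0 for i in range(n)]
    # Step backward in time, one transition per iteration.
    for _ in range(horizon):
        previous = []
        for i in range(n):
            if i in goal:
                previous.append(1.0)   # goal reached: success
            elif i in unsafe:
                previous.append(0.0)   # unsafe state entered: failure
            else:
                # Values are non-negative, so weighting them with lower
                # bounds on the transition probabilities gives a lower bound.
                previous.append(sum(p_low[i][j] * value[j] for j in range(n)))
        value = previous
    return value


# Hypothetical 4-cell example: cell 0 is unsafe, cell 3 is the goal.
p_low = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.3, 0.5, 0.0],
    [0.0, 0.1, 0.2, 0.6],
    [0.0, 0.0, 0.0, 0.0],
]
bounds = reach_avoid_lower_bound(p_low, goal={3}, unsafe={0}, horizon=2)
print(bounds)
```

Each entry of `bounds` is a certified lower bound for every state in the corresponding cell, provided the entries of `p_low` are themselves valid lower bounds. Policy synthesis in this simplified picture would amount to choosing, per cell and time step, the action whose `p_low` row maximizes the sum in the recursion.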