This paper proposes a sparse Bayesian treatment of deep neural networks (DNNs) for system identification. Although DNNs show impressive approximation ability in various fields, several challenges remain for system identification problems. First, DNNs are so complex that they can easily overfit the training data. Second, selecting the input regressors for system identification is nontrivial. Third, uncertainty quantification of the model parameters and predictions is necessary. The proposed Bayesian approach offers a principled way to alleviate these challenges by approximating the marginal likelihood (model evidence) and constructing structured group sparsity-inducing priors. The identification algorithm is derived as an iterative regularised optimisation procedure that can be solved as efficiently as training typical DNNs. Notably, an efficient recursive method for calculating the Hessian of each DNN layer is developed, turning the otherwise intractable training/optimisation process into a tractable one. Furthermore, a practical calculation approach based on Monte Carlo integration is derived to quantify the uncertainty of the parameters and predictions. The effectiveness of the proposed Bayesian approach is demonstrated on several linear and nonlinear system identification benchmarks, where it achieves competitive simulation accuracy. The code to reproduce the experimental results is open-sourced and available online.
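As a rough illustration of the Monte Carlo step mentioned above, the following minimal sketch estimates a predictive mean and standard deviation by sampling parameters from a Gaussian approximation to the posterior. It is not the paper's implementation: the function `mc_predictive`, the toy model `tiny_net`, and the posterior mean and covariance values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def mc_predictive(predict, w_mean, w_cov, x, n_samples=2000, seed=0):
    """Monte Carlo estimate of the predictive mean and standard deviation.

    Assumes (for illustration only) a Gaussian posterior approximation
    N(w_mean, w_cov) over the parameters.
    """
    rng = np.random.default_rng(seed)
    # Draw parameter samples from the assumed Gaussian posterior.
    samples = rng.multivariate_normal(w_mean, w_cov, size=n_samples)
    # Evaluate the model once per sample; rows index samples.
    preds = np.stack([predict(w, x) for w in samples])
    return preds.mean(axis=0), preds.std(axis=0)

def tiny_net(w, x):
    # Hypothetical one-hidden-unit network: y = w[1] * tanh(w[0] * x).
    return w[1] * np.tanh(w[0] * x)

x = np.linspace(-1.0, 1.0, 5)
w_mean = np.array([1.0, 2.0])      # assumed posterior mean
w_cov = np.diag([0.01, 0.04])      # assumed posterior covariance
mean, std = mc_predictive(tiny_net, w_mean, w_cov, x)
```

In this toy model the output at `x = 0` does not depend on the parameters, so the estimated predictive spread there is zero, while it grows away from the origin.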