We propose a general framework for machine-learning-based optimization under uncertainty. Our approach replaces the complex forward model with a surrogate, e.g., a neural network, which is learned simultaneously with the solution of the optimal control problem in a one-shot fashion. The approach relies on a reformulation of the problem as a penalized empirical risk minimization problem, for which we provide a consistency analysis in the limits of large data and increasing penalty parameter. To solve the resulting problem, we suggest a stochastic gradient method with adaptive control of the penalty parameter and prove its convergence under suitable assumptions on the surrogate model. Numerical experiments illustrate the results for linear and nonlinear surrogate models.
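To make the one-shot idea concrete, the following is a minimal sketch, not the paper's implementation: a control u and a linear surrogate A are optimized jointly by stochastic gradient descent on a penalized empirical risk, where the penalty enforces that the surrogate fits offline evaluations of the true forward model and its weight is increased over the iterations. The toy forward model, the quadratic tracking objective, the geometric penalty schedule, and all variable names are illustrative assumptions.

```python
# Sketch: one-shot control + surrogate learning via penalized empirical risk
# minimization with stochastic gradients and an increasing penalty parameter.
import numpy as np

rng = np.random.default_rng(0)

d_u, d_xi, d_y = 3, 2, 3            # control, uncertainty, and state dimensions
W = rng.normal(size=(d_y, d_u))      # hidden "true" forward model (toy example)
B = rng.normal(size=(d_y, d_xi))

def forward(u, xi):
    """True (expensive) forward model, used only to generate training data."""
    return np.tanh(W @ u) + B @ xi

# Offline data: sampled controls and uncertainties with forward-model outputs.
n_data = 200
U_data = rng.normal(size=(n_data, d_u))
Xi_data = rng.normal(size=(n_data, d_xi))
Y_data = np.array([forward(u, xi) for u, xi in zip(U_data, Xi_data)])

y_target = np.ones(d_y)              # tracking target of the control problem
alpha = 1e-2                         # control regularization weight

# One-shot variables: control u and surrogate parameters A, optimized jointly.
u = np.zeros(d_u)
A = 0.01 * rng.normal(size=(d_y, d_u + d_xi))   # linear surrogate y ≈ A [u; xi]
gamma = 1.0                          # penalty parameter, increased adaptively
step = 1e-2
batch = 16

for it in range(5000):
    # Minibatch of data for the penalty (empirical risk of the surrogate).
    idx = rng.integers(n_data, size=batch)
    Ub, Xib, Yb = U_data[idx], Xi_data[idx], Y_data[idx]
    # Fresh uncertainty samples for the stochastic objective.
    Xi_obj = rng.normal(size=(batch, d_xi))

    grad_u = alpha * u
    grad_A = np.zeros_like(A)
    for xi in Xi_obj:
        z = np.concatenate([u, xi])
        r = A @ z - y_target                   # tracking residual under the surrogate
        grad_u += (A[:, :d_u].T @ r) / batch
        grad_A += np.outer(r, z) / batch
    for uj, xij, yj in zip(Ub, Xib, Yb):
        zj = np.concatenate([uj, xij])
        rj = A @ zj - yj                       # surrogate misfit on the data
        grad_A += gamma * np.outer(rj, zj) / batch

    u -= step * grad_u
    A -= step * grad_A

    if it % 500 == 499:
        gamma *= 2.0                           # simple geometric penalty increase (illustrative)

print("control u:", u)
```

A neural-network surrogate would replace the linear map A by a parametrized network and the gradient with respect to A by backpropagation through the network parameters; the structure of the penalized objective and the penalty update is unchanged.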