This paper presents a novel deep learning framework for solving multiple optimal stopping problems in high dimensions. While deep learning has recently shown promise for single stopping problems, the multiple-exercise case involves recursive dependencies between the value functions for different numbers of remaining exercise rights, and it remains challenging. We address this by combining the Dynamic Programming Principle with a neural network approximation of the value function. Unlike policy-search methods, our algorithm explicitly learns the value surface. We first consider the discrete-time problem and analyze the neural network training error. We then turn to continuous-time problems and analyze the additional error due to the discretization of the underlying stochastic processes. Numerical experiments on high-dimensional American basket options and nonlinear utility maximization demonstrate that our method is efficient and scalable for the multiple optimal stopping problem.
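To make the recursive dependency concrete, the following is a minimal sketch of the dynamic programming recursion for a discrete-time multiple stopping problem. It is not the paper's algorithm: the conditional expectations are computed exactly on a one-dimensional binomial tree, where the paper would use a neural network approximation of the value function. The function name `multiple_stopping_value`, the put payoff, the tree parameters `up` and `p`, the absence of discounting, and the restriction to at most one exercise per date are all assumptions made for illustration.

```python
import numpy as np

def multiple_stopping_value(s0, strike, n_rights, n_steps, up=1.1, p=0.5):
    """Values at time 0 with 0, 1, ..., n_rights exercise rights remaining.

    Hypothetical setup: put payoff, binomial tree, no discounting,
    at most one exercise per date.
    """
    down = 1.0 / up

    def payoff_at(n):
        # Node j at time n has seen j up-moves and n - j down-moves.
        j = np.arange(n + 1)
        return np.maximum(strike - s0 * up**j * down**(n - j), 0.0)

    # V[k] is the value with k rights remaining on the current time layer.
    # At the final date, any holder with at least one right exercises once.
    terminal = payoff_at(n_steps)
    V = [np.zeros(n_steps + 1)] + [terminal.copy() for _ in range(n_rights)]

    for n in range(n_steps - 1, -1, -1):
        g = payoff_at(n)
        # Exact conditional expectation of next-layer values at each node.
        cont = [p * v[1:] + (1 - p) * v[:-1] for v in V]
        # Either exercise now and continue with k - 1 rights,
        # or wait and continue with k rights.
        V = [np.zeros(n + 1)] + [
            np.maximum(g + cont[k - 1], cont[k]) for k in range(1, n_rights + 1)
        ]

    return [float(v[0]) for v in V]

values = multiple_stopping_value(s0=100.0, strike=100.0, n_rights=3, n_steps=5)
print(values)
```

The value with k rights depends on the continuation value with k - 1 rights, so the layers must be solved jointly when stepping backward in time. In high dimensions the exact expectation in `cont` is not available, which is where a learned approximation of the value function would take its place.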