We characterize Schr\"odinger bridge problems by a family of McKean-Vlasov stochastic control problems with no terminal time distribution constraint. To do so, we use the theory of Hilbert space embeddings of probability measures and express the constraint as penalty terms, defined by the maximum mean discrepancy, in the control problems. The sequence of probability laws of the state processes resulting from $\epsilon$-optimal controls converges to the unique solution of Schr\"odinger's problem under mild conditions on the given initial and terminal time distributions and on the underlying diffusion process. We propose a neural SDE based deep learning algorithm for the McKean-Vlasov stochastic control problems. Several numerical experiments validate our methods.
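As a rough illustration of the kind of penalty involved, the sketch below estimates the squared maximum mean discrepancy between two sample sets with a Gaussian kernel. It is a minimal sketch under stated assumptions, not the authors' implementation: the kernel choice, the `bandwidth` parameter, and the function names `gaussian_kernel` and `mmd_squared` are all hypothetical and do not come from the abstract.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between rows of x (n, d) and rows of y (m, d).

    The Gaussian kernel and the bandwidth value are assumptions made for
    illustration only.
    """
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    terminal_samples = rng.normal(size=(200, 2))           # simulated state at terminal time
    target_samples = rng.normal(loc=3.0, size=(200, 2))    # samples from a target law
    print(mmd_squared(terminal_samples, target_samples))
```

In a penalized control objective of this type, a quantity like `mmd_squared` evaluated on simulated terminal states and target samples would be added to the running cost with a weight, so that the terminal law is pushed toward the target rather than constrained to equal it.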