We study stochastic optimization algorithms for constrained nonconvex problems with Markovian data. In particular, we focus on the case where the transition kernel of the Markov chain is state-dependent. Such problems arise in various machine learning settings, including strategic classification and reinforcement learning. For this problem class, we study both projection-based and projection-free algorithms. In both cases, we establish that the number of calls to the stochastic first-order oracle needed to obtain an appropriately defined $\epsilon$-stationary point is of order $\mathcal{O}(1/\epsilon^{2.5})$. In the projection-free setting, we additionally establish that the number of calls to the linear minimization oracle is of order $\mathcal{O}(1/\epsilon^{5.5})$. We also empirically demonstrate the performance of our algorithm on the problem of strategic classification with neural networks.