We study a projection-free conditional gradient-type algorithm for constrained nonconvex stochastic optimization problems with Markovian data. In particular, we focus on the case where the transition kernel of the Markov chain is state-dependent. Such stochastic optimization problems arise in various machine learning problems, including strategic classification and reinforcement learning. For this problem, we establish that the number of calls to the stochastic first-order oracle and to the linear minimization oracle required to obtain an appropriately defined $\epsilon$-stationary point are of order $\mathcal{O}(1/\epsilon^{2.5})$ and $\mathcal{O}(1/\epsilon^{5.5})$, respectively. We also empirically demonstrate the performance of our algorithm on the problem of strategic classification with neural networks.