We construct a zeroth-order gradient estimator for a smooth function defined on the probability simplex. The proposed estimator uses only function evaluations at points of the simplex. We prove that projected gradient descent and the exponential weights algorithm, when run with this estimator in place of exact gradients, converge at a $\mathcal O(T^{-1/4})$ rate.
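To make the setting concrete, here is a minimal sketch of one generic way such a method could look: a two-point finite-difference estimator along random directions with zero coordinate sum (so queries stay on the hyperplane $\sum_i x_i = 1$), driving an exponential weights update. This is an illustrative assumption, not the paper's actual construction; the function names, step sizes, and the boundary handling (queries near the simplex boundary may need a smaller smoothing radius) are all hypothetical choices.

```python
import numpy as np

def zo_gradient_simplex(f, x, delta=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate.

    Hypothetical sketch: draw a unit direction u with zero coordinate
    sum, so the query points x +/- delta*u remain on the affine hull
    of the simplex, and form a symmetric finite difference.
    """
    rng = np.random.default_rng(rng)
    n = x.size
    u = rng.standard_normal(n)
    u -= u.mean()                 # project onto the simplex's tangent space
    u /= np.linalg.norm(u)        # normalize to a unit direction
    return (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * n * u

def exponential_weights(f, n, T, eta=0.1, delta=1e-3, seed=0):
    """Exponential weights (entropic mirror descent) run with the
    zeroth-order estimate above instead of exact gradients."""
    rng = np.random.default_rng(seed)
    x = np.full(n, 1.0 / n)       # start at the simplex barycenter
    for _ in range(T):
        g = zo_gradient_simplex(f, x, delta, rng)
        w = x * np.exp(-eta * g)  # multiplicative update
        x = w / w.sum()           # renormalize onto the simplex
    return x
```

The multiplicative update keeps iterates strictly positive and normalized, so only the finite-difference queries can approach the boundary; shrinking `delta` (or mixing with the barycenter) would keep them feasible there.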