We present an algorithm for minimizing an objective with hard-to-compute gradients by using a related, easier-to-access function as a proxy. Our algorithm is based on approximate proximal point iterations on the proxy combined with relatively few stochastic gradients from the objective. When the difference between the objective and the proxy is $\delta$-smooth, our algorithm guarantees convergence at a rate matching stochastic gradient descent on a $\delta$-smooth objective, which can lead to substantially better sample efficiency. Our algorithm has many potential applications in machine learning, and provides a principled means of leveraging synthetic data, physics simulators, mixed public and private data, and more.
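To make the idea concrete, here is a minimal, hedged sketch of what "approximate proximal point iterations on the proxy, corrected by occasional objective gradients" could look like in code. This is not the paper's reference implementation; the function names, step sizes, and inner-loop budget (`objective_grad`, `proxy_grad`, `eta`, `inner_steps`, etc.) are illustrative assumptions. At each outer iteration, one (expensive, possibly stochastic) gradient of the objective $f$ is used to linearly correct a proximal subproblem on the proxy $h$, which is then solved approximately with many cheap proxy gradients.

```python
# Hedged sketch, not the authors' algorithm verbatim: one plausible realization of
# "approximate proximal point on a proxy with a bias-correcting objective gradient".
import numpy as np

def proxy_proximal_point(x0, objective_grad, proxy_grad,
                         outer_iters=100, inner_steps=50,
                         eta=0.1, inner_lr=0.01):
    """Minimize f using cheap gradients of a proxy h plus a few gradients of f.

    At each outer iteration we query one gradient of f at x_k, then approximately solve

        min_x  h(x) + <grad_f(x_k) - grad_h(x_k), x> + (1/(2*eta)) ||x - x_k||^2

    with a fixed number of cheap gradient steps on the proxy h.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(outer_iters):
        g_f = objective_grad(x)        # expensive / stochastic gradient of f
        g_h = proxy_grad(x)            # cheap gradient of the proxy h
        correction = g_f - g_h         # linear bias-correction term
        x_k, y = x, x.copy()
        for _ in range(inner_steps):   # approximate prox step using only proxy gradients
            grad_sub = proxy_grad(y) + correction + (y - x_k) / eta
            y = y - inner_lr * grad_sub
        x = y
    return x

if __name__ == "__main__":
    # Toy example: the objective is a quadratic, the proxy is a slightly perturbed
    # quadratic, so the difference f - h is smooth with a small constant.
    A = np.diag([10.0, 1.0])
    A_proxy = np.diag([9.5, 1.2])
    f_grad = lambda x: A @ x
    h_grad = lambda x: A_proxy @ x
    x_out = proxy_proximal_point(np.array([5.0, -3.0]), f_grad, h_grad)
    print(x_out)  # should approach the minimizer at the origin
```

In this toy setting the inner loop consumes only proxy gradients, so the number of objective-gradient queries equals the number of outer iterations, mirroring the sample-efficiency argument: when $f - h$ is $\delta$-smooth with small $\delta$, few outer corrections are needed.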