We study discrete-time mirror descent applied to the unregularized empirical risk in matrix sensing. In both the general case of rectangular matrices and the particular case of positive semidefinite matrices, a simple potential-based analysis in terms of the Bregman divergence allows us to establish convergence of mirror descent -- with different choices of the mirror maps -- to a matrix that, among all global minimizers of the empirical risk, minimizes a quantity explicitly related to the nuclear norm, the Frobenius norm, and the von Neumann entropy. In both cases, this characterization implies that mirror descent, a first-order algorithm minimizing the unregularized empirical risk, recovers low-rank matrices under the same set of assumptions that are sufficient to guarantee recovery for nuclear-norm minimization. When the sensing matrices are symmetric and commute, we show that gradient descent with full-rank factorized parametrization is a first-order approximation to mirror descent, in which case we obtain an explicit characterization of the implicit bias of gradient flow as a by-product.
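As an illustration of the positive semidefinite case, the following is a minimal sketch of discrete-time mirror descent on the unregularized empirical risk, assuming the mirror map is the von Neumann (spectral) entropy $\Phi(X) = \operatorname{tr}(X \log X - X)$, so that the update reads $\log X_{t+1} = \log X_t - \eta \nabla L(X_t)$. The function names `sym_fn` and `mirror_descent_psd`, the parameters `alpha`, `eta`, and `steps`, the $1/(2m)$ scaling of the squared loss, and the initialization $X_0 = \alpha I$ are assumptions made for this example and are not specified in the abstract.

```python
import numpy as np


def sym_fn(X, fn):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(X)
    return (V * fn(w)) @ V.T


def mirror_descent_psd(A, y, alpha=1e-3, eta=0.1, steps=2000):
    """Mirror descent with an entropy mirror map on a matrix sensing loss.

    A: array of shape (m, n, n) holding symmetric sensing matrices.
    y: array of shape (m,) holding the measurements <A_i, X*>.
    Loss (assumed scaling): L(X) = 1/(2m) * sum_i (<A_i, X> - y_i)^2.
    """
    m, n, _ = A.shape
    # Iterate in the dual (log) domain; X_0 = alpha * I is an assumed initialization.
    log_X = np.log(alpha) * np.eye(n)
    for _ in range(steps):
        X = sym_fn(log_X, np.exp)                      # map back to the primal iterate
        residual = np.einsum("kij,ij->k", A, X) - y    # <A_i, X> - y_i
        grad = np.einsum("k,kij->ij", residual, A) / m # gradient of L at X
        log_X = log_X - eta * grad                     # mirror step: grad Phi(X) = log X
    return sym_fn(log_X, np.exp)
```

Because the iterate is always the matrix exponential of a symmetric matrix, it stays positive definite without any projection step. The rectangular case would need a different mirror map and is not covered by this sketch.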