Randomized trace estimation is a popular and well-studied technique that approximates the trace of a large-scale matrix $B$ by computing the average of $x^T B x$ over many samples of a random vector $x$. Often, $B$ is symmetric positive definite (SPD), but a number of applications give rise to indefinite $B$. Most notably, this is the case for log-determinant estimation, a task that features prominently in statistical learning, for instance in maximum likelihood estimation for Gaussian process regression. The analysis of randomized trace estimates, including tail bounds, has mostly focused on the SPD case. In this work, we derive new tail bounds for randomized trace estimates applied to indefinite $B$ with Rademacher or Gaussian random vectors. These bounds significantly improve existing results for indefinite $B$, reducing the number of required samples by a factor of $n$ or even more, where $n$ is the size of $B$. Even for an SPD matrix, our work improves an existing result by Roosta-Khorasani and Ascher for Rademacher vectors. This work also analyzes the combination of randomized trace estimates with the Lanczos method for approximating the trace of $f(A)$. Particular attention is paid to the matrix logarithm, which is needed for log-determinant estimation. We improve and extend an existing result to cover not only Rademacher but also Gaussian random vectors.
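
To make the basic estimator concrete, the following is a minimal NumPy sketch of averaging $x^T B x$ over Rademacher or Gaussian sample vectors. It is an illustration only, not the authors' implementation: the function name `hutchinson_trace`, its parameters, and the random test matrix are assumptions introduced here, and the sketch does not include the Lanczos step or the tail bounds discussed in the abstract.

```python
import numpy as np

def hutchinson_trace(B, num_samples, dist="rademacher", seed=0):
    """Estimate tr(B) as the average of x^T B x over random vectors x.

    Hypothetical helper for illustration; B is a dense square array here.
    """
    rng = np.random.default_rng(seed)
    n = B.shape[0]
    if dist == "rademacher":
        # Entries are +1 or -1 with equal probability.
        X = rng.choice([-1.0, 1.0], size=(n, num_samples))
    elif dist == "gaussian":
        # Entries are independent standard normal.
        X = rng.standard_normal((n, num_samples))
    else:
        raise ValueError("dist must be 'rademacher' or 'gaussian'")
    # One quadratic form x_j^T B x_j per column x_j of X.
    quad_forms = np.einsum("ij,ij->j", X, B @ X)
    return quad_forms.mean()

# Symmetric indefinite test matrix (made-up example data).
rng = np.random.default_rng(1)
M = rng.standard_normal((200, 200))
B = (M + M.T) / 2

exact = np.trace(B)
estimate = hutchinson_trace(B, num_samples=500)
print(f"exact trace: {exact:.3f}, estimate: {estimate:.3f}")
```

For a diagonal matrix, the Rademacher variant returns the exact trace for any number of samples, since every entry of $x$ squares to one; the error of the estimate comes entirely from the off-diagonal entries of $B$.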