This work proposes a quasirandom sequence of quadratures for high-dimensional mean-field variational inference and a related sparsifying methodology. Each iterate of the sequence contains two evaluation points that together integrate all univariate quadratic functions exactly, as well as univariate cubics if the mean-field factors are symmetric. More importantly, averaging results over short subsequences achieves periodic exactness on a much larger space of multivariate polynomials of quadratic total degree. This framework is devised by first considering stochastic blocked mean-field quadratures, which may be useful in other contexts. By replacing pseudorandom sequences with quasirandom sequences, over half of all multivariate quadratic basis functions integrate exactly with only 4 function evaluations, and the exactness dimension increases for longer subsequences. Analysis shows how these efficient integrals characterize the dominant log-posterior contributions to mean-field variational approximations, including diagonal Hessian approximations, to support a robust sparsifying methodology in deep learning algorithms. A numerical demonstration of this approach on a simple Convolutional Neural Network for MNIST retains high test accuracy, 96.9%, while training over 98.9% of parameters to zero in only 10 epochs, which could reduce both storage and energy requirements for deep learning models.
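The exactness claims above can be illustrated with a minimal sketch. It assumes Gaussian mean-field factors with means `mu` and standard deviations `sigma`, takes each iterate to be the antithetic pair `mu ± sigma * signs`, and uses a hypothetical bit-based sign pattern (`bit_signs`) as a stand-in for the quasirandom sequence. None of these names or choices come from the paper, and the actual construction may differ.

```python
import itertools
import numpy as np


def bit_signs(dim, k):
    """Hypothetical sign pattern: (-1) ** (k-th bit of each coordinate index)."""
    idx = np.arange(dim)
    return 1.0 - 2.0 * ((idx >> k) & 1)


def pair_average(f, mu, sigma, signs):
    """One iterate: average f over the two points mu + sigma*signs and mu - sigma*signs."""
    step = sigma * signs
    return 0.5 * (f(mu + step) + f(mu - step))


def sequence_average(f, mu, sigma, num_patterns):
    """Average the two-point rule over the first num_patterns sign patterns."""
    dim = mu.size
    vals = [pair_average(f, mu, sigma, bit_signs(dim, k)) for k in range(num_patterns)]
    return float(np.mean(vals))


def count_exact_cross_terms(mu, sigma, num_patterns):
    """Count pairs i < j whose cross term x_i * x_j is integrated exactly.

    For independent factors the exact expectation is mu_i * mu_j.
    """
    count = 0
    for i, j in itertools.combinations(range(mu.size), 2):
        est = sequence_average(lambda x: x[i] * x[j], mu, sigma, num_patterns)
        if np.isclose(est, mu[i] * mu[j]):
            count += 1
    return count


# Illustrative values only (not taken from the paper).
mu = np.array([0.5, -1.0, 2.0, 0.0])
sigma = np.array([1.0, 0.5, 2.0, 1.5])

# Univariate quadratic and cubic in coordinate 0, one iterate (2 evaluations).
quad = pair_average(lambda x: x[0] ** 2, mu, sigma, bit_signs(mu.size, 0))
cubic = pair_average(lambda x: x[0] ** 3, mu, sigma, bit_signs(mu.size, 0))

# Cross terms after one iterate (2 evaluations) and two iterates (4 evaluations).
exact_after_one = count_exact_cross_terms(mu, sigma, 1)
exact_after_two = count_exact_cross_terms(mu, sigma, 2)
```

In this toy setting a single iterate reproduces the Gaussian moments `mu**2 + sigma**2` and `mu**3 + 3*mu*sigma**2` for the univariate terms, while cross terms `x_i * x_j` become exact only once the sign products of the two patterns cancel. With four coordinates and two patterns, four of the six cross terms cancel, which illustrates how averaging over short subsequences can enlarge the set of exactly integrated quadratics.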