Model reparametrization, i.e., transforming the parameter space via a bijective differentiable map, is a popular way to improve the training of neural networks. But reparametrizations have also been problematic since they induce inconsistencies in, e.g., Hessian-based flatness measures, optimization trajectories, and modes of probability density functions. This complicates downstream analyses: for instance, one cannot make a definitive statement about the connection between flatness and generalization. In this work, we study the invariant quantities of neural nets under reparametrization from the perspective of Riemannian geometry. We show that this notion of invariance is an inherent property of any neural net, as long as one acknowledges the assumption about the metric, which is always present albeit often only implicitly, and uses the correct transformation rules under reparametrization. We discuss implications for measuring the flatness of minima, for optimization, and for probability-density maximization, along with applications in studying the biases of optimizers and in Bayesian inference.
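As an illustrative sketch of the transformation rule referred to above (standard Riemannian-geometric notation, not necessarily the paper's exact formulation): if $\theta = \varphi(\psi)$ is a bijective differentiable reparametrization with Jacobian $J = \partial\theta / \partial\psi$, then a metric $G(\theta)$ on the original parameter space transforms as a covariant 2-tensor, while tangent vectors transform contravariantly, so metric-weighted quantities are unchanged:
\[
  \tilde{G}(\psi) = J^\top G\bigl(\varphi(\psi)\bigr)\, J,
  \qquad
  \tilde{v} = J^{-1} v
  \;\Longrightarrow\;
  \tilde{v}^\top \tilde{G}(\psi)\,\tilde{v}
  = v^\top G(\theta)\, v .
\]
By contrast, the plain Hessian $\nabla^2_\theta \mathcal{L}$ does not transform this way (it picks up extra first-derivative terms under a nonlinear $\varphi$), which is one way to see why Hessian-based flatness measures alone are not reparametrization-invariant.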