Recent work has shown that automatic differentiation over the reals is almost always correct in a mathematically precise sense. However, actual programs work with machine-representable numbers (e.g., floating-point numbers), not reals. In this paper, we study the correctness of automatic differentiation when the parameter space of a neural network consists solely of machine-representable numbers. For a neural network with bias parameters, we prove that automatic differentiation is correct at all parameters where the network is differentiable. In contrast, it is incorrect at all parameters where the network is non-differentiable, since it never informs of this non-differentiability. To better understand this non-differentiable set of parameters, we prove a tight bound on its size, which is linear in the number of non-differentiable points of the activation functions, and we give a simple necessary and sufficient condition for a parameter to lie in this set. We further prove that automatic differentiation always computes a Clarke subderivative, even on this non-differentiable set. We also extend these results to neural networks possibly without bias parameters.
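To make the Clarke-subderivative claim concrete, here is a minimal illustration (our own sketch in JAX, not taken from the paper): at the non-differentiable point x = 0 of ReLU, AD silently returns a value without reporting the non-differentiability, yet the returned value lies in the Clarke subdifferential [0, 1] of ReLU at 0.

```python
import jax

relu = jax.nn.relu        # ReLU: differentiable everywhere except at 0
grad_relu = jax.grad(relu)

print(grad_relu(1.0))     # 1.0 -- the true derivative
print(grad_relu(-1.0))    # 0.0 -- the true derivative
print(grad_relu(0.0))     # 0.0 -- ReLU is non-differentiable here, yet AD
                          #        returns a value without any warning; that
                          #        value lies in the Clarke subdifferential
                          #        [0, 1] of ReLU at 0.
```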