We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach to gradient-based learning has examples generalising beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realised in the discrete setting of Boolean circuits. Finally, we demonstrate the practical significance of our framework with an implementation in Python.
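As a minimal illustration of the lens viewpoint described above (the names `Lens`, `linear`, and `sgd_step` are assumptions for this sketch, not identifiers from the paper's implementation): a lens pairs a forward map ("get") with a backward map ("put") that carries an output gradient back to gradients on the parameter and the input, and gradient descent is then a update rule driven by the backward map.

```python
# Illustrative sketch only: a lens as a (forward, backward) pair,
# with gradient descent performed through the backward map.

class Lens:
    def __init__(self, get, put):
        self.get = get   # forward: (param, x) -> y
        self.put = put   # backward: (param, x, dy) -> (dparam, dx)

# Lens for a one-parameter linear model y = w * x.
linear = Lens(
    get=lambda w, x: w * x,
    put=lambda w, x, dy: (dy * x, dy * w),  # (dL/dw, dL/dx)
)

def sgd_step(lens, w, x, y_target, lr=0.1):
    """One plain gradient-descent step through the lens's backward map,
    using the MSE loss 0.5 * (y - y_target) ** 2."""
    y = lens.get(w, x)
    dy = y - y_target            # dL/dy for MSE
    dw, _ = lens.put(w, x, dy)
    return w - lr * dw

w = 0.0
for _ in range(100):
    w = sgd_step(linear, w, x=2.0, y_target=6.0)
# w converges toward 3.0, where w * 2.0 == 6.0
```

The other optimisers mentioned in the abstract (ADAM, AdaGrad, Nesterov momentum) differ only in how the update rule consumes the gradient that the backward map produces; the lens structure itself is unchanged.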