In today's ML, data can be twisted (changed) in various ways, with either good or bad intent. Such twisted data challenges the founding theory of properness for supervised losses, which underpins many popular losses for class probability estimation. Unfortunately, at its core, properness ensures that the optimal models also learn the twist. In this paper, we analyse such class probability-based losses when they are stripped of the mandatory properness; we define twist-proper losses as losses formally able to recover the optimal (untwisted) estimate despite the twists, and show that a natural extension of a half-century-old loss introduced by S. Arimoto is twist-proper. We then turn to boosting, a theory that has provided some of the best off-the-shelf algorithms for proper losses. Boosting can require access to the derivative of the convex conjugate of a loss to compute example weights. Such a function can be hard to obtain, for computational or mathematical reasons; this turns out to be the case for Arimoto's loss. We bypass this difficulty by inverting the problem: suppose a blueprint boosting algorithm is implemented with a general weight-update function. For which losses does boosting-compliant minimisation then happen? Our answer comes as a general boosting algorithm which meets the optimal boosting dependence on the number of calls to the weak learner; when applied to Arimoto's loss, it yields a simple optimisation algorithm whose performance is showcased on several domains and twists.
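For concreteness, a minimal sketch of the loss the abstract alludes to, assuming the standard $\alpha$-loss parametrisation of Arimoto's loss found in the class probability estimation literature (the paper's "natural extension" may differ): with $\hat p_y$ the probability assigned to the true class $y$,
\[
\ell_\alpha(y,\hat p) \;=\; \frac{\alpha}{\alpha-1}\Big(1 - \hat p_y^{\,(\alpha-1)/\alpha}\Big), \qquad \alpha \in (0,\infty],
\]
which recovers the (proper) log loss $-\log \hat p_y$ in the limit $\alpha \to 1$ and the soft 0-1 loss $1 - \hat p_y$ as $\alpha \to \infty$; tuning $\alpha$ away from $1$ is what trades properness for robustness to twists.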