A Tunable Loss Function for Robust Classification: Calibration, Landscape, and Generalization

We introduce a tunable loss function for classification called $\alpha$-loss, parameterized by $\alpha \in (0,\infty]$, which interpolates between the exponential loss ($\alpha = 1/2$), the log-loss ($\alpha = 1$), and the 0-1 loss ($\alpha = \infty$). Theoretically, we illustrate a fundamental connection between $\alpha$-loss and Arimoto conditional entropy, verify the classification-calibration of $\alpha$-loss in order to demonstrate asymptotic optimality via Rademacher complexity generalization techniques, and build upon a notion called strictly local quasi-convexity in order to quantitatively characterize the optimization landscape of $\alpha$-loss. Practically, we perform class imbalance, robustness, and classification experiments on benchmark image datasets using convolutional neural networks. Our main practical conclusion is that certain tasks may benefit from tuning $\alpha$-loss away from log-loss ($\alpha = 1$), and to this end we provide simple heuristics for the practitioner. In particular, tuning the $\alpha$ hyperparameter can readily provide superior model robustness to label flips ($\alpha > 1$) and sensitivity to imbalanced classes ($\alpha < 1$).
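
The interpolation described in the abstract can be checked numerically. The abstract does not state the formula, so the sketch below assumes the closed form used in the $\alpha$-loss literature: $\ell_\alpha(p) = \frac{\alpha}{\alpha-1}\left(1 - p^{(\alpha-1)/\alpha}\right)$ for $\alpha \neq 1$, where $p$ is the probability the model assigns to the true label, with $-\log p$ at $\alpha = 1$ and $1 - p$ at $\alpha = \infty$. The names `alpha_loss` and `sigmoid` are hypothetical helpers for illustration, not code from the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued margin to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def alpha_loss(p_true: float, alpha: float) -> float:
    """Assumed closed form of alpha-loss for probability p_true on the true label."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    if alpha <= 0.0:
        raise ValueError("alpha must lie in (0, inf]")
    if math.isinf(alpha):
        # alpha = inf: a soft version of the 0-1 loss
        return 1.0 - p_true
    if alpha == 1.0:
        # alpha = 1: log-loss (the continuous limit of the general formula)
        return -math.log(p_true)
    return alpha / (alpha - 1.0) * (1.0 - p_true ** ((alpha - 1.0) / alpha))


if __name__ == "__main__":
    z = 1.3  # an arbitrary margin, chosen only for illustration
    # With a sigmoid output, alpha = 1/2 reduces to the exponential loss exp(-z).
    print(alpha_loss(sigmoid(z), 0.5), math.exp(-z))
    # Values of alpha near 1 approach the log-loss.
    print(alpha_loss(0.3, 1.0001), -math.log(0.3))
    # Very large alpha approaches 1 - p.
    print(alpha_loss(0.3, 1e6), 1.0 - 0.3)
```

Under this assumed form, the loss on a confidently wrong prediction ($p \to 0$) stays below $\alpha/(\alpha-1)$ when $\alpha > 1$, while the log-loss grows without bound. That is one plausible intuition for the robustness to label flips reported for $\alpha > 1$, though the abstract itself does not give this argument.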


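Related content

Loss function (machine learning)

A loss function, sometimes also called a distance function or metric function in AI, measures error; "distance" here is meant abstractly, as the discrepancy between the true data and the predicted data. The loss function estimates how far the model's prediction f(x) is from the true value Y. It is a non-negative real-valued function, usually written L(Y, f(x)), and a smaller loss generally indicates that the model fits the data better. The loss function is the core of the empirical risk function and an important component of the structural risk function.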
