We formalize and study the natural approach of designing convex surrogate loss functions via embeddings, for problems such as classification, ranking, or structured prediction. In this approach, one embeds each of the finitely many predictions (e.g.\ rankings) as a point in $\mathbb{R}^d$, assigns the original loss values to these points, and "convexifies" the loss in some way to obtain a surrogate. We establish a strong connection between this approach and polyhedral (piecewise-linear convex) surrogate losses. Given any polyhedral loss $L$, we give a construction of a link function through which $L$ is a consistent surrogate for the loss it embeds. Conversely, we show how to construct a consistent polyhedral surrogate for any given discrete loss. Our framework yields succinct proofs of consistency or inconsistency of various polyhedral surrogates in the literature, and for inconsistent surrogates, it further reveals the discrete losses for which these surrogates are consistent. We show some additional structure of embeddings, such as the equivalence of embedding and matching Bayes risks, and the equivalence of various notions of non-redundancy. Using these results, we establish that indirect elicitation, a necessary condition for consistency, is also sufficient when working with polyhedral surrogates.
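As a small illustration of the embedding idea, the sketch below checks numerically that the binary hinge loss, a polyhedral surrogate, takes the values of twice the 0-1 loss at the embedded points $\pm 1$, and that its expected-loss minimizer maps back to the correct label through a sign link. The choice of hinge loss, the embedding $\{-1 \mapsto -1,\ +1 \mapsto +1\}$, the sign link, and all function names are assumptions made for this example; they are not taken from the text above.

```python
import numpy as np

# Illustrative assumption: binary labels y in {-1, +1}, surrogate reports u in R.
LABELS = (-1, 1)

def hinge(u, y):
    """Polyhedral (piecewise-linear convex) surrogate: max(0, 1 - u*y)."""
    return np.maximum(0.0, 1.0 - u * y)

def zero_one(r, y):
    """Discrete target loss: 1 if the prediction r differs from y, else 0."""
    return float(r != y)

# Hypothetical embedding of the two discrete predictions as points in R^1.
EMBEDDING = {-1: -1.0, 1: 1.0}

def link(u):
    """Sign link mapping a surrogate report back to a discrete prediction."""
    return 1 if u >= 0 else -1

def losses_match_at_embedded_points():
    """Check that hinge equals 2 * (0-1 loss) at every embedded point."""
    return all(
        hinge(EMBEDDING[r], y) == 2 * zero_one(r, y)
        for r in LABELS
        for y in LABELS
    )

GRID = np.linspace(-2.0, 2.0, 81)  # candidate surrogate reports

def expected_hinge(p):
    """Expected hinge loss over GRID when P(y = +1) = p."""
    return p * hinge(GRID, 1) + (1 - p) * hinge(GRID, -1)

def surrogate_minimizer(p):
    """Grid point minimizing the expected hinge loss."""
    return float(GRID[np.argmin(expected_hinge(p))])

if __name__ == "__main__":
    print(losses_match_at_embedded_points())
    for p in (0.2, 0.8):
        u_star = surrogate_minimizer(p)
        print(p, u_star, link(u_star), expected_hinge(p).min())
```

In this sketch the minimal expected hinge loss equals $2\min(p, 1-p)$, i.e. twice the Bayes risk of 0-1 loss, which is the kind of Bayes-risk matching the abstract refers to; the grid search is only a numerical stand-in for an exact argument.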