Extreme multi-label classification (XML) involves tagging a data point with its most relevant subset of labels from an extremely large label set, with applications such as product-to-product recommendation over millions of products. Although leading XML algorithms scale to millions of labels, they largely ignore label metadata such as textual descriptions of the labels. Conversely, classical techniques that can exploit label metadata through representation learning with deep networks struggle in extreme settings. This paper develops the DECAF algorithm, which addresses these challenges by learning models enriched by label metadata; these models jointly learn model parameters and feature representations using deep networks and offer accurate classification at the scale of millions of labels. DECAF makes specific contributions to model architecture design, initialization, and training, enabling it to offer predictions that are up to 2-6% more accurate than those of leading extreme classifiers on publicly available benchmark product-to-product recommendation datasets, such as LF-AmazonTitles-1.3M. At the same time, DECAF was found to be up to 22x faster at inference than leading deep extreme classifiers, which makes it suitable for real-time applications that require predictions within a few milliseconds. The code for DECAF is available at https://github.com/Extreme-classification/DECAF.
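To make the idea of label-metadata-enriched classifiers concrete, the following is a minimal sketch and not DECAF's actual architecture, which the abstract does not spell out. It assumes, purely for illustration, that each label's classifier is the sum of an embedding of the label's text and a free per-label vector, and that documents and labels share one bag-of-words encoder. All names (`embed_text`, `build_label_classifiers`, `predict_top_k`, `refinement`) and the toy data are hypothetical.

```python
import numpy as np


def embed_text(token_ids, token_emb):
    """Encode a text as the mean of its token embeddings (assumed encoder)."""
    return token_emb[token_ids].mean(axis=0)


def build_label_classifiers(label_tokens, token_emb, refinement):
    """Build one classifier vector per label.

    Assumption for illustration: classifier = label-text embedding
    plus a free per-label vector that training could adjust.
    """
    text_part = np.stack([embed_text(t, token_emb) for t in label_tokens])
    return text_part + refinement


def predict_top_k(doc_tokens, token_emb, classifiers, k):
    """Score every label for one document and return the k best."""
    x = embed_text(doc_tokens, token_emb)
    scores = classifiers @ x          # one dot product per label
    top = np.argsort(-scores)[:k]     # indices of the k highest scores
    return top, scores[top]


# Toy setup: random token embeddings and four labels with short texts.
rng = np.random.default_rng(0)
vocab_size, dim = 50, 8
token_emb = rng.normal(size=(vocab_size, dim))
label_tokens = [[1, 2, 3], [4, 5], [6, 7, 8], [1, 9]]

# Untrained per-label vectors start at zero, so each classifier
# initially equals the embedding of its label text.
refinement = np.zeros((len(label_tokens), dim))
classifiers = build_label_classifiers(label_tokens, token_emb, refinement)

top, top_scores = predict_top_k([1, 2, 3], token_emb, classifiers, k=2)
print(top, top_scores)
```

The sketch scores every label by brute force, which is only workable for a toy label set; at the scale of millions of labels a full scan per query would be too slow for the millisecond budgets mentioned above.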