Scalability and accuracy are well-recognized challenges in deep extreme multi-label learning, where the objective is to train architectures that automatically annotate a data point with the most relevant subset of labels from an extremely large label set. This paper develops the DeepXML framework, which addresses these challenges by decomposing the deep extreme multi-label task into four simpler sub-tasks, each of which can be trained accurately and efficiently. Choosing different components for the four sub-tasks allows DeepXML to generate a family of algorithms with varying trade-offs between accuracy and scalability. In particular, DeepXML yields the Astec algorithm, which could be 2-12% more accurate and 5-30x faster to train than leading deep extreme classifiers on publicly available short-text datasets. Astec could also train efficiently on Bing short-text datasets containing up to 62 million labels while making predictions for billions of users and data points per day on commodity hardware. This allowed Astec to be deployed on the Bing search engine for a number of short-text applications, ranging from matching user queries to advertiser bid phrases to showing personalized ads, where it yielded significant gains in click-through rates, coverage, revenue, and other online metrics over state-of-the-art techniques currently in production. DeepXML's code is available at https://github.com/Extreme-classification/deepxml