Binary Code based Hash Embedding for Web-scale Applications - Zhuanzhi Papers


August 24, 2021

Binary Code based Hash Embedding for Web-scale Applications


Bencheng Yan, Pengjie Wang, Jinquan Liu, Wei Lin, Kuang-Chih Lee, Jian Xu, Bo Zheng

From arXiv. CIKM 2021, 5 pages. The first two authors contributed equally to this work.

Deep learning models are now widely adopted in web-scale applications such as recommender systems and online advertising. In these applications, embedding learning of categorical features is crucial to the success of deep learning models. The standard method assigns each categorical feature value a unique embedding vector that can be learned and optimized. Although this method captures the characteristics of the categorical features well and promises good performance, it can incur a huge memory cost to store the embedding table, especially in web-scale applications. This memory cost significantly holds back the effectiveness and usability of embedding-based deep recommendation models (EDRMs). In this paper, we propose a binary code based hash embedding method that allows the size of the embedding table to be reduced at an arbitrary scale without compromising too much performance. Experimental results show that, with the proposed method, one can still achieve 99% performance even when the embedding table is 1000× smaller than the original one.
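The abstract does not describe the mechanism, so the following is only a minimal sketch of one way a binary code could index a much smaller embedding table. It is not the authors' implementation. It assumes that the binary code of a feature ID is cut into fixed-width blocks, that each block indexes its own small table, and that the block vectors are summed. The names `binary_code_blocks` and `BlockHashEmbedding`, the per-block tables, the summation, and all sizes are hypothetical choices made for illustration.

```python
import numpy as np


def binary_code_blocks(feature_id: int, num_bits: int, block_bits: int) -> list[int]:
    """Cut the low `num_bits` bits of an ID into consecutive blocks.

    Blocks are returned lowest-order first, each as a small integer index.
    """
    if num_bits % block_bits != 0:
        raise ValueError("num_bits must be a multiple of block_bits")
    mask = (1 << block_bits) - 1
    return [(feature_id >> shift) & mask for shift in range(0, num_bits, block_bits)]


class BlockHashEmbedding:
    """Illustrative embedding built from binary-code blocks (assumed design)."""

    def __init__(self, num_bits: int, block_bits: int, dim: int, seed: int = 0):
        self.num_bits = num_bits
        self.block_bits = block_bits
        self.num_blocks = num_bits // block_bits
        rng = np.random.default_rng(seed)
        # Assumption: one table per block position, each with 2**block_bits rows.
        self.tables = rng.normal(
            scale=0.01, size=(self.num_blocks, 1 << block_bits, dim)
        )

    @property
    def num_rows(self) -> int:
        """Total number of stored embedding rows across all block tables."""
        return self.num_blocks * (1 << self.block_bits)

    def lookup(self, feature_id: int) -> np.ndarray:
        """Sum the block vectors selected by the ID's binary code."""
        blocks = binary_code_blocks(feature_id, self.num_bits, self.block_bits)
        return sum(self.tables[i, b] for i, b in enumerate(blocks))


# Hypothetical sizes: 20-bit IDs (about one million values), 5-bit blocks.
emb = BlockHashEmbedding(num_bits=20, block_bits=5, dim=16)
full_rows = 1 << 20          # rows in a one-vector-per-ID table
compression = full_rows // emb.num_rows
vector = emb.lookup(123_456)
```

Under these assumed sizes the sketch stores 4 × 32 = 128 rows instead of 1,048,576, and the block width is the knob that trades table size against capacity. Two different IDs always differ in at least one block, so they select different combinations of rows even though rows are shared.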

