遮盖矢量量化 (Masked Vector Quantization) - 专知论文

会员服务 ·

0

向量化 · 可约的 · 掩码 · 词元分析器 · 示例 ·

2023 年 1 月 16 日

Masked Vector Quantization

翻译：遮盖矢量量化

David D. Nguyen,David Leibowitz,Surya Nepal,Salil S. Kanhere

from arxiv, Main paper (10 pages), References (2 pages), Appendix (7 pages)

Generative models with discrete latent representations have recently demonstrated an impressive ability to learn complex high-dimensional data distributions. However, their performance relies on a long sequence of tokens per instance and a large number of codebook entries, resulting in long sampling times and considerable computation to fit the categorical posterior. To address these issues, we propose the Masked Vector Quantization (MVQ) framework which increases the representational capacity of each code vector by learning mask configurations via a stochastic winner-takes-all training regime called Multiple Hypothese Dropout (MH-Dropout). On ImageNet 64$\times$64, MVQ reduces FID in existing vector quantization architectures by up to $68\%$ at 2 tokens per instance and $57\%$ at 5 tokens. These improvements widen as codebook entries is reduced and allows for $7\textit{--}45\times$ speed-up in token sampling during inference. As an additional benefit, we find that smaller latent spaces lead to MVQ identifying transferable visual representations where multiple can be smoothly combined.

翻译：具有离散潜表层的生成模型最近表明,它们具有学习复杂高维数据分布的令人印象深刻的能力,然而,它们的性能取决于一例中一系列的象征性和大量代码簿条目,从而导致长时间的取样时间和大量计算来适应绝对的后背体。为了解决这些问题,我们提议了蒙面矢量量定量化框架,通过一个叫做多孔化赢家取胜全培训制度(MH-Dropout)的学习掩码配置来提高每个代号矢量的代表性能力。在图像Net 64$\time diskout (MH-Dropout)上,MVQ将现有矢量定量结构中的FID减少最多68$(每例2个象征性)和57$(5个符号),这些改进随着代码簿条目的减少而扩大,并允许在推断过程中将7\text{-}45\timetis快速的象征性取样。我们发现,一个额外的好处是,较小的潜在空间导致MVQ确定可转让的视觉图象显示,在多点上可以顺利合并。

0

相关内容

向量化

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

多标签学习的新趋势（2020 Survey）

多标签学习的新趋势（2020 Survey）

专知会员服务

44+阅读 · 2020年12月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

机器学习研究会

13+阅读 · 2017年12月25日

樟疫霉致病性相关GPCR-PIPK鉴定与机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

非局部总变差正则化图像恢复模型的快速子空间校正算法

国家自然科学基金

0+阅读 · 2014年12月31日

调控巨噬细胞极化的microRNA分子鉴定及其在CVB3诱导的急性病毒性心肌炎中的作用

国家自然科学基金

0+阅读 · 2014年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

microRNA调节肿瘤抑制因子Caliban应答DNA损伤的机制

国家自然科学基金

1+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

EV71病毒非结构蛋白2A的功能及作用机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

新型核苷的分子设计、合成和抗乙肝病毒的活性研究

国家自然科学基金

0+阅读 · 2011年12月31日

Adiponectin在肝脏缺血再灌注损伤中的抗肝细胞凋亡机制

国家自然科学基金

0+阅读 · 2009年12月31日

多酸-配位聚合物互为主客体的结构化学

国家自然科学基金

0+阅读 · 2008年12月31日

QVRF: A Quantization-error-aware Variable Rate Framework for Learned Image Compression

Arxiv

0+阅读 · 2023年3月10日

Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking

Arxiv

0+阅读 · 2023年3月9日

DREAM: Efficient Dataset Distillation by Representative Matching

Arxiv

0+阅读 · 2023年3月9日

Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds

Arxiv

0+阅读 · 2023年3月9日

Masked Image Modeling with Local Multi-Scale Reconstruction

Arxiv

0+阅读 · 2023年3月9日

SE(3)-DiffusionFields: Learning smooth cost functions for joint grasp and motion optimization through diffusion

Arxiv

0+阅读 · 2023年3月9日

A Light Weight Model for Active Speaker Detection

Arxiv

0+阅读 · 2023年3月8日

Neural Vector Fields: Implicit Representation by Explicit Learning

Arxiv

0+阅读 · 2023年3月8日

Masked Autoencoders Are Scalable Vision Learners

Arxiv

27+阅读 · 2021年11月11日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

VIP会员

文章信息

相关主题

词元分析器

相关VIP内容

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

多标签学习的新趋势（2020 Survey）

多标签学习的新趋势（2020 Survey）

专知会员服务

44+阅读 · 2020年12月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

机器学习研究会

13+阅读 · 2017年12月25日

相关论文

QVRF: A Quantization-error-aware Variable Rate Framework for Learned Image Compression

Arxiv

0+阅读 · 2023年3月10日

Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking

Arxiv

0+阅读 · 2023年3月9日

DREAM: Efficient Dataset Distillation by Representative Matching

Arxiv

0+阅读 · 2023年3月9日

Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds

Arxiv

0+阅读 · 2023年3月9日

Masked Image Modeling with Local Multi-Scale Reconstruction

Arxiv

0+阅读 · 2023年3月9日

SE(3)-DiffusionFields: Learning smooth cost functions for joint grasp and motion optimization through diffusion

Arxiv

0+阅读 · 2023年3月9日

A Light Weight Model for Active Speaker Detection

Arxiv

0+阅读 · 2023年3月8日

Neural Vector Fields: Implicit Representation by Explicit Learning

Arxiv

0+阅读 · 2023年3月8日

Masked Autoencoders Are Scalable Vision Learners

Arxiv

27+阅读 · 2021年11月11日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

相关基金

樟疫霉致病性相关GPCR-PIPK鉴定与机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

非局部总变差正则化图像恢复模型的快速子空间校正算法

国家自然科学基金

0+阅读 · 2014年12月31日

调控巨噬细胞极化的microRNA分子鉴定及其在CVB3诱导的急性病毒性心肌炎中的作用

国家自然科学基金

0+阅读 · 2014年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

microRNA调节肿瘤抑制因子Caliban应答DNA损伤的机制

国家自然科学基金

1+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

EV71病毒非结构蛋白2A的功能及作用机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

新型核苷的分子设计、合成和抗乙肝病毒的活性研究

国家自然科学基金

0+阅读 · 2011年12月31日

Adiponectin在肝脏缺血再灌注损伤中的抗肝细胞凋亡机制

国家自然科学基金

0+阅读 · 2009年12月31日

多酸-配位聚合物互为主客体的结构化学

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员