Learned representations are a central component in modern ML systems, serving a multitude of downstream tasks. When training such representations, the computational and statistical constraints of each downstream task are often unknown. In this context, rigid, fixed-capacity representations can be either over- or under-accommodating to the task at hand. This leads us to ask: can we design a flexible representation that can adapt to multiple downstream tasks with varying computational resources? Our main contribution is Matryoshka Representation Learning (MRL), which encodes information at different granularities and allows a single embedding to adapt to the computational constraints of downstream tasks. MRL minimally modifies existing representation learning pipelines and imposes no additional cost during inference and deployment. MRL learns coarse-to-fine representations that are at least as accurate and rich as independently trained low-dimensional representations. The flexibility within the learned Matryoshka Representations offers: (a) up to 14x smaller embedding size for ImageNet-1K classification at the same level of accuracy; (b) up to 14x real-world speed-ups for large-scale retrieval on ImageNet-1K and 4K; and (c) up to 2% accuracy improvements for long-tail few-shot classification, all while being as robust as the original representations. Finally, we show that MRL extends seamlessly to web-scale datasets (ImageNet, JFT) across various modalities: vision (ViT, ResNet), vision + language (ALIGN), and language (BERT). MRL code and pretrained models are open-sourced at https://github.com/RAIVNLab/MRL.
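The abstract does not spell out the training objective, so the following is only a minimal sketch of one way "information at different granularities" in a single embedding could be realized. It assumes that each granularity is a prefix of the full embedding, that every prefix gets its own linear classification head, and that the per-prefix cross-entropy losses are summed. The names `NESTING_DIMS`, `matryoshka_loss`, `truncate`, and `nearest`, as well as the toy dimensions, are hypothetical and are not taken from the MRL repository.

```python
import math

# Hypothetical nesting sizes for a toy 8-dimensional embedding.
# The abstract does not state which sizes the authors use.
NESTING_DIMS = (2, 4, 8)


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, given raw logits and an integer label."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[label]


def matryoshka_loss(embedding, label, heads, weights=None):
    """Weighted sum of classification losses over nested embedding prefixes.

    heads[m] is a (num_classes x m) weight matrix, stored as a list of rows,
    that scores only the first m coordinates of the embedding.
    """
    weights = weights or {m: 1.0 for m in heads}
    total = 0.0
    for m, head in heads.items():
        prefix = embedding[:m]  # coarse representation: first m coordinates
        logits = [sum(w * x for w, x in zip(row, prefix)) for row in head]
        total += weights[m] * softmax_cross_entropy(logits, label)
    return total


def truncate(embedding, m):
    """Keep the first m coordinates and L2-normalize them."""
    prefix = embedding[:m]
    norm = math.sqrt(sum(x * x for x in prefix)) or 1.0
    return [x / norm for x in prefix]


def nearest(query, database, m):
    """Index of the database item most cosine-similar to the query,
    comparing only the first m coordinates of each vector."""
    q = truncate(query, m)
    scores = [sum(a * b for a, b in zip(q, truncate(item, m)))
              for item in database]
    return max(range(len(database)), key=scores.__getitem__)


if __name__ == "__main__":
    num_classes = 4
    embedding = [0.5, -1.0, 2.0, 0.1, 0.0, 0.3, -0.7, 1.2]
    # Zero-initialized heads, one per nesting size (illustration only).
    heads = {m: [[0.0] * m for _ in range(num_classes)] for m in NESTING_DIMS}
    print(matryoshka_loss(embedding, label=1, heads=heads))
    print(nearest([1, 0, 5, 5], [[0, 1, 9, 9], [1, 0, -3, 2]], m=2))
```

Under these assumptions, a deployment with a tight budget would call `truncate` or `nearest` with a small `m`, while a less constrained one would use the full vector; no extra forward pass is needed because every prefix comes from the same embedding.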