Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval

Previous knowledge-distillation-based methods for efficient image retrieval employ a lightweight network as the student model for fast inference. However, a lightweight student lacks adequate representation capacity for effective knowledge imitation during the most critical early training period, which degrades final performance. To tackle this issue, we propose a Capacity Dynamic Distillation framework, which constructs a student model with editable representation capacity. Specifically, the student is initially a heavy model, so that it can learn the distilled knowledge effectively in the early training epochs, and it is gradually compressed during training. To adjust the model capacity dynamically, the framework inserts a learnable convolutional layer into each residual block of the student model as a channel importance indicator. The indicator is optimized simultaneously by the image retrieval loss and the compression loss, and a retrieval-guided gradient resetting mechanism is proposed to relieve the gradient conflict between the two. Extensive experiments show that our method achieves superior inference speed and accuracy. For example, on the VeRi-776 dataset, with ResNet101 as the teacher, our method saves 67.13% of model parameters and 65.67% of FLOPs (around 24.13% and 21.94% more than state-of-the-art methods) without sacrificing accuracy (around 2.11% higher mAP than state-of-the-art methods).
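The interplay between the channel importance indicator, the two losses, and gradient resetting can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, since the abstract does not specify these details: the indicator is reduced to one scalar scale per channel instead of a convolutional layer, the retrieval loss is replaced by a quadratic stand-in with hypothetical per-channel targets, the compression loss is an L2 penalty, channels are flagged for pruning by smallest magnitude, and the resetting rule (flagged channels follow only the compression gradient) is one plausible reading rather than the authors' mechanism. All function names are hypothetical.

```python
from typing import List, Tuple


def prune_mask_by_magnitude(scales: List[float], num_prune: int) -> List[bool]:
    """Flag the num_prune channels with the smallest |scale| (assumed criterion)."""
    order = sorted(range(len(scales)), key=lambda i: abs(scales[i]))
    flagged = set(order[:num_prune])
    return [i in flagged for i in range(len(scales))]


def reset_gradients(retrieval_grad: List[float],
                    compression_grad: List[float],
                    prune_mask: List[bool]) -> List[float]:
    """Assumed resetting rule: a flagged channel follows only the compression
    gradient and a kept channel follows only the retrieval gradient, so the
    two losses never pull the same indicator weight in opposite directions."""
    return [c if flagged else r
            for r, c, flagged in zip(retrieval_grad, compression_grad, prune_mask)]


def train_indicator(targets: List[float],
                    num_prune: int,
                    warmup_steps: int = 50,
                    total_steps: int = 200,
                    lr: float = 0.1,
                    penalty: float = 0.5) -> Tuple[List[float], List[bool]]:
    """Train per-channel scales: full capacity first, then gradual compression."""
    scales = [1.0] * len(targets)  # start "heavy": every channel fully on
    for step in range(total_steps):
        # Toy stand-in for the retrieval loss: 0.5 * (scale - target)^2 per channel.
        retrieval_grad = [s - t for s, t in zip(scales, targets)]
        if step < warmup_steps:
            # Early epochs: no compression, the student keeps all its capacity.
            grad = retrieval_grad
        else:
            # Toy compression loss: 0.5 * penalty * scale^2 per channel.
            compression_grad = [penalty * s for s in scales]
            mask = prune_mask_by_magnitude(scales, num_prune)
            grad = reset_gradients(retrieval_grad, compression_grad, mask)
        scales = [s - lr * g for s, g in zip(scales, grad)]
    # Channels whose scale has decayed to (almost) zero can be removed.
    keep = [abs(s) > 1e-3 for s in scales]
    return scales, keep


if __name__ == "__main__":
    final_scales, keep = train_indicator([2.0, 0.05, 1.5, 0.1], num_prune=2)
    print(final_scales, keep)
```

In this toy setup the two channels with small targets are flagged after the warm-up phase and decay toward zero, while the other two keep tracking the retrieval objective. In a real network the kept channels would define the compressed student used at inference time.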

