Face representation learning on datasets with a massive number of identities requires appropriate training methods. The softmax-based approach, currently the state of the art in face recognition, is not suitable in its usual "full softmax" form for datasets with millions of persons. Several methods based on the "sampled softmax" approach have been proposed to remove this limitation. These methods, however, have a number of disadvantages. One of them is the problem of "prototype obsolescence": the classifier weights (prototypes) of rarely sampled classes receive gradients too infrequently, so they become outdated and detached from the current encoder state, resulting in incorrect training signals. This problem is especially serious in ultra-large-scale datasets. In this paper, we propose a novel face representation learning model called Prototype Memory, which alleviates this problem and allows training on a dataset of any size. Prototype Memory consists of a limited-size memory module for storing recent class prototypes and employs a set of algorithms to update it in an appropriate way. New class prototypes are generated on the fly from the exemplar embeddings in the current mini-batch. These prototypes are enqueued to the memory and used as classifier weights for softmax classification-based training. To prevent obsolescence and keep the memory closely tied to the encoder, prototypes are regularly refreshed, and the oldest ones are dequeued and discarded. Prototype Memory is computationally efficient and independent of dataset size. It can be used with various loss functions, hard example mining algorithms, and encoder architectures. We demonstrate the effectiveness of the proposed model through extensive experiments on popular face recognition benchmarks.
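The enqueue, refresh, and dequeue cycle described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `PrototypeMemory`, the helpers `l2_normalize` and `softmax_loss`, the `scale` parameter, and the choice of a normalized mean of exemplar embeddings as the prototype are all assumptions made for illustration, and the sketch assumes the memory capacity is at least the number of classes in a mini-batch.

```python
import math
from collections import OrderedDict


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


class PrototypeMemory:
    """Hypothetical fixed-size FIFO store of class prototypes."""

    def __init__(self, capacity):
        self.capacity = capacity
        # class id -> unit-length prototype, ordered oldest first
        self.store = OrderedDict()

    def update(self, embeddings, labels):
        """Build prototypes from the mini-batch and enqueue them."""
        groups = {}
        for emb, lab in zip(embeddings, labels):
            groups.setdefault(lab, []).append(emb)
        for lab, embs in groups.items():
            dim = len(embs[0])
            # Assumption: prototype = normalized mean of the class exemplars.
            mean = [sum(e[i] for e in embs) / len(embs) for i in range(dim)]
            # Refresh: drop any stale entry so the class re-enters as newest.
            self.store.pop(lab, None)
            self.store[lab] = l2_normalize(mean)
        # Dequeue the oldest prototypes once the memory is over capacity.
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)

    def logits(self, embedding, scale=1.0):
        """Scaled cosine similarity between one embedding and each prototype."""
        q = l2_normalize(embedding)
        return {
            lab: scale * sum(a * b for a, b in zip(q, p))
            for lab, p in self.store.items()
        }


def softmax_loss(logits, target):
    """Cross-entropy over the classes currently held in memory."""
    m = max(logits.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return log_z - logits[target]


memory = PrototypeMemory(capacity=2)
memory.update([[1.0, 0.0], [0.0, 1.0]], ["a", "b"])  # enqueue two classes
memory.update([[1.0, 1.0]], ["c"])                   # "a" is oldest, dequeued
memory.update([[0.0, 2.0]], ["b"])                   # "b" refreshed, now newest
loss = softmax_loss(memory.logits([0.0, 3.0]), "b")
```

Because the softmax is computed only over the prototypes held in memory, the cost per step in this sketch depends on `capacity` rather than on the total number of identities, which is the property the abstract attributes to the method.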