Because person re-identification often suffers from pose changes and occlusions, some attentive local features tend to be suppressed when training CNNs. In this paper, we propose the Batch DropBlock (BDB) Network, a two-branch network composed of a conventional ResNet-50 as the global branch and a feature dropping branch. The global branch encodes the global salient representations. The feature dropping branch contains an attentive feature learning module called Batch DropBlock, which randomly drops the same region of all input feature maps in a batch to reinforce the attentive feature learning of local regions. The network then concatenates the features from both branches to provide a more comprehensive and spatially distributed feature representation. Albeit simple, our method achieves state-of-the-art results on person re-identification and is also applicable to general metric learning tasks. For instance, we achieve 76.4% Rank-1 accuracy on the CUHK03-Detect dataset and an 83.0% Recall-1 score on the Stanford Online Products dataset, outperforming existing works by a large margin (more than 6%).
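The Batch DropBlock operation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `batch_drop_block`, the `height_ratio` and `width_ratio` parameters, the toy tensor shape, and the choice of pooling for each branch are all assumptions made for illustration. The sketch shows only the two ideas stated in the abstract: a single randomly placed region is dropped from every feature map in the batch, and the features of the two branches are concatenated.

```python
import numpy as np


def batch_drop_block(features, height_ratio=0.3, width_ratio=1.0, rng=None):
    """Zero out one randomly placed rectangle, shared by the whole batch.

    features: array of shape (batch, channels, height, width).
    height_ratio, width_ratio: hypothetical parameters giving the size of
    the dropped rectangle as a fraction of the feature-map size.
    Returns the masked features and the (1, 1, height, width) mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, _, h, w = features.shape
    drop_h = min(h, max(1, round(h * height_ratio)))
    drop_w = min(w, max(1, round(w * width_ratio)))

    # The top-left corner is sampled once per batch, not once per sample,
    # so every feature map loses the same region.
    top = int(rng.integers(0, h - drop_h + 1))
    left = int(rng.integers(0, w - drop_w + 1))

    mask = np.ones((1, 1, h, w), dtype=features.dtype)
    mask[:, :, top:top + drop_h, left:left + drop_w] = 0
    return features * mask, mask  # the mask broadcasts over batch and channels


# Toy usage with random stand-ins for backbone feature maps.
rng = np.random.default_rng(0)
feats = rng.random((4, 8, 24, 8))  # (batch, channels, height, width)
dropped, mask = batch_drop_block(feats, rng=rng)

# Assumed pooling: average pooling for the global branch and max pooling
# for the feature dropping branch.
global_vec = feats.mean(axis=(2, 3))
drop_vec = dropped.max(axis=(2, 3))

# Concatenate both branches into one embedding per sample.
embedding = np.concatenate([global_vec, drop_vec], axis=1)
print(embedding.shape)  # (4, 16)
```

Sharing one mask across the batch, rather than sampling a mask per sample, means all samples in a batch are compared with the same region missing, which is the property the abstract attributes to Batch DropBlock.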