Typical pedestrian detection methods focus either on handling mutual occlusion among crowded pedestrians or on coping with the wide range of pedestrian scales. Detecting pedestrians whose appearance varies substantially, for example in silhouette, viewpoint, or clothing, remains a crucial challenge. Instead of learning each of these diverse appearance features individually, as most existing methods do, we propose to use contrastive learning to guide feature learning so that, in the learned feature space, the semantic distance between pedestrians with different appearances is minimized, which suppresses appearance diversity, while the distance between pedestrians and background is maximized. To make contrastive learning more efficient and effective, we construct an exemplar dictionary of representative pedestrian appearances as prior knowledge and use it to build effective contrastive training pairs. The exemplar dictionary is further used during inference to evaluate the quality of pedestrian proposals by measuring the semantic distance between each proposal and the dictionary. Extensive experiments on both daytime and nighttime pedestrian detection validate the effectiveness of the proposed method.
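The two roles of the exemplar dictionary described above can be illustrated with a minimal sketch. The abstract does not specify the loss or the distance measure, so the following assumes cosine similarity as the semantic measure and an InfoNCE-style contrastive loss; the function names, the `temperature` parameter, and the max-over-exemplars scoring rule are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """InfoNCE-style loss (an assumed form, not taken from the paper).

    The loss is small when the anchor is close to the positives
    (e.g. pedestrian exemplars) and far from the negatives (background).
    """
    pos = [math.exp(cosine_similarity(anchor, p) / temperature) for p in positives]
    neg = [math.exp(cosine_similarity(anchor, n) / temperature) for n in negatives]
    denom = sum(pos) + sum(neg)
    # Average the negative log-probability over all positive pairs.
    return -sum(math.log(p / denom) for p in pos) / len(pos)


def proposal_score(proposal, exemplars):
    """Hypothetical inference-time quality score for a proposal.

    Uses the similarity to the closest exemplar in the dictionary.
    """
    return max(cosine_similarity(proposal, e) for e in exemplars)
```

In this sketch, a proposal feature that lies near any exemplar receives a high score, and the same dictionary entries serve as positives when forming training pairs. A real detector would compute these quantities on learned feature embeddings rather than hand-written vectors.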