Exploring Scale Shift in Crowd Localization under the Context of Domain Generalization

Crowd localization plays a crucial role in visual scene understanding: it predicts the location of each pedestrian in a crowd and is therefore applicable to various downstream tasks. However, existing approaches suffer significant performance degradation when the head scale distributions of training and testing data differ (scale shift), a challenge that falls under domain generalization (DG). This paper aims to understand the nature of scale shift within the context of domain generalization for crowd localization models. To this end, we address four critical questions: (i) How does scale shift influence crowd localization in a DG scenario? (ii) How can we quantify this influence? (iii) What causes this influence? (iv) How can we mitigate it? First, we systematically examine how crowd localization performance varies with different levels of scale shift. Then, we establish a benchmark, ScaleBench, and reproduce 20 advanced DG algorithms to quantify the influence. Through extensive experiments, we demonstrate the limitations of existing algorithms and underscore the importance and complexity of scale shift, a topic that remains insufficiently explored. To deepen our understanding, we provide a rigorous theoretical analysis of scale shift. Building on these insights, we further propose an effective algorithm called Causal Feature Decomposition and Anisotropic Processing (Catto) to mitigate the influence of scale shift in DG settings. We also provide extensive analytical experiments, revealing four significant insights for future research. Our results emphasize the importance of this novel and applicable research direction, which we term Scale Shift Domain Generalization.
