Due to domain shift, a trained crowd counting model usually suffers a large performance drop when deployed in the wild. While existing domain-adaptive crowd counting methods achieve promising results, they typically treat each crowd image as a whole and reduce domain discrepancies holistically, which limits further improvement of domain adaptation performance. To address this, we propose to disentangle the \emph{domain-invariant} crowd from the \emph{domain-specific} background in crowd images and design a fine-grained domain adaptation method for crowd counting. Specifically, to separate crowd from background, we learn crowd segmentation from point-level crowd counting annotations in a weakly-supervised manner. Based on the derived segmentation, we design a crowd-aware domain adaptation mechanism consisting of two crowd-aware adaptation modules, i.e., Crowd Region Transfer (CRT) and Crowd Density Alignment (CDA). The CRT module guides the transfer of crowd features across domains without distraction from the background. The CDA module regularises target-domain crowd density generation using the target domain's own crowd density distribution. Our method consistently outperforms previous approaches in widely-used adaptation scenarios.
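The abstract does not say how the segmentation is obtained from point annotations, so the following is only a minimal sketch of one plausible starting point, not the authors' procedure. It assumes that each annotated head marks a small disc of crowd pixels, and that a crowd-aware module would then pool features over those pixels only. The names `points_to_mask` and `masked_mean` and the fixed `radius` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) head annotation in pixel coordinates


def points_to_mask(points: Sequence[Point], height: int, width: int,
                   radius: float = 2.0) -> List[List[int]]:
    """Mark every pixel within `radius` of an annotated head as crowd (1).

    Assumption: a fixed-radius disc stands in for the weakly-supervised
    segmentation described in the text.
    """
    mask = [[0] * width for _ in range(height)]
    r2 = radius * radius
    for px, py in points:
        # Only visit the bounding box of the disc around this point.
        y_lo = max(0, int(py - radius))
        y_hi = min(height - 1, int(py + radius))
        x_lo = max(0, int(px - radius))
        x_hi = min(width - 1, int(px + radius))
        for y in range(y_lo, y_hi + 1):
            for x in range(x_lo, x_hi + 1):
                if (x - px) ** 2 + (y - py) ** 2 <= r2:
                    mask[y][x] = 1
    return mask


def masked_mean(feature: Sequence[Sequence[float]],
                mask: Sequence[Sequence[int]]) -> float:
    """Average a feature map over crowd pixels only, ignoring background."""
    total = 0.0
    count = 0
    for f_row, m_row in zip(feature, mask):
        for f, m in zip(f_row, m_row):
            if m:
                total += f
                count += 1
    return total / count if count else 0.0
```

In this sketch, background pixels never contribute to the pooled statistic, which is the intuition behind restricting adaptation to crowd regions; a learned segmentation would replace the fixed-radius discs.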