Most recent methods for crowd counting are based on convolutional neural networks (CNNs), which have a strong ability to extract local features. However, CNNs inherently struggle to model the global context due to their limited receptive fields, whereas transformers can model the global context easily. In this paper, we propose a simple approach called CCTrans to simplify the design pipeline. Specifically, we utilize a pyramid vision transformer backbone to capture global crowd information, a pyramid feature aggregation (PFA) module to combine low-level and high-level features, and an efficient regression head with multi-scale dilated convolution (MDC) to predict density maps. In addition, we tailor the loss functions for our pipeline. Without bells and whistles, extensive experiments demonstrate that our method achieves new state-of-the-art results on several benchmarks for both weakly- and fully-supervised crowd counting. Moreover, we currently rank No. 1 on the leaderboard of NWPU-Crowd. Our code will be made available.
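To make the described pipeline concrete, the following is a minimal PyTorch sketch of the feature-fusion and regression stages (backbone features → PFA fusion → MDC head → density map). The class names, channel widths, and dilation rates here are illustrative assumptions, not the authors' exact implementation; any pyramid transformer backbone producing multi-scale feature maps could feed the `feats` list.

```python
# Hypothetical sketch of a PFA-style fusion module and an MDC-style regression head.
# Layer names, channel sizes, and dilation rates are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PFA(nn.Module):
    """Pyramid feature aggregation: project each level to a common width,
    upsample to the highest resolution, and sum."""
    def __init__(self, in_channels, out_channels=256):
        super().__init__()
        self.projs = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )

    def forward(self, feats):
        # feats: list of pyramid features, ordered from high to low resolution.
        target_size = feats[0].shape[-2:]
        fused = 0
        for proj, f in zip(self.projs, feats):
            f = proj(f)
            fused = fused + F.interpolate(
                f, size=target_size, mode="bilinear", align_corners=False
            )
        return fused


class MDCHead(nn.Module):
    """Regression head with parallel dilated 3x3 convolutions at several rates,
    followed by a 1x1 convolution that predicts a single-channel density map."""
    def __init__(self, in_channels=256, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_channels, in_channels, 3, padding=d, dilation=d) for d in dilations]
        )
        self.predict = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, x):
        x = sum(F.relu(branch(x)) for branch in self.branches)
        return F.relu(self.predict(x))  # densities are non-negative


if __name__ == "__main__":
    # Three hypothetical pyramid levels from a transformer backbone.
    feats = [
        torch.randn(1, 96, 96, 96),
        torch.randn(1, 192, 48, 48),
        torch.randn(1, 384, 24, 24),
    ]
    density = MDCHead()(PFA([96, 192, 384])(feats))
    print(density.shape)  # torch.Size([1, 1, 96, 96])
```

The estimated crowd count would then be obtained by summing the predicted density map over its spatial dimensions, as is standard in density-map-based counting.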