与变形器一起对端到端可培训的多 Instance 模拟模拟 (End-to-End Trainable Multi-Instance Pose Estimation with Transformers) - 专知论文

会员服务 ·

0

POET · 估计/估计量 · 端到端 · 变换 · 损失 ·

2021 年 12 月 21 日

End-to-End Trainable Multi-Instance Pose Estimation with Transformers

翻译：与变形器一起对端到端可培训的多 Instance 模拟模拟

Lucas Stoffl,Maxime Vidal,Alexander Mathis

We propose an end-to-end trainable approach for multi-instance pose estimation, called POET (POse Estimation Transformer). Combining a convolutional neural network with a transformer encoder-decoder architecture, we formulate multiinstance pose estimation from images as a direct set prediction problem. Our model is able to directly regress the pose of all individuals, utilizing a bipartite matching scheme. POET is trained using a novel set-based global loss that consists of a keypoint loss, a visibility loss and a class loss. POET reasons about the relations between multiple detected individuals and the full image context to directly predict their poses in parallel. We show that POET achieves high accuracy on the COCO keypoint detection task while having less parameters and higher inference speed than other bottom-up and top-down approaches. Moreover, we show successful transfer learning when applying POET to animal pose estimation. To the best of our knowledge, this model is the first end-to-end trainable multi-instance pose estimation method and we hope it will serve as a simple and promising alternative.

翻译：我们提议了一种最终到最终可培训的多因子构成估计方法,称为POET(POSE Estimation 变异器)。结合一个革命性神经网络和一个变压器编码器-解码器结构,我们从图像中制定多重因子构成估计,作为直接设定的预测问题。我们的模型能够利用双向匹配方案,直接回归所有个人构成。POET是使用由关键点损失、可见度损失和阶级损失组成的基于新颖的一组全球损失来培训的。POET是多个被检测到的个人之间的关系以及直接同时预测其构成的全面图像背景的原因。我们表明,在COCO关键点探测任务上,POET在比其他自下至上和自上而下方法的参数和推断速度较低的同时,取得了很高的准确性。此外,我们展示了在应用POET对动物构成估计时的成功转移学习。据我们所知,这一模型是第一个端到端可培训的多因子估计方法,我们希望它能够成为一个简单而有希望的替代方法。

0

相关内容

POET

远程监督关系抽取综述

专知会员服务

35+阅读 · 2021年8月19日

东京大学 | TrTr：基于Transformer的目标跟踪

专知会员服务

36+阅读 · 2021年5月12日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

【商汤科技】可变形Transformers端到端对象检测，Deformable DETR

【商汤科技】可变形Transformers端到端对象检测，Deformable DETR

专知会员服务

33+阅读 · 2020年10月11日

最新《非光滑优化》十讲硬核课程，剑桥大学梁经纬博士主讲

最新《非光滑优化》十讲硬核课程，剑桥大学梁经纬博士主讲

专知会员服务

33+阅读 · 2020年8月14日

【CVPR2020】时序分组注意力视频超分

【CVPR2020】时序分组注意力视频超分

专知会员服务

31+阅读 · 2020年7月1日

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

专知会员服务

32+阅读 · 2020年5月14日

学习具有层次标签的图像表示，Learning Representations For Images With Hierarchical Labels

学习具有层次标签的图像表示，Learning Representations For Images With Hierarchical Labels

专知会员服务

38+阅读 · 2020年4月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

不需要预训练模型的检测算法—DSOD

不需要预训练模型的检测算法—DSOD

极市平台

9+阅读 · 2019年10月10日

人体姿态估计资源大列表（Human Pose Estimation）

人体姿态估计资源大列表（Human Pose Estimation）

专知

9+阅读 · 2018年10月6日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

已删除

将门创投

4+阅读 · 2018年6月1日

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

专知

13+阅读 · 2018年5月26日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

【论文推荐】最新5篇视觉目标跟踪相关论文—递归神经网络、深度适应计算策略、视觉目标跟踪基准、深度核化相关滤波、检测并跟踪

【论文推荐】最新5篇视觉目标跟踪相关论文—递归神经网络、深度适应计算策略、视觉目标跟踪基准、深度核化相关滤波、检测并跟踪

专知

14+阅读 · 2018年1月22日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Human Pose Regression with Residual Log-likelihood Estimation

Arxiv

4+阅读 · 2021年7月26日

HuMoR: 3D Human Motion Model for Robust Pose Estimation

Arxiv

3+阅读 · 2021年5月10日

End-to-End Video Instance Segmentation with Transformers

Arxiv

10+阅读 · 2021年3月24日

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Arxiv

10+阅读 · 2020年12月31日

End-to-end Lane Shape Prediction with Transformers

Arxiv

3+阅读 · 2020年11月28日

Simple Multi-Resolution Representation Learning for Human Pose Estimation

Simple Multi-Resolution Representation Learning for Human Pose Estimation

Arxiv

6+阅读 · 2020年4月14日

Deep High-Resolution Representation Learning for Human Pose Estimation

Arxiv

5+阅读 · 2019年2月25日

Detect-and-Track: Efficient Pose Estimation in Videos

Arxiv

5+阅读 · 2018年5月2日

End-to-End Dense Video Captioning with Masked Transformer

Arxiv

14+阅读 · 2018年4月3日

Dual Path Networks for Multi-Person Human Pose Estimation

Arxiv

3+阅读 · 2017年10月27日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

远程监督关系抽取综述

专知会员服务

35+阅读 · 2021年8月19日

东京大学 | TrTr：基于Transformer的目标跟踪

专知会员服务

36+阅读 · 2021年5月12日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

【商汤科技】可变形Transformers端到端对象检测，Deformable DETR

【商汤科技】可变形Transformers端到端对象检测，Deformable DETR

专知会员服务

33+阅读 · 2020年10月11日

最新《非光滑优化》十讲硬核课程，剑桥大学梁经纬博士主讲

最新《非光滑优化》十讲硬核课程，剑桥大学梁经纬博士主讲

专知会员服务

33+阅读 · 2020年8月14日

【CVPR2020】时序分组注意力视频超分

【CVPR2020】时序分组注意力视频超分

专知会员服务

31+阅读 · 2020年7月1日

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

专知会员服务

32+阅读 · 2020年5月14日

学习具有层次标签的图像表示，Learning Representations For Images With Hierarchical Labels

学习具有层次标签的图像表示，Learning Representations For Images With Hierarchical Labels

专知会员服务

38+阅读 · 2020年4月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《生成式人工智能与大/小语言模型在供应链管理决策优化与可持续性提升中的作用评估》最新51页

白宫发布《赢得AI竞赛：美国人工智能行动计划》最新28页

地下战：地下空间的战略博弈

《美地下作战条令手册》228页

相关资讯

不需要预训练模型的检测算法—DSOD

不需要预训练模型的检测算法—DSOD

极市平台

9+阅读 · 2019年10月10日

人体姿态估计资源大列表（Human Pose Estimation）

人体姿态估计资源大列表（Human Pose Estimation）

专知

9+阅读 · 2018年10月6日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

已删除

将门创投

4+阅读 · 2018年6月1日

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

【论文推荐】最新十篇目标跟踪相关论文—多帧光流跟踪、动态图学习、MV-YOLO、姿态估计、深度核相关滤波、Benchmark

专知

13+阅读 · 2018年5月26日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

【论文推荐】最新5篇视觉目标跟踪相关论文—递归神经网络、深度适应计算策略、视觉目标跟踪基准、深度核化相关滤波、检测并跟踪

【论文推荐】最新5篇视觉目标跟踪相关论文—递归神经网络、深度适应计算策略、视觉目标跟踪基准、深度核化相关滤波、检测并跟踪

专知

14+阅读 · 2018年1月22日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Human Pose Regression with Residual Log-likelihood Estimation

Arxiv

4+阅读 · 2021年7月26日

HuMoR: 3D Human Motion Model for Robust Pose Estimation

Arxiv

3+阅读 · 2021年5月10日

End-to-End Video Instance Segmentation with Transformers

Arxiv

10+阅读 · 2021年3月24日

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Arxiv

10+阅读 · 2020年12月31日

End-to-end Lane Shape Prediction with Transformers

Arxiv

3+阅读 · 2020年11月28日

Simple Multi-Resolution Representation Learning for Human Pose Estimation

Simple Multi-Resolution Representation Learning for Human Pose Estimation

Arxiv

6+阅读 · 2020年4月14日

Deep High-Resolution Representation Learning for Human Pose Estimation

Arxiv

5+阅读 · 2019年2月25日

Detect-and-Track: Efficient Pose Estimation in Videos

Arxiv

5+阅读 · 2018年5月2日

End-to-End Dense Video Captioning with Masked Transformer

Arxiv

14+阅读 · 2018年4月3日

Dual Path Networks for Multi-Person Human Pose Estimation

Arxiv

3+阅读 · 2017年10月27日

微信扫码咨询专知VIP会员