PoseScript: 3D 自然语言的人类论文 (PoseScript: 3D Human Poses from Natural Language) - 专知论文

会员服务 ·

0

3D · 数据集 · Processing（编程语言） · INFORMS · 图像字幕 ·

2022 年 10 月 21 日

PoseScript: 3D Human Poses from Natural Language

翻译：PoseScript: 3D 自然语言的人类论文

Ginger Delmas,Philippe Weinzaepfel,Thomas Lucas,Francesc Moreno-Noguer,Grégory Rogez

from arxiv, Published in ECCV 2022

Natural language is leveraged in many computer vision tasks such as image captioning, cross-modal retrieval or visual question answering, to provide fine-grained semantic information. While human pose is key to human understanding, current 3D human pose datasets lack detailed language descriptions. In this work, we introduce the PoseScript dataset, which pairs a few thousand 3D human poses from AMASS with rich human-annotated descriptions of the body parts and their spatial relationships. To increase the size of this dataset to a scale compatible with typical data hungry learning algorithms, we propose an elaborate captioning process that generates automatic synthetic descriptions in natural language from given 3D keypoints. This process extracts low-level pose information -- the posecodes -- using a set of simple but generic rules on the 3D keypoints. The posecodes are then combined into higher level textual descriptions using syntactic rules. Automatic annotations substantially increase the amount of available data, and make it possible to effectively pretrain deep models for finetuning on human captions. To demonstrate the potential of annotated poses, we show applications of the PoseScript dataset to retrieval of relevant poses from large-scale datasets and to synthetic pose generation, both based on a textual pose description.

翻译：许多计算机视觉任务,如图像字幕、跨模式检索或视觉答题等,都利用自然语言的自然语言语言,以提供精密的语义信息。虽然人类的外观是人类理解的关键,但目前的3D人类的外观数据集缺乏详细的语言描述。在这项工作中,我们引入了PoseScript数据集,该数据集将数千张3D人类的外观与大量人体对身体部分及其空间关系的附加说明的描述相配。为了将这一数据集的大小提高到与典型的数据饥饿学习算法相容的规模,我们提议了一个精心的外观流程,从给定的3D 关键点生成自然语言的自动合成描述。这个流程利用3D 关键点上一套简单但通用的规则提取了低层次的外观数据集。然后,将外观代码组合成更高层次的文字描述,使用同步规则对身体部分进行大量的人文说明。自动说明大大增加了可用数据的数量,并有可能有效地为对人文字幕进行微调的深层模型。为了展示附加的外观外观,我们展示了从高层次的外观的外观到相关的合成结构图状图解的大规模数据检索。

0

相关内容

3D是英文“Three Dimensions”的简称，中文是指三维、三个维度、三个坐标，即有长、有宽、有高，换句话说，就是立体的，是相对于只有长和宽的平面（2D）而言。

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

面向SEM的惯性粘滑驱动跨尺度精密运动机理和实现方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于纳米发电机的自驱动MEMS/NEMS机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

含碳气溶胶光谱特性研究

国家自然科学基金

0+阅读 · 2013年12月31日

2013“数学之星“夏令营

国家自然科学基金

0+阅读 · 2013年7月31日

血小板调控CD4+CD25+Foxp3+T细胞分化和功能在肝移植排斥反应中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

表面带有反应位点的共轭高分子荧光编码微球的制备

国家自然科学基金

0+阅读 · 2011年12月31日

关系的分解与Domain的表示

国家自然科学基金

1+阅读 · 2011年12月31日

多疣壁虎CD59调节再生位置记忆的机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

增强现实中多目标3D跟踪定位和WH-SIFT特征识别方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于2D视频视觉关注度的3D重建方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

Arxiv

0+阅读 · 2022年12月5日

PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models

Arxiv

0+阅读 · 2022年12月3日

A text analysis for Operational Risk loss descriptions

Arxiv

0+阅读 · 2022年12月2日

Visual Classification via Description from Large Language Models

Arxiv

1+阅读 · 2022年12月1日

Natural Language Descriptions of Deep Visual Features

Arxiv

12+阅读 · 2022年1月26日

A Survey of Natural Language Generation

Arxiv

15+阅读 · 2021年12月22日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

From Show to Tell: A Survey on Image Captioning

Arxiv

15+阅读 · 2021年7月14日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《在GNSS信号降级环境中利用共识实现无人机集群稳健协调》

操作系统智能体：基于多模态大模型（MLLM）的通用计算设备智能体综述

《面向无人机集群的避障动态传感器覆盖算法》最新38页

【博士论文】推进数据高效的深度学习：非参数 Transformer、主动测试与上下文学习

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

相关论文

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

Arxiv

0+阅读 · 2022年12月5日

PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models

Arxiv

0+阅读 · 2022年12月3日

A text analysis for Operational Risk loss descriptions

Arxiv

0+阅读 · 2022年12月2日

Visual Classification via Description from Large Language Models

Arxiv

1+阅读 · 2022年12月1日

Natural Language Descriptions of Deep Visual Features

Arxiv

12+阅读 · 2022年1月26日

A Survey of Natural Language Generation

Arxiv

15+阅读 · 2021年12月22日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

From Show to Tell: A Survey on Image Captioning

Arxiv

15+阅读 · 2021年7月14日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

3D Hand Shape and Pose Estimation from a Single RGB Image

3D Hand Shape and Pose Estimation from a Single RGB Image

Arxiv

17+阅读 · 2019年3月3日

相关基金

面向SEM的惯性粘滑驱动跨尺度精密运动机理和实现方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于纳米发电机的自驱动MEMS/NEMS机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

含碳气溶胶光谱特性研究

国家自然科学基金

0+阅读 · 2013年12月31日

2013“数学之星“夏令营

国家自然科学基金

0+阅读 · 2013年7月31日

血小板调控CD4+CD25+Foxp3+T细胞分化和功能在肝移植排斥反应中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

表面带有反应位点的共轭高分子荧光编码微球的制备

国家自然科学基金

0+阅读 · 2011年12月31日

关系的分解与Domain的表示

国家自然科学基金

1+阅读 · 2011年12月31日

多疣壁虎CD59调节再生位置记忆的机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

增强现实中多目标3D跟踪定位和WH-SIFT特征识别方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于2D视频视觉关注度的3D重建方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员