Ham2Pose: Animating Sign Language Notation into Pose Sequences - 专知 Papers


Sequence · Distance metric · Metric · Data preprocessing · Correctness

April 1, 2023

Ham2Pose: Animating Sign Language Notation into Pose Sequences


Rotem Shalev-Arkushin,Amit Moryossef,Ohad Fried

Translating spoken languages into Sign languages is necessary for open communication between the hearing and hearing-impaired communities. To achieve this goal, we propose the first method for animating a text written in HamNoSys, a lexical Sign language notation, into signed pose sequences. As HamNoSys is universal by design, our proposed method offers a generic solution invariant to the target Sign language. Our method gradually generates pose predictions using transformer encoders that create meaningful representations of the text and poses while considering their spatial and temporal information. We use weak supervision for the training process and show that our method succeeds in learning from partial and inaccurate data. Additionally, we offer a new distance measurement that considers missing keypoints, to measure the distance between pose sequences using DTW-MJE. We validate its correctness using AUTSL, a large-scale Sign language dataset, show that it measures the distance between pose sequences more accurately than existing measurements, and use it to assess the quality of our generated pose sequences. Code for the data pre-processing, the model, and the distance measurement is publicly released for future research.
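The abstract names DTW-MJE as a distance between pose sequences that accounts for missing keypoints, but does not spell out the formula. The following is a minimal sketch of that idea, not the authors' released implementation: it runs dynamic time warping (DTW) with a per-frame mean joint error (MJE) as the local cost. The conventions here are assumptions made for illustration: missing keypoints are encoded as NaN, a joint missing in either frame is skipped, and a frame pair with no comparable joints costs 0. The function names `frame_cost` and `dtw_mje` are hypothetical.

```python
import numpy as np


def frame_cost(a: np.ndarray, b: np.ndarray) -> float:
    """Mean joint error between two frames of shape (K, D).

    Assumption: a missing keypoint is encoded as NaN, and a joint that is
    missing in either frame is left out of the mean.
    """
    valid = ~(np.isnan(a).any(axis=1) | np.isnan(b).any(axis=1))
    if not valid.any():
        return 0.0  # assumption: no comparable joints adds no cost
    return float(np.linalg.norm(a[valid] - b[valid], axis=1).mean())


def dtw_mje(seq_a: np.ndarray, seq_b: np.ndarray) -> float:
    """Accumulated DTW cost between sequences of shape (T, K, D)."""
    n, m = len(seq_a), len(seq_b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = frame_cost(seq_a[i - 1], seq_b[j - 1])
            # Standard DTW recurrence: extend the cheapest of the three
            # admissible predecessor alignments.
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])


if __name__ == "__main__":
    a = np.zeros((2, 2, 2))
    a[1] = 1.0
    # Same motion with the first frame held twice, and one keypoint missing.
    b = np.stack([a[0], a[0], a[1]])
    b[1, 0] = np.nan
    print(dtw_mje(a, b))
```

Because DTW aligns frames non-linearly in time, two sequences that show the same motion at different speeds can still reach a low distance. Skipping joints that are missing in either frame keeps pose-estimation dropouts from dominating the score. A different treatment of missing keypoints, such as a fixed penalty, would change the resulting values.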



