Gestures performed alongside speech are essential for voice interaction, as they convey complementary semantics for interaction purposes such as wake-up state and input modality. In this paper, we investigated voice-accompanying hand-to-face (VAHF) gestures for voice interaction. We targeted hand-to-face gestures because they relate closely to speech and yield salient acoustic features (e.g., impeding voice propagation). We conducted a user study to explore the design space of VAHF gestures: we first gathered candidate gestures and then applied a structural analysis along dimensions such as contact position and contact type, yielding a set of 8 VAHF gestures with good usability and minimal confusion. To facilitate VAHF gesture recognition, we proposed a novel cross-device sensing method that leverages heterogeneous data channels (vocal, ultrasound, and IMU) from commodity devices (earbuds, watches, and rings). Our recognition model achieved an accuracy of 97.3% for recognizing 3 gestures and 91.5% for recognizing 8 gestures (excluding the "empty" gesture), demonstrating its high applicability. Quantitative analysis further sheds light on the recognition capability of each sensor channel and of their combinations. Finally, we illustrated feasible use cases and their design principles to demonstrate the applicability of our system in various scenarios.
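To make the cross-device sensing pipeline concrete, the following is a minimal sketch of how heterogeneous channel features (vocal, ultrasound, IMU) from earbuds, a watch, and a ring could be fused and classified. It is an illustrative early-fusion example with a generic classifier and made-up feature dimensions, not the paper's actual recognition model; the function and variable names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-channel feature vectors extracted from one gesture window:
# vocal (earbud microphone), ultrasound (earbud speaker/mic pair), and
# IMU (watch + ring accelerometer/gyroscope). Dimensions are illustrative only.
def fuse_features(vocal_feat, ultrasound_feat, imu_feat):
    """Concatenate heterogeneous channel features into one vector (early fusion)."""
    return np.concatenate([vocal_feat, ultrasound_feat, imu_feat])

# Toy training data: 8 VAHF gesture classes plus an "empty" (no-gesture) class.
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 64 + 32 + 24))   # fused feature vectors (synthetic)
y = rng.integers(0, 9, size=900)           # labels 0..7 = gestures, 8 = empty

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# At inference time, features from each device channel are fused, then classified.
sample = fuse_features(rng.normal(size=64), rng.normal(size=32), rng.normal(size=24))
print(clf.predict(sample.reshape(1, -1)))
```

Dropping one of the three feature segments before concatenation is a simple way to probe how much each sensor channel (or channel combination) contributes to recognition, mirroring the channel-wise quantitative analysis described in the abstract.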