Project Title: Speech- and Emotional-Semantics-Synchronized 3D Facial Visualization: From Articulators to Appearance
Project No.: 61472393
Project Type: General Program (面上项目)
Year Approved: 2015
Discipline: Computer Science
Principal Investigator: Zengfu Wang (汪增福)
Affiliation: Hefei Institutes of Physical Science, Chinese Academy of Sciences
Funding Amount: 800,000 CNY
Abstract (translated from Chinese): Starting from the problem of multimodal human-machine interaction, this project systematically investigates speech- and emotional-semantics-synchronized 3D facial visualization. The overall research goals are as follows: making full use of multiple means of acquiring articulation data, including magnetic resonance imaging (MRI), electromagnetic articulography (EMA), and X-ray imaging, we will design and implement a 3D facial animation synthesis scheme driven by text and/or speech input, and build a real-time, highly natural 3D emotional facial animation synthesis system that is synchronized with speech and emotional semantics and can display the articulation process from the inside out. To resolve the conflicts that arise during system implementation between realizability and high naturalness, and between computational complexity and real-time performance, we will conduct in-depth research, from a systems perspective, on multi-source articulation data fusion, 3D-model-based facial animation synthesis, 3D articulator motion modeling, and modeling of the coordination between articulators and speech. The resulting key techniques will serve as the building blocks of a vivid 3D speech visualization system that renders sound and image together, laying the foundation for moving this research toward practical application.
Keywords (Chinese): Virtual Reality; Facial Animation; Visualization
Abstract (English): This project focuses on the problem of multimodal human-machine interaction. We will conduct research on speech- and emotional-semantics-synchronized 3D facial visualization, with the following goals: by making full use of multiple acquisition modalities for articulation-related information, including magnetic resonance imaging (MRI), electromagnetic articulography (EMA), and X-ray imaging, we will develop a facial animation generation scheme driven by text and/or speech, and construct a highly realistic, speech- and emotional-semantics-synchronized 3D facial visualization system that runs in real time and shows the detailed dynamic process of pronunciation, from the internal articulators to the external appearance. To resolve the conflicts between realizability and high naturalness, and between computational complexity and real-time performance, that arise during system implementation, we will address problems including sensor data fusion across multiple articulators, facial animation based on a 3D head model, 3D dynamic modeling of articulators, and modeling of the cooperative relation between articulators and speech. We will develop the corresponding key techniques, use them to construct a vivid speech- and emotional-semantics-synchronized 3D facial visualization system, and thereby provide a concrete foundation for applications.
Keywords (English): Virtual Reality; Facial Animation; Visualization