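Project title: Research on Scene Understanding in Mobile Robot Vision via Deep Learning-Based Feature Fusion
Project number: No.61463032
Project type: Regional Science Fund Project
Approval year: 2015
Discipline: Other
Project author: 李菁 (Li Jing)
Author affiliation: Nanchang University (南昌大学)
Funding: CNY 440,000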
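Abstract (translated from Chinese): Enabling robots to better understand their working environment has long been a challenging research topic that attracts close attention and active discussion from researchers in China and abroad. For a mobile robot system operating in complex scenes, an environmental cognition ability similar to that of humans is a prerequisite for autonomous operation. However, scene understanding based on vision sensors commonly faces the following difficulties: 1) Image acquisition: how can the robot avoid capturing images that do not contain the target object? 2) Feature representation and learning of targets: scene understanding is a high-level vision task built on low-level vision. How can features from multiple sources be fused effectively to describe targets accurately while reducing the effort of hand-designing features? 3) Feature dimensionality reduction: how can more robust features be obtained so that the robot can understand its environment in real time? Starting from computer vision, this project proposes to build a deep learning-based scene understanding system. By collecting and analyzing images acquired with an omnidirectional vision sensor and a Kinect, combining biologically inspired feature extraction, learning features with deep learning methods, and designing a new manifold learning method for feature dimensionality reduction, the project aims to make scene understanding adaptive and real-time and to provide important technical support for robot visual navigation systems.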
Keywords (translated from Chinese): Computer Vision; Feature Extraction; Scene Understanding; Deep Learning; Robot Vision
English abstract: Helping robots better understand their working environments is one of the most challenging research topics worldwide. For mobile robot systems operating in complex scenes, a human-like ability to perceive the environment is a prerequisite for autonomous operation. However, scene understanding based on vision sensors faces the following challenges: 1) Image collection: how can the robot avoid capturing images that contain no target objects? 2) Feature representation and learning: scene understanding is a high-level vision task with low-level vision as its basis. How can features from different sources be fused effectively to describe objects accurately and reduce the burden of manually designing features? 3) Dimensionality reduction: robots frequently find it difficult to recognize objects and complete assigned tasks in challenging scenarios, e.g., scenes with a significant amount of clutter. How can more robust features be obtained so that robots can understand scenes in real time? Based on computer vision techniques, this project aims to construct a deep learning-based scene understanding system by: i) collecting and analyzing images taken with an omnidirectional vision sensor and a Microsoft Kinect; ii) extracting biologically inspired gist features and saliency features and applying deep learning to learn effective fused features; and iii) designing a new manifold learning algorithm to reduce the dimensionality of the feature vectors, so that scene understanding is adaptive and runs in real time. The system is intended to improve robot capability by fully utilizing the information encoded in visual inputs for scene understanding. It is designed to be effective, adaptive, and real-time, and is hence expected to support the widespread deployment of navigation systems in robot vision.
English keywords: Computer Vision; Feature Extraction; Scene Understanding; Deep Learning; Robot Vision