Blind and visually impaired people face multiple challenges in navigating the world independently, including finding the shortest path to a destination and detecting obstacles from a distance. To tackle these issues, this paper proposes ViT Cane, which leverages a vision transformer model to detect obstacles in real time. The system consists of a Pi Camera Module v2, a Raspberry Pi 4B with 8 GB of RAM, and four vibration motors. By conveying tactile feedback through the four motors, the obstacle detection model is highly effective in helping visually impaired users navigate unknown terrain, and the system is designed to be easily reproduced. The paper discusses the utility of a vision transformer model in comparison to CNN-based models for this specific application. Through rigorous testing, the proposed obstacle detection model achieved higher performance on the Common Objects in Context (COCO) dataset than its CNN counterparts. Comprehensive field tests were conducted to verify the effectiveness of our system for holistic indoor understanding and obstacle avoidance.
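To illustrate how detector output might drive the four-motor tactile feedback described above, here is a minimal sketch. The quadrant-to-motor mapping, function names, and confidence threshold are our assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch: mapping obstacle detections to haptic feedback.
# Assumes the ViT detector yields obstacle centers in normalized image
# coordinates; the quadrant mapping and threshold are illustrative only.

def motor_for_obstacle(cx, cy):
    """Map a normalized obstacle center (cx, cy in [0, 1]) to one of
    four motors, one per image quadrant (0: top-left .. 3: bottom-right)."""
    col = 0 if cx < 0.5 else 1
    row = 0 if cy < 0.5 else 1
    return row * 2 + col

def haptic_pattern(detections, min_conf=0.5):
    """Return the set of motor ids to activate for a list of
    (cx, cy, confidence) detections, ignoring low-confidence ones."""
    return {motor_for_obstacle(cx, cy)
            for cx, cy, conf in detections if conf >= min_conf}

if __name__ == "__main__":
    # Two confident detections in opposite quadrants; one below threshold.
    dets = [(0.2, 0.3, 0.9), (0.8, 0.7, 0.6), (0.5, 0.5, 0.3)]
    print(sorted(haptic_pattern(dets)))  # → [0, 3]
```

On real hardware, each motor id would be wired to a Raspberry Pi GPIO pin and pulsed whenever its quadrant contains a detected obstacle.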