Vision plays a crucial role in comprehending the world around us: it is estimated that more than 85% of external information is obtained through the visual system. Vision influences our mobility, cognition, access to information, and interaction with the environment and with other people. Blindness prevents a person from gaining knowledge of the surrounding environment and makes unassisted navigation, object recognition, obstacle avoidance, and reading significant challenges. Many existing assistive systems are limited by their cost and complexity. To help the visually challenged overcome these everyday difficulties, we propose VisBuddy, a smart assistant for day-to-day activities. VisBuddy is a voice-based assistant: the user issues voice commands to perform specific tasks. It uses image captioning to describe the user's surroundings, optical character recognition (OCR) to read text in the user's view, object detection to search for and locate objects in a room, and web scraping to deliver the latest news to the user. VisBuddy combines concepts from deep learning and the Internet of Things, and thus serves as a cost-efficient, powerful, all-in-one assistant that helps the visually challenged with their day-to-day activities.