The ability to express our thoughts, feelings, and ideas to one another is essential for human survival and development. A considerable portion of the population encounters communication barriers in environments where hearing is the primary means of communication, which adversely affects their daily activities. An autonomous sign language recognition system that works effectively can significantly reduce this barrier. To address this issue, we propose a large-scale dataset, the Multi-View Bangla Sign Language dataset (MV-BSL), which consists of 115 glosses and 350 isolated words in 15 different categories. Furthermore, we have built a recurrent neural network (RNN) with an attention-based bidirectional gated recurrent unit (Bi-GRU) architecture that models the temporal dynamics of the pose information of an individual communicating through sign language. Human pose information has proven effective in analyzing sign patterns because it ignores people's body appearance and environmental information while capturing the true movement information. Using it makes the proposed model simpler and faster while achieving state-of-the-art accuracy.
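To make the architecture described in the abstract more concrete, the following is a minimal NumPy sketch of a forward pass through an attention-based Bi-GRU classifier over a sequence of pose keypoints. It is an illustration under stated assumptions, not the authors' implementation: the frame count, the number of keypoints, the hidden size, the additive-style attention scoring, the random untrained weights, and all function names (`init_gru`, `gru_run`, `bigru_attention_classify`) are hypothetical, and the class count of 350 is only assumed from the isolated-word count mentioned in the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(rng, in_dim, hid_dim, scale=0.1):
    # Index 0: update gate, 1: reset gate, 2: candidate state.
    return {
        "W": rng.normal(0.0, scale, (3, in_dim, hid_dim)),
        "U": rng.normal(0.0, scale, (3, hid_dim, hid_dim)),
        "b": np.zeros((3, hid_dim)),
    }

def gru_run(params, xs):
    """Run a GRU over xs of shape (T, in_dim); return states of shape (T, hid_dim)."""
    W, U, b = params["W"], params["U"], params["b"]
    h = np.zeros(U.shape[-1])
    states = []
    for x in xs:
        z = sigmoid(x @ W[0] + h @ U[0] + b[0])        # update gate
        r = sigmoid(x @ W[1] + h @ U[1] + b[1])        # reset gate
        n = np.tanh(x @ W[2] + (r * h) @ U[2] + b[2])  # candidate state
        h = (1.0 - z) * n + z * h
        states.append(h)
    return np.stack(states)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def bigru_attention_classify(xs, fwd, bwd, attn_v, W_out, b_out):
    """Return (class probabilities, per-frame attention weights)."""
    h_fwd = gru_run(fwd, xs)                  # reads frames first to last
    h_bwd = gru_run(bwd, xs[::-1])[::-1]      # reads frames last to first, realigned
    H = np.concatenate([h_fwd, h_bwd], axis=1)  # (T, 2 * hid_dim)
    alpha = softmax(np.tanh(H) @ attn_v)      # one weight per frame
    context = alpha @ H                       # attention-weighted summary of the clip
    return softmax(context @ W_out + b_out), alpha

# Assumed sizes, chosen only for illustration.
T, N_KEYPOINTS, HID, NUM_CLASSES = 30, 25, 32, 350
rng = np.random.default_rng(0)
poses = rng.normal(size=(T, N_KEYPOINTS * 2))  # stand-in for (x, y) per keypoint per frame
fwd = init_gru(rng, N_KEYPOINTS * 2, HID)
bwd = init_gru(rng, N_KEYPOINTS * 2, HID)
attn_v = rng.normal(0.0, 0.1, 2 * HID)
W_out = rng.normal(0.0, 0.1, (2 * HID, NUM_CLASSES))
b_out = np.zeros(NUM_CLASSES)

probs, alpha = bigru_attention_classify(poses, fwd, bwd, attn_v, W_out, b_out)
print(probs.shape, alpha.shape)
```

Because the input is only a sequence of keypoint coordinates rather than raw video frames, the per-frame feature vector stays small, which is consistent with the abstract's point that pose input keeps the model simple and fast. The attention weights `alpha` indicate which frames contribute most to the pooled representation used for classification.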