Bangladeshi Sign Language (BdSL) is a commonly used medium of communication for hearing-impaired people in Bangladesh. A real-time BdSL interpreter that works outside a controlled lab environment would have broad social impact and is also an interesting avenue of research. It is a challenging task because of variation among subjects (age, gender, skin color, etc.), complex features, similarities between signs, and cluttered backgrounds. However, the existing dataset for the BdSL classification task is mainly built in a lab-friendly setup, which limits the application of powerful deep learning technology. In this paper, we introduce a dataset named BdSL36, which incorporates background augmentation to make the dataset versatile and contains over four million images belonging to 36 categories. In addition, we annotate about 40,000 images with bounding boxes to exploit the potential of object detection algorithms. We perform several intensive experiments to establish the baseline performance of BdSL36. We also beta test our classifiers at the user level to assess the feasibility of real-world applications built on this dataset. We believe BdSL36 will expedite future research on practical sign letter classification. We make the datasets and all the pre-trained models available for further research.