AI technologies for Natural Languages have made tremendous progress recently. However, commensurate progress has not been made on Sign Languages, in particular, in recognizing signs as individual words or as complete sentences. We introduce OpenHands, a library where we take four key ideas from the NLP community for low-resource languages and apply them to sign languages for word-level recognition. First, we propose using pose extracted through pretrained models as the standard modality of data to reduce training time and enable efficient inference, and we release standardized pose datasets for 6 different sign languages - American, Argentinian, Chinese, Greek, Indian, and Turkish. Second, we train and release checkpoints of 4 pose-based isolated sign language recognition models across all 6 languages, providing baselines and ready checkpoints for deployment. Third, to address the lack of labelled data, we propose self-supervised pretraining on unlabelled data. We curate and release the largest pose-based pretraining dataset on Indian Sign Language (Indian-SL). Fourth, we compare different pretraining strategies and for the first time establish that pretraining is effective for sign language recognition by demonstrating (a) improved fine-tuning performance, especially in low-resource settings, and (b) high cross-lingual transfer from Indian-SL to a few other sign languages. We open-source all models and datasets in OpenHands with the hope of making research in sign languages more accessible; the library is available at https://github.com/AI4Bharat/OpenHands.
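To make the first key idea concrete, below is a minimal sketch of extracting pose keypoints from a sign language video with a pretrained pose estimator, here MediaPipe Holistic. The exact landmark subset, normalization, and preprocessing used in OpenHands may differ; `extract_pose_sequence` is an illustrative helper, not the library's API.

```python
import cv2
import mediapipe as mp
import numpy as np

mp_holistic = mp.solutions.holistic

def extract_pose_sequence(video_path: str) -> np.ndarray:
    """Extract per-frame body and hand keypoints from a video.

    Returns an array of shape (T, 75, 3): 33 body + 21 left-hand
    + 21 right-hand landmarks per frame, each as (x, y, z).
    Missing detections are zero-filled.
    """
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp_holistic.Holistic(static_image_mode=False) as holistic:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV decodes as BGR.
            results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            parts = []
            for landmarks, n_points in [
                (results.pose_landmarks, 33),
                (results.left_hand_landmarks, 21),
                (results.right_hand_landmarks, 21),
            ]:
                if landmarks is None:
                    parts.append(np.zeros((n_points, 3), dtype=np.float32))
                else:
                    parts.append(np.array(
                        [[lm.x, lm.y, lm.z] for lm in landmarks.landmark],
                        dtype=np.float32))
            frames.append(np.concatenate(parts, axis=0))
    cap.release()
    return np.stack(frames)  # (T, 75, 3)
```

A lightweight keypoint sequence like this, rather than raw RGB video, is what enables the reduced training time and efficient inference the abstract refers to: downstream recognition models consume a few thousand floats per frame instead of full image tensors.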