OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages

AI technologies for natural languages have made tremendous progress recently. However, commensurate progress has not been made on sign languages, in particular in recognizing signs as individual words or as complete sentences. We introduce OpenHands, a library in which we take four key ideas from the NLP community for low-resource languages and apply them to word-level recognition of sign languages. First, we propose using pose extracted through pretrained models as the standard modality of data to reduce training time and enable efficient inference, and we release standardized pose datasets for 6 different sign languages: American, Argentinian, Chinese, Greek, Indian, and Turkish. Second, we train and release checkpoints of 4 pose-based isolated sign language recognition models across all 6 languages, providing baselines and ready checkpoints for deployment. Third, to address the lack of labelled data, we propose self-supervised pretraining on unlabelled data. We curate and release the largest pose-based pretraining dataset on Indian Sign Language (Indian-SL). Fourth, we compare different pretraining strategies and for the first time establish that pretraining is effective for sign language recognition by demonstrating (a) improved fine-tuning performance, especially in low-resource settings, and (b) high crosslingual transfer from Indian-SL to a few other sign languages. We open-source all models and datasets in OpenHands in the hope that this makes research in sign languages more accessible. The code is available at https://github.com/AI4Bharat/OpenHands
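To illustrate why pose can be a lighter input modality than raw video, here is a minimal sketch in plain Python. It is not the OpenHands API: the function names, the normalization scheme (centering each frame on its mean keypoint and scaling by the largest extent in the clip), and the clip dimensions used below are all assumptions chosen only for illustration.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float]   # (x, y) coordinate of one body or hand joint
Frame = List[Keypoint]           # all keypoints detected in one video frame


def normalize_pose_sequence(frames: List[Frame]) -> List[Frame]:
    """Center each frame on its mean keypoint, then scale the whole clip
    by its largest absolute coordinate (hypothetical scheme)."""
    centered: List[Frame] = []
    for frame in frames:
        cx = sum(x for x, _ in frame) / len(frame)
        cy = sum(y for _, y in frame) / len(frame)
        centered.append([(x - cx, y - cy) for x, y in frame])
    # Fall back to 1.0 so a clip with all keypoints coincident does not divide by zero.
    scale = max(max(abs(x), abs(y)) for frame in centered for x, y in frame) or 1.0
    return [[(x / scale, y / scale) for x, y in frame] for frame in centered]


def values_per_clip_pose(num_frames: int, num_keypoints: int, dims: int = 2) -> int:
    """Number of scalar inputs when a clip is stored as keypoint coordinates."""
    return num_frames * num_keypoints * dims


def values_per_clip_rgb(num_frames: int, height: int, width: int, channels: int = 3) -> int:
    """Number of scalar inputs when a clip is stored as raw RGB frames."""
    return num_frames * height * width * channels


if __name__ == "__main__":
    # Illustrative sizes only (assumed): 64 frames, 27 keypoints, 224x224 RGB video.
    pose_size = values_per_clip_pose(64, 27)
    rgb_size = values_per_clip_rgb(64, 224, 224)
    print(pose_size, rgb_size, rgb_size // pose_size)
```

Under these assumed sizes, the pose representation of a clip holds a few thousand numbers while the raw frames hold several million, which is the kind of gap that would shorten training and inference for a model consuming keypoints instead of pixels.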
