This paper explores the application of T5 models to Saudi Sign Language (SSL) translation using a novel dataset. The SSL dataset includes three challenging testing protocols, enabling comprehensive evaluation across different scenarios. It also captures unique SSL characteristics, such as face coverings, which pose challenges for sign recognition and translation. In our experiments, we investigate the impact of pre-training on American Sign Language (ASL) data by comparing T5 models pre-trained on the YouTubeASL dataset with models trained directly on the SSL dataset. Experimental results demonstrate that pre-training on YouTubeASL significantly improves model performance (roughly a $3\times$ gain in BLEU-4), indicating cross-linguistic transferability in sign language models. Our findings highlight the benefits of leveraging large-scale ASL data to improve SSL translation and provide insights into the development of more effective sign language translation systems. Our code is publicly available at our GitHub repository.
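The reported gain is measured in BLEU-4, the standard metric for sign language translation. For reference, a minimal sentence-level BLEU-4 sketch is shown below; it assumes whitespace tokenization and a single reference, and is not the evaluation code used in the paper.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(candidate, reference):
    """Sentence-level BLEU-4 (hypothetical helper, single reference):
    geometric mean of modified 1- to 4-gram precisions times a brevity
    penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, 5):
        c_ngr, r_ngr = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((c_ngr & r_ngr).values())  # clipped n-gram matches
        total = max(sum(c_ngr.values()), 1)
        if overlap == 0:
            return 0.0  # any zero n-gram precision zeroes the geometric mean
        log_prec += math.log(overlap / total) / 4
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_prec)
```

In practice, corpus-level BLEU (as computed by tools such as sacrebleu) aggregates n-gram counts over the whole test set rather than averaging per-sentence scores.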