This paper presents novel machine translation (MT) systems between spoken and signed languages, where signed languages are represented in SignWriting, a sign language writing system. Our work seeks to address the lack of out-of-the-box support for signed languages in current MT systems and is based on the SignBank dataset, which contains pairs of spoken language text and SignWriting content. We introduce novel methods to parse, factorize, decode, and evaluate SignWriting, leveraging ideas from neural factored MT. In a bilingual setup (translating from American Sign Language to American English), our method achieves over 30 BLEU, while in two multilingual setups (translating in both directions between spoken languages and signed languages), we achieve over 20 BLEU. We find that common MT techniques used to improve spoken language translation similarly affect the performance of sign language translation. These findings validate our use of an intermediate text representation for signed languages to include them in natural language processing research.
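
To make the "parse and factorize" step concrete, the following is a minimal sketch, assuming the SignWriting content is encoded as Formal SignWriting (FSW) strings. It splits a sign into parallel factor streams (symbol base, fill, rotation, x, y), roughly in the spirit of factored MT. The function name `factorize_fsw`, the stream names, and the `-` placeholder for the sign box are hypothetical choices for illustration; the factorization used in the paper may differ.

```python
import re
from typing import Dict, List

# FSW symbol key: 'S' + 3 hex chars (symbol base) + fill (0-5) + rotation (0-f),
# followed by its position as XXXxYYY.
SYMBOL_RE = re.compile(r"S([123][0-9a-f]{2})([0-5])([0-9a-f])(\d{3})x(\d{3})")
# Sign box marker (B, L, M or R) followed by a coordinate pair.
BOX_RE = re.compile(r"([BLMR])(\d{3})x(\d{3})")


def factorize_fsw(fsw: str) -> Dict[str, List[str]]:
    """Split one FSW sign into aligned factor streams (illustrative only)."""
    box = BOX_RE.search(fsw)
    if box is None:
        raise ValueError(f"no sign box found in {fsw!r}")

    # The box marker is the first token; it has no fill or rotation,
    # so a placeholder keeps all streams the same length (an assumption).
    streams: Dict[str, List[str]] = {
        "symbol": [box.group(1)],
        "fill": ["-"],
        "rotation": ["-"],
        "x": [box.group(2)],
        "y": [box.group(3)],
    }

    # Positioned symbols follow the box marker.
    for match in SYMBOL_RE.finditer(fsw, box.end()):
        base, fill, rotation, x, y = match.groups()
        streams["symbol"].append("S" + base)
        streams["fill"].append(fill)
        streams["rotation"].append(rotation)
        streams["x"].append(x)
        streams["y"].append(y)
    return streams


if __name__ == "__main__":
    factors = factorize_fsw("M518x529S14c20481x471S27106503x489")
    for name, values in factors.items():
        print(name, values)
```

Each stream has one entry per token, so the streams could in principle be embedded separately and combined, as source factors are in factored MT. Whether positions are better treated as discrete tokens or as numeric values is a design choice this sketch leaves open.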