Sign Language Processing (SLP) is an interdisciplinary field at the intersection of Natural Language Processing (NLP) and Computer Vision, focused on the computational understanding, translation, and production of signed languages. Traditional approaches have often been constrained by gloss-based systems that are both language-specific and inadequate for capturing the multidimensional nature of sign language. These limitations have hindered the development of technology capable of processing signed languages effectively. This thesis aims to revolutionize the field of SLP by proposing a simple paradigm that bridges this technological gap. We propose the use of SignWriting, a universal sign language transcription notation system, as an intermediary link between the visual-gestural modality of signed languages and text-based linguistic representations. We contribute foundational libraries and resources to the SLP community, setting the stage for a more in-depth exploration of the tasks of sign language translation and production: translating sign language from video to spoken-language text and vice versa. Through empirical evaluations, we establish the efficacy of our transcription method as a pivot that enables faster, more targeted research and leads to more natural and accurate translations across a range of languages. The universal nature of our transcription-based paradigm also paves the way for real-time, multilingual applications in SLP, offering a more inclusive and accessible approach to language technology. This is a significant step toward universal accessibility, extending the reach of AI-driven language technologies to the deaf and hard-of-hearing community.