Damage to the inferior frontal gyrus (Broca's area) can cause agrammatic aphasia, in which patients retain comprehension but lose the ability to form complete sentences. The resulting communication gaps cause difficulties in daily life. Assistive devices can help mitigate these issues and enable patients to communicate effectively. However, because large-scale studies of linguistic deficits in aphasia are lacking, research on such assistive technology remains relatively limited. In this work, we present two contributions that aim to re-initiate research and development in this field. First, we propose a model that uses linguistic features reported in small-scale studies of aphasia patients to generate large-scale datasets of synthetic aphasic utterances from grammatically correct datasets. We show that the mean length of utterance, the noun/verb ratio, and the simple/complex sentence ratio of our synthetic datasets correspond to the reported features of aphasic speech. Second, we demonstrate how the synthetic datasets can be used to develop assistive devices for aphasia patients: a pre-trained T5 transformer is fine-tuned on the generated dataset to suggest five corrected sentences given an aphasic utterance as input. We evaluate the T5 model using BLEU and cosine semantic similarity scores, obtaining a BLEU score of 0.827/1.00 and a semantic similarity of 0.904/1.00. These results provide strong support for the idea that a synthetic dataset based on small-scale studies of aphasia can be used to develop effective assistive technology.
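To make two of the quantities named above concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that utterance length is counted in whitespace-separated words (clinical studies often count morphemes instead) and it substitutes a bag-of-words cosine for the semantic similarity score, which in practice would more likely be computed over sentence embeddings. The function names and example utterances are hypothetical.

```python
import math
from collections import Counter


def mean_length_of_utterance(utterances):
    """Average number of whitespace-separated tokens per non-empty utterance.

    Assumption: words stand in for morphemes as the unit of length.
    """
    lengths = [len(u.split()) for u in utterances if u.strip()]
    return sum(lengths) / len(lengths) if lengths else 0.0


def cosine_similarity(a, b):
    """Cosine between bag-of-words count vectors of two sentences.

    Assumption: a lexical stand-in for an embedding-based semantic score.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical utterances, invented for illustration only.
fluent = ["the boy is kicking the ball", "she went to the market yesterday"]
aphasic = ["boy kick ball", "she market"]

print(mean_length_of_utterance(fluent))   # 6.0
print(mean_length_of_utterance(aphasic))  # 2.5
print(cosine_similarity("boy kick ball", "the boy is kicking the ball"))
```

The drop in mean length from the fluent to the shortened utterances is the kind of feature the synthetic datasets are compared on; the cosine function only shows the form of the similarity computation, not the scores reported above.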