In this paper, we define the task of gender rewriting in contexts involving two users (I and/or You), that is, first and second grammatical persons with independent grammatical gender preferences. We focus on Arabic, a gender-marking, morphologically rich language. We develop a multi-step system that combines the strengths of rule-based and neural rewriting models. Our results demonstrate the viability of this approach on a recently created corpus for Arabic gender rewriting, achieving 88.42 M2 F0.5 on a blind test set. Our proposed system improves over previous work on the first-person-only version of this task by a 3.05 absolute increase in M2 F0.5. We demonstrate a use case of our gender rewriting system by using it to post-edit the output of a commercial MT system, providing personalized outputs based on the users' grammatical gender preferences. We make our code, data, and models publicly available.