Language models can be trained to recognize the moral sentiment of text, creating new opportunities to study the role of morality in human life. As interest in language and morality has grown, several ground-truth datasets with moral annotations have been released. However, these datasets differ in data collection method, domain, topics, annotator instructions, and other respects. Simply aggregating such heterogeneous datasets during training can yield models that fail to generalize well. We describe a data fusion framework for training on multiple heterogeneous datasets that improves performance and generalizability. The model uses domain adversarial training to align the datasets in feature space and a weighted loss function to deal with label shift. We show that the proposed framework achieves state-of-the-art performance on several datasets, outperforming prior work on morality inference.
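To make the two ingredients named above more concrete, the following is a minimal, dependency-free sketch, not the authors' implementation. It assumes a single-label setting, inverse-frequency class weights as one common way to counter label shift, and a gradient reversal layer as one common way to realize domain adversarial training. The abstract specifies none of these choices, and all function names, class names, and the example labels are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weights proportional to 1 / class frequency.

    Assumption: inverse-frequency weighting; the source only says
    "a weighted loss function". Weights average to 1 over classes.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood.

    probs:  list of dicts mapping class -> predicted probability
    labels: list of true classes, aligned with probs
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


class GradientReversal:
    """Identity on the forward pass; multiplies gradients by -lam on the
    backward pass, so the feature extractor is pushed to make datasets
    (domains) indistinguishable to the domain classifier.

    Assumption: this is the standard gradient-reversal formulation,
    written out by hand here because no autograd library is used.
    """

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, features):
        return list(features)

    def backward(self, grad_from_domain_classifier):
        return [-self.lam * g for g in grad_from_domain_classifier]


# Illustrative usage with made-up labels (not data from the paper).
train_labels = ["care", "care", "care", "fairness"]
weights = inverse_frequency_weights(train_labels)
uniform = [{"care": 0.5, "fairness": 0.5} for _ in train_labels]
loss = weighted_cross_entropy(uniform, train_labels, weights)

grl = GradientReversal(lam=0.5)
reversed_grad = grl.backward([2.0, -4.0])
```

In a real training loop the reversal layer would sit between the shared text encoder and a classifier that predicts which dataset an example came from, while the weighted loss is applied to the moral-label classifier.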