Argument Unit Recognition and Classification aims to identify argument units in text and classify them as pro or con. One design choice in developing systems for this task is the unit of classification: segments of tokens or full sentences. Previous research suggests that fine-tuning language models at the token level yields more robust sentence classification than training on sentences directly. We reproduce the study that originally made this claim and investigate what exactly token-based systems learn better than sentence-based ones. To this end, we develop systematic tests for analysing the behavioural differences between the token-based and the sentence-based system. Our results show that token-based models are generally more robust than sentence-based models, both on manually perturbed examples and on specific subpopulations of the data.