Approaches to stance classification, an important task for understanding argumentation in debates and detecting fake news, have relied on models that handle individual debate topics. In this paper, to train a topic-independent system, we propose a new method that extracts silver-labeled data from raw text for finetuning a stance classification model. The extraction relies on specific discourse relation information, which is shown to be a reliable and accurate source of stance information. We also propose a 3-stage training framework in which the noise level of the finetuning data decreases from stage to stage, going from the noisiest data to the least noisy. Detailed experiments show that both the automatically annotated dataset and the 3-stage training improve model performance on stance classification. Our approach ranks 1st among 26 competing teams in the stance classification track of the NLPCC 2021 shared task Argumentative Text Understanding for AI Debater, which confirms its effectiveness.
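As a rough illustration of the two ideas above, the following minimal sketch derives silver stance labels from discourse connectives and orders finetuning stages from noisiest to least noisy. The abstract does not list the discourse relations, connectives, or datasets actually used, so the connective-to-label mapping, the stage names, the noise estimates, and the function names `extract_silver_pair` and `order_stages` are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# ASSUMPTION: an illustrative connective-to-stance mapping. The abstract only
# says that "specific discourse relation information" is used; this list is
# not taken from the paper.
CONNECTIVE_TO_STANCE: Dict[str, str] = {
    "because": "support",
    "therefore": "support",
    "but": "against",
    "however": "against",
}

# Matches "<left> <connective> <right>", allowing a comma or semicolon
# before the connective and a comma after it.
_PATTERN = re.compile(
    r"^(?P<left>.+?)[,;]?\s+\b(?P<conn>"
    + "|".join(CONNECTIVE_TO_STANCE)
    + r")\b,?\s+(?P<right>.+)$",
    re.IGNORECASE,
)


def extract_silver_pair(sentence: str) -> Optional[Tuple[str, str, str]]:
    """Split a sentence at the first known connective.

    Returns (left_span, right_span, silver_label), or None when no
    connective from the hypothetical mapping is found.
    """
    match = _PATTERN.match(sentence.strip())
    if match is None:
        return None
    label = CONNECTIVE_TO_STANCE[match.group("conn").lower()]
    right = match.group("right").rstrip(" .!?")
    return match.group("left"), right, label


def order_stages(noise_by_stage: Dict[str, float]) -> List[str]:
    """Return stage names sorted from the noisiest to the least noisy."""
    return sorted(noise_by_stage, key=noise_by_stage.get, reverse=True)


if __name__ == "__main__":
    print(extract_silver_pair("Nuclear power is safe, but the waste is dangerous."))
    # ASSUMPTION: stage names and noise estimates are made up for illustration.
    print(order_stages({"gold": 0.0, "silver_filtered": 0.2, "silver_raw": 0.5}))
```

In such a setup, a model would be finetuned on each stage in the returned order, so that the cleanest data is seen last; how the stages are actually built is described in the paper itself, not in this sketch.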