Debate preparation has traditionally been a manual process: reading many articles, selecting claims, identifying the stance of each claim, finding evidence for the claims, and so on. As AI debate has attracted more attention in recent years, it is worth exploring methods to automate the tedious steps involved in a debating system. In this work, we introduce IAM, a large and comprehensive dataset that can be applied to a series of argument mining tasks, including claim extraction, stance classification, and evidence extraction. The dataset is collected from over 1k articles related to 123 topics. Nearly 70k sentences in the dataset are fully annotated according to their argument properties (e.g., claims, stances, evidence). We further propose two new integrated argument mining tasks associated with the debate preparation process: (1) claim extraction with stance classification (CESC) and (2) claim-evidence pair extraction (CEPE). For each integrated task, we adopt a pipeline approach and an end-to-end method separately. We report promising experimental results that show the value and challenges of the proposed tasks and motivate future research on argument mining.
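To make the pipeline formulation of CESC concrete, the following is a minimal sketch under stated assumptions, not the method used in this work. It decomposes the task into a claim detector followed by a stance classifier applied only to detected claims. The names `LabeledSentence`, `cesc_pipeline`, `toy_is_claim`, and `toy_stance`, as well as the stance labels `"for"` and `"against"`, are hypothetical and chosen for illustration; the keyword rules merely stand in for trained models.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LabeledSentence:
    """One sentence with its (hypothetical) CESC output labels."""
    text: str
    is_claim: bool
    stance: Optional[str]  # set only when the sentence is a claim


def cesc_pipeline(
    topic: str,
    sentences: List[str],
    is_claim: Callable[[str, str], bool],
    stance_of: Callable[[str, str], str],
) -> List[LabeledSentence]:
    """Pipeline sketch: step 1 detects claims, step 2 classifies their stance."""
    results = []
    for sentence in sentences:
        if is_claim(topic, sentence):
            # The stance classifier is only run on sentences kept by step 1.
            results.append(LabeledSentence(sentence, True, stance_of(topic, sentence)))
        else:
            results.append(LabeledSentence(sentence, False, None))
    return results


# Toy stand-ins for trained classifiers (assumptions for illustration only).
def toy_is_claim(topic: str, sentence: str) -> bool:
    return "should" in sentence.lower()


def toy_stance(topic: str, sentence: str) -> str:
    return "against" if " not " in f" {sentence.lower()} " else "for"


if __name__ == "__main__":
    demo = cesc_pipeline(
        "School uniforms",
        [
            "Schools should adopt uniforms.",
            "Uniforms were introduced in 1900.",
            "Schools should not adopt uniforms.",
        ],
        toy_is_claim,
        toy_stance,
    )
    for item in demo:
        print(item)
```

An end-to-end variant would instead assign a single joint label per sentence (for example, non-claim, claim in favor, or claim against), so that errors from the claim detector do not propagate into the stance step.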