Subjects change frequently in moderated debates with several participants, such as parliamentary sessions, electoral debates, and trials. Partitioning a debate into blocks that share the same subject is essential for understanding it. A moderator is often responsible for signaling when a new block begins, so the task of automatically partitioning a moderated debate can focus solely on the moderator's behavior. In this paper, we (i) propose a new algorithm, DEBACER, which partitions moderated debates; (ii) carry out a comparative study between conventional and BERTimbau pipelines; and (iii) validate DEBACER by applying it to the minutes of the Assembly of the Republic of Portugal. Our results indicate that DEBACER partitions these minutes effectively.

Keywords: Natural Language Processing, Political Documents, Spoken Text Processing, Speech Split, Dialogue Partitioning.