Relation extraction typically aims to extract semantic relationships between entities from unstructured text. One of the most essential data sources for relation extraction is spoken language, such as interviews and dialogues. However, the error propagation introduced by automatic speech recognition (ASR) has been ignored in relation extraction, and end-to-end speech-based relation extraction has rarely been explored. In this paper, we propose a new listening information extraction task, i.e., speech relation extraction. We construct the training dataset for speech relation extraction via text-to-speech systems, and we construct the testing dataset via crowd-sourcing with native English speakers. We explore speech relation extraction via two approaches: a pipeline approach that conducts text-based extraction on the output of a pretrained ASR module, and an end-to-end approach based on a newly proposed encoder-decoder model, which we call SpeechRE. We conduct comprehensive experiments to identify the challenges in speech relation extraction, which may shed light on future explorations. We share the code and data at https://github.com/wutong8023/SpeechRE.