For a large portion of real-life utterances, the intention cannot be determined solely from their semantic or syntactic characteristics. Although not all sociolinguistic and pragmatic information can be digitized, phonetic features at least are indispensable for understanding spoken language. In head-final languages such as Korean in particular, sentence-final prosody plays a major role in identifying the speaker's intention. This paper proposes a system that identifies the inherent intention of a spoken utterance from its transcript, using auxiliary acoustic features in some cases. The key point is a separate category for cases in which the intention cannot be determined without an acoustic cue. The proposed classification system thus decides whether a given utterance is a fragment, a statement, a question, a command, or a rhetorical question or command, exploiting the intonation-dependency that arises from head-finality. Drawing on the intuitive understanding of Korean that informed the data annotation, we construct a network that identifies the intention of an utterance and validate its utility on test sentences. Combined with an up-to-date speech recognizer, the system is expected to be easy to integrate into various language understanding modules.