We use a different context window for each utterance when predicting its emotion. New modules realize this variable-length context: 1) two speaker-aware units, which explicitly model inner- and inter-speaker dependencies to form a distilled conversational context, and 2) a top-k normalization layer, which selects the most appropriate context windows from that conversational context for emotion prediction. Experiments and ablation studies show that our approach outperforms several strong baselines on three public datasets.
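To make the idea of selecting context windows with a top-k normalization step more concrete, here is a minimal sketch. It is not the formulation from this work, which the abstract does not spell out. It assumes that each candidate context window has already been given a scalar relevance score and a context vector, keeps the k highest-scoring windows, and applies a softmax over only those. The names `top_k_normalize`, `weighted_context`, `scores`, and `window_vectors` are hypothetical and chosen purely for illustration.

```python
import math


def top_k_normalize(scores, k):
    """Keep the k largest scores, softmax over them, and zero out the rest.

    Assumption: `scores` holds one relevance score per candidate context
    window. This is an illustrative stand-in, not the paper's exact layer.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # Indices of the k highest-scoring windows (ties go to the lower index).
    keep = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    # Subtract the largest kept score before exponentiating, for stability.
    peak = max(scores[i] for i in keep)
    exps = {i: math.exp(scores[i] - peak) for i in keep}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]


def weighted_context(window_vectors, weights):
    """Combine per-window context vectors using the sparse weights."""
    dim = len(window_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, window_vectors))
        for d in range(dim)
    ]


# Hypothetical scores for four candidate windows around one utterance.
scores = [0.1, 2.0, 0.5, 2.0]
window_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]

weights = top_k_normalize(scores, k=2)
context = weighted_context(window_vectors, weights)
```

Because the weights are exactly zero outside the selected windows, each utterance effectively attends to its own subset of the conversation, which is one plausible reading of a variable-length context.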