In real-world dialog systems, the ability to understand the user's emotions and interact anthropomorphically is of great significance. Emotion Recognition in Conversation (ERC) is one of the key ways to accomplish this goal and has attracted growing attention. How to model the context in a conversation is a central aspect and a major challenge of ERC tasks. Most existing approaches struggle to adequately incorporate both global and local contextual information, and their network structures are overly sophisticated. For this reason, we propose a simple and effective Dual-stream Recurrence-Attention Network (DualRAN), which is based on Recurrent Neural Network (RNN) and Multi-head ATtention network (MAT). DualRAN eschews the complex components of current methods and focuses on combining recurrence-based methods with attention-based ones. DualRAN is a dual-stream structure mainly consisting of local- and global-aware modules, modeling a conversation simultaneously from distinct perspectives. In addition, we develop two single-stream network variants for DualRAN, i.e., SingleRANv1 and SingleRANv2. According to the experimental findings, DualRAN boosts the weighted F1 scores by 1.43% and 0.64% on the IEMOCAP and MELD datasets, respectively, in comparison to the strongest baseline. On two other datasets (i.e., EmoryNLP and DailyDialog), our method also attains competitive results.
翻译:暂无翻译