Most previous studies integrate cognitive language processing signals (e.g., eye-tracking or EEG data) into neural models of natural language processing (NLP) simply by concatenating word embeddings with cognitive features directly, ignoring both the gap between the two modalities (i.e., textual vs. cognitive) and the noise present in cognitive features. In this paper, we propose CogAlign, an approach to these issues that learns to align textual neural representations with cognitive features. In CogAlign, we use a shared encoder equipped with a modality discriminator to alternately encode textual and cognitive inputs, capturing both their differences and their commonalities. Additionally, a text-aware attention mechanism is proposed to detect task-related information and to mitigate noise in cognitive features. Experimental results on three NLP tasks, namely named entity recognition, sentiment analysis and relation extraction, show that CogAlign achieves significant improvements with multiple cognitive features over state-of-the-art models on public datasets. Moreover, our model is able to transfer cognitive information to other datasets that do not have any cognitive processing signals.
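The text-aware attention mechanism mentioned above can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's exact formulation: all dimensions, the bilinear scoring matrix `W`, and the final concatenation step are hypothetical choices made for the sketch. The idea shown is that textual hidden states score each word-level cognitive feature vector, so positions with noisy cognitive signals can receive low attention weight before fusion.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical dimensions: T tokens, d_t textual hidden size, d_c cognitive feature size.
T, d_t, d_c = 5, 8, 4
H_text = rng.normal(size=(T, d_t))   # textual hidden states (e.g., from a shared encoder)
C_cog  = rng.normal(size=(T, d_c))   # word-level cognitive features (e.g., eye-tracking)

# Text-aware attention (sketch): bilinear compatibility between textual states
# and cognitive vectors, normalized per token.
W = rng.normal(size=(d_t, d_c)) * 0.1
scores = H_text @ W @ C_cog.T          # (T, T) compatibility scores
alpha = softmax(scores, axis=-1)       # each row: attention over cognitive inputs
C_attended = alpha @ C_cog             # (T, d_c) text-aware cognitive representation

# Fuse modalities for the downstream task head (one possible fusion choice).
fused = np.concatenate([H_text, C_attended], axis=-1)  # (T, d_t + d_c)
```

Because the attention weights are computed from the textual representation, a token whose cognitive features are uninformative can be down-weighted by every other token's attention row, which is one way to read the abstract's claim of avoiding noise in cognitive features.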