In human-human conversations, Context Tracking deals with identifying important entities and keeping track of their properties and relationships. This is a challenging problem that encompasses several subtasks such as slot tagging, coreference resolution, resolving plural mentions and entity linking. We approach this problem as an end-to-end modeling task where the conversational context is represented by an entity repository containing the entity references mentioned so far, their properties and the relationships between them. The repository is updated turn-by-turn, thus making training and inference computationally efficient even for long conversations. This paper lays the groundwork for an investigation of this framework in two ways. First, we release Contrack, a large-scale human-human conversation corpus for context tracking with people and location annotations. It contains over 7,000 conversations with an average of 11.8 turns, 5.8 entities and 15.2 references per conversation. Second, we open-source a neural network architecture for context tracking. Finally, we compare this network to state-of-the-art approaches for the subtasks it subsumes and report on the tradeoffs involved.
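To make the entity-repository idea concrete, the following is a minimal, hypothetical sketch (not the released Contrack architecture) of a repository that accumulates entity references, their properties and their relations turn by turn, so that each update only processes the newest utterance; all class and field names are illustrative assumptions.

```python
# Hypothetical sketch of a turn-by-turn entity repository; names are illustrative.
from dataclasses import dataclass, field


@dataclass
class EntityReference:
    mention: str                      # surface form, e.g. "Alice" or "the cafe"
    entity_type: str                  # e.g. "person" or "location"
    properties: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)   # (relation, other_reference_index) pairs


class EntityRepository:
    """Accumulates entity references and their relationships across turns."""

    def __init__(self):
        self.references: list[EntityReference] = []

    def update(self, turn_references: list[EntityReference]):
        # In the end-to-end setting, a model would predict these references
        # (and their links to existing ones) from the latest turn; here we
        # simply append them, keeping per-turn work independent of dialog length.
        self.references.extend(turn_references)


repo = EntityRepository()
repo.update([EntityReference("Alice", "person"),
             EntityReference("the cafe", "location")])
repo.update([EntityReference("she", "person",
                             relations=[("corefers_with", 0)])])
print(len(repo.references))  # 3 references tracked after two turns
```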