What would it take for a natural language model to understand a novel such as The Lord of the Rings? Among other things, such a model must be able to: (a) identify and record new characters (entities) and their attributes as they are introduced in the text, and (b) identify subsequent references to previously introduced characters and update their attributes. This problem of entity tracking is essential for language understanding and is therefore useful for a wide array of downstream NLP applications, such as question answering and summarization. In this thesis, we focus on two key problems in facilitating the use of entity tracking models: (i) scaling entity tracking models to long documents, such as a novel, and (ii) integrating entity tracking into language models. Applying language technologies to long documents has recently garnered interest, but computational constraints are a significant bottleneck in scaling up current methods. We argue that computationally efficient entity tracking models can be developed by representing entities with rich, fixed-dimensional vector representations derived from pretrained language models, and by exploiting the ephemeral nature of entities. We also argue for integrating entity tracking into language models, as it will allow for: (i) wider application, given the current ubiquitous use of pretrained language models in NLP applications, and (ii) easier adoption, since it is much easier to swap in a new pretrained language model than to integrate a separate standalone entity tracking model.
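To make the efficiency argument concrete, the following is a minimal sketch of a bounded-memory entity tracker, not the method developed in the thesis. It assumes that each mention is already encoded as a fixed-dimensional vector (toy 2-dimensional lists stand in for representations derived from a pretrained language model), that a mention refers to an existing entity when their cosine similarity exceeds a threshold, and that the least recently mentioned entity can be evicted once memory is full. The names `Entity`, `BoundedEntityTracker`, `capacity`, and `threshold` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    entity_id: int
    vector: List[float]   # fixed-dimensional entity representation
    mention_count: int    # number of mentions merged into this entity
    last_seen: int        # step of the most recent mention


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class BoundedEntityTracker:
    """Tracks at most `capacity` entities (assumed design, for illustration)."""

    def __init__(self, capacity: int = 2, threshold: float = 0.8):
        self.capacity = capacity
        self.threshold = threshold
        self.memory: List[Entity] = []
        self.next_id = 0
        self.step = 0

    def observe(self, mention: List[float]) -> int:
        """Process one mention vector and return the entity id it is assigned."""
        self.step += 1
        best = max(self.memory, key=lambda e: cosine(e.vector, mention), default=None)
        if best is not None and cosine(best.vector, mention) >= self.threshold:
            # (b) Reference to a known entity: update its vector (running mean).
            n = best.mention_count
            best.vector = [(n * x + m) / (n + 1) for x, m in zip(best.vector, mention)]
            best.mention_count = n + 1
            best.last_seen = self.step
            return best.entity_id
        # (a) New entity: if memory is full, evict the least recently mentioned
        # one, on the assumption that entities are ephemeral.
        if len(self.memory) >= self.capacity:
            stale = min(self.memory, key=lambda e: e.last_seen)
            self.memory.remove(stale)
        entity = Entity(self.next_id, list(mention), 1, self.step)
        self.next_id += 1
        self.memory.append(entity)
        return entity.entity_id


tracker = BoundedEntityTracker(capacity=2, threshold=0.8)
mentions = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
assigned_ids = [tracker.observe(m) for m in mentions]
```

Because the memory never holds more than `capacity` entities, the cost of processing each mention is independent of document length in this sketch, which is the property that makes scaling to a novel-length text plausible.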