When reading a story, humans can rapidly understand new fictional characters from only a few observations, mainly by drawing analogies to fictional and real people they have met before in their lives. This reflects the few-shot and meta-learning essence of humans' inference of characters' mental states, i.e., humans' theory-of-mind (ToM), which is largely ignored by existing research. We fill this gap with a novel NLP benchmark, TOM-IN-AMC, the first assessment of models' ability to meta-learn ToM in a realistic narrative-understanding scenario. Our benchmark consists of $\sim$1,000 parsed movie scripts, each corresponding to a few-shot character-understanding task, and requires models to mimic humans' ability to rapidly digest characters from a few starting scenes in a new movie. Our human study verifies that humans can solve our task by inferring characters' mental states based on movies they have previously seen, while state-of-the-art metric-learning and meta-learning approaches adapted to our task lag 30% behind.