Reasoning is one of the major challenges of human-like AI and has recently attracted intensive attention from natural language processing (NLP) researchers. However, cross-modal reasoning needs further research. For cross-modal reasoning, we observe that most methods fall into shallow feature matching without in-depth human-like reasoning. The reason is that existing cross-modal tasks directly pose questions about an image. However, human reasoning in real scenes is often conducted under specific background information, a process studied by the ABC theory in social psychology. We propose a shared task named "Premise-based Multimodal Reasoning" (PMR), which requires participating models to reason after establishing a profound understanding of background information. We believe that the proposed PMR would contribute to and help shed light on human-like in-depth reasoning.