CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination

As humans, we can modify our assumptions about a scene by imagining alternative objects or concepts in our minds. For example, we can easily anticipate the implications of the sun being obscured by rain clouds (e.g., the street will get wet) and prepare accordingly. In this paper, we introduce a new task and dataset, Commonsense Reasoning for Counterfactual Scene Imagination (CoSIm), designed to evaluate the ability of AI systems to reason about imagined scene changes. Models are given an image and an initial question-response pair about the image. Next, a counterfactual imagined scene change (in textual form) is applied, and the model has to predict the new response to the initial question based on this scene change. We collect 3.5K high-quality and challenging data instances, each consisting of an image, a commonsense question with a response, a description of a counterfactual change, a new response to the question, and three distractor responses. Our dataset contains various complex scene change types (such as object addition/removal/state change, event description, and environment change) that require models to imagine many different scenarios and reason about the changed scenes. We present a baseline model based on a vision-language Transformer (i.e., LXMERT) along with ablation studies. Through human evaluation, we demonstrate a large human-model performance gap, suggesting room for promising future work on this challenging counterfactual scene imagination task. Our code and dataset are publicly available at: https://github.com/hyounghk/CoSIm
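To make the task format concrete, here is a minimal Python sketch of one data instance and a multiple-choice accuracy computation. It is only an illustration of the structure described in the abstract: the class name `CoSImInstance`, its field names, and the `accuracy` helper are hypothetical and are not taken from the released code or dataset files. Scoring by accuracy over the four candidate responses is likewise an assumption that follows from the one-correct-plus-three-distractors setup.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class CoSImInstance:
    """One instance as described in the abstract (all field names hypothetical)."""
    image_path: str            # the scene image
    question: str              # commonsense question about the image
    initial_response: str      # response before the scene change
    scene_change: str          # textual counterfactual change
    new_response: str          # correct response after the change
    distractors: Sequence[str] # three incorrect candidate responses

    def choices(self, rng: random.Random) -> List[str]:
        """Return the four candidate responses in shuffled order."""
        options = [self.new_response, *self.distractors]
        rng.shuffle(options)
        return options


# A predictor receives the instance and the shuffled options and
# returns the index of the option it selects.
Predictor = Callable[[CoSImInstance, List[str]], int]


def accuracy(instances: Sequence[CoSImInstance],
             predict: Predictor,
             seed: int = 0) -> float:
    """Fraction of instances where the predictor picks the new response."""
    if not instances:
        raise ValueError("need at least one instance")
    rng = random.Random(seed)
    correct = 0
    for inst in instances:
        options = inst.choices(rng)
        picked = predict(inst, options)
        correct += int(options[picked] == inst.new_response)
    return correct / len(instances)


if __name__ == "__main__":
    # Toy instance invented for illustration; it is not from the dataset.
    toy = CoSImInstance(
        image_path="street.jpg",
        question="What will the street look like in an hour?",
        initial_response="It will be dry and sunny.",
        scene_change="Rain clouds cover the sun.",
        new_response="It will be wet.",
        distractors=("It will be dry and sunny.",
                     "It will be covered in sand.",
                     "It will be on fire."),
    )
    # Baseline that always picks the first shuffled option.
    print(accuracy([toy], lambda inst, options: 0))
```

A real model would replace the lambda with a function that scores each option against the image, question, initial response, and scene change, then returns the index of the highest-scoring option.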

