Numerous visio-linguistic (V+L) representation learning methods have been developed, yet existing datasets do not evaluate the extent to which they represent visual and linguistic concepts in a unified space. Inspired by the cross-lingual transfer and psycholinguistics literature, we propose a novel evaluation setting for V+L models: zero-shot cross-modal transfer. Existing V+L benchmarks also often report global accuracy scores on the entire dataset, rendering it difficult to pinpoint the specific reasoning tasks that models fail and succeed at. To address this issue and enable the evaluation of cross-modal transfer, we present TraVLR, a synthetic dataset comprising four V+L reasoning tasks. Each example encodes the scene bimodally such that either modality can be dropped during training/testing with no loss of relevant information. TraVLR's training and testing distributions are also constrained along task-relevant dimensions, enabling the evaluation of out-of-distribution generalisation. We evaluate four state-of-the-art V+L models and find that although they perform well on test examples of the same modality seen during training, all models fail to transfer cross-modally and have limited success accommodating the addition or deletion of one modality. In alignment with prior work, we also find these models to require large amounts of data to learn simple spatial relationships. We release TraVLR as an open challenge for the research community.
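To illustrate the zero-shot cross-modal transfer setting described above, the following minimal sketch (all names and the data schema are hypothetical and not part of the released TraVLR code) shows how a bimodally encoded example might be restricted to a single modality, e.g. training on the textual rendering of each scene and testing on the visual one.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BimodalExample:
    """A scene encoded redundantly in both modalities (hypothetical schema)."""
    image: Optional[object]   # visual rendering of the scene
    caption: Optional[str]    # textual rendering of the same scene
    label: int                # task label, e.g. whether the statement holds

def drop_modality(example: BimodalExample, keep: str) -> BimodalExample:
    """Return a copy of the example with only one modality retained.

    Because each scene is fully described by either modality, dropping one
    loses no task-relevant information.
    """
    if keep == "image":
        return BimodalExample(image=example.image, caption=None, label=example.label)
    if keep == "text":
        return BimodalExample(image=None, caption=example.caption, label=example.label)
    raise ValueError(f"unknown modality: {keep!r}")

# Zero-shot cross-modal transfer: train on one modality, test on the other.
# train_set = [drop_modality(ex, keep="text") for ex in travlr_train]
# test_set  = [drop_modality(ex, keep="image") for ex in travlr_test]
```

The same helper also supports the same-modality and modality-addition/deletion conditions by varying which modality is kept at training versus testing time.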