Raven's Progressive Matrices (RPMs) are frequently used to test human visual reasoning ability. Recent advances in RPM-like datasets and solution models partially address the challenges of visually understanding RPM questions and logically reasoning about the missing answers. Because the limited number of samples in RPM datasets leads to poor generalization, we propose an effective scheme, Candidate Answer Morphological Mixup (CAM-Mix). CAM-Mix is a data augmentation strategy based on gray-scale image morphological mixup; it regularizes various solution methods and mitigates model overfitting. By creating new negative candidate answers that are semantically similar to the correct answers, a more accurate decision boundary can be defined. Applying the proposed data augmentation method yields a significant and consistent performance improvement on various RPM-like datasets compared with state-of-the-art models.
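
To make the idea of mixing candidate answers concrete, the following is a minimal sketch and not the paper's actual implementation. It assumes that a gray-scale morphological mixup can be approximated by a pixel-wise maximum (union) or minimum (intersection) of the correct answer image and one existing negative candidate. The function names `morphological_mixup` and `augment_candidates`, the `mode` parameter, and the `(K, H, W)` array layout are all hypothetical choices made for illustration.

```python
import numpy as np


def morphological_mixup(correct, distractor, mode="union"):
    """Mix two gray-scale images with a pixel-wise lattice operation.

    Assumption (not from the paper): "union" is the pixel-wise maximum and
    "intersection" is the pixel-wise minimum of the two images.
    """
    if correct.shape != distractor.shape:
        raise ValueError("images must have the same shape")
    if mode == "union":
        return np.maximum(correct, distractor)
    if mode == "intersection":
        return np.minimum(correct, distractor)
    raise ValueError(f"unknown mode: {mode!r}")


def augment_candidates(candidates, correct_index, rng, mode="union"):
    """Append one mixed negative candidate to a (K, H, W) candidate array.

    The new candidate blends the correct answer with a randomly chosen
    negative, so it resembles the correct answer without (usually) matching
    it. If the chosen negative is dominated by the correct answer at every
    pixel, the union equals the correct answer; a real pipeline would need
    to filter such cases.
    """
    negatives = [i for i in range(len(candidates)) if i != correct_index]
    j = rng.choice(negatives)
    mixed = morphological_mixup(candidates[correct_index], candidates[j], mode)
    # The correct answer keeps its index; the new negative goes last.
    return np.concatenate([candidates, mixed[None]], axis=0)
```

Under these assumptions, calling `augment_candidates(candidates, correct_index, np.random.default_rng(0))` on an eight-candidate answer set returns nine candidates, with the label index unchanged and the harder negative appended at the end.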