Modern Review Helpfulness Prediction systems rely on multiple input modalities, typically text and images. Unfortunately, contemporary approaches pay little attention to polishing the representations of cross-modal relations and tend to suffer from suboptimal optimization, which can harm the model's predictions in numerous cases. To overcome these issues, we propose Multimodal Contrastive Learning for the Multimodal Review Helpfulness Prediction (MRHP) problem, which concentrates on the mutual information between input modalities to explicitly model cross-modal relations. In addition, we introduce an Adaptive Weighting scheme for our contrastive learning approach to increase flexibility during optimization. Lastly, we propose a Multimodal Interaction module to address the unaligned nature of multimodal data, thereby helping the model produce more reasonable multimodal representations. Experimental results show that our method outperforms prior baselines and achieves state-of-the-art results on two publicly available benchmark datasets for the MRHP problem.
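To make the cross-modal contrastive objective concrete, the following is a minimal sketch, not the authors' implementation: a symmetric InfoNCE-style loss between text and image features with per-pair adaptive weights. The function name `weighted_cross_modal_infonce`, the inputs `text_emb`, `image_emb`, and `weights`, and the particular weighting choice are illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch: a weighted, symmetric InfoNCE loss over paired text/image
# features. The adaptive-weighting rule itself is NOT specified here; the
# `weights` tensor is simply a per-pair rescaling placeholder.
import torch
import torch.nn.functional as F


def weighted_cross_modal_infonce(text_emb: torch.Tensor,
                                 image_emb: torch.Tensor,
                                 weights: torch.Tensor,
                                 temperature: float = 0.07) -> torch.Tensor:
    """Pull matched text/image pairs together and push mismatched pairs apart."""
    text = F.normalize(text_emb, dim=-1)
    image = F.normalize(image_emb, dim=-1)
    logits = text @ image.t() / temperature                # (batch, batch) similarities
    targets = torch.arange(text.size(0), device=text.device)
    # Symmetric InfoNCE: text-to-image and image-to-text directions.
    loss_t2i = F.cross_entropy(logits, targets, reduction="none")
    loss_i2t = F.cross_entropy(logits.t(), targets, reduction="none")
    per_pair = 0.5 * (loss_t2i + loss_i2t)
    return (weights * per_pair).mean()                     # adaptive per-pair weighting


if __name__ == "__main__":
    torch.manual_seed(0)
    t = torch.randn(8, 128)   # stand-in text features from some text encoder
    v = torch.randn(8, 128)   # stand-in image features from some image encoder
    w = torch.ones(8)         # uniform weights; an adaptive scheme would vary these
    print(weighted_cross_modal_infonce(t, v, w).item())
```

In this sketch, an adaptive weighting scheme would replace the uniform `w` with weights computed from training signals (for example, how difficult each pair currently is), which is one plausible way to realize the flexibility in optimization the abstract refers to.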