Multimodal learning has achieved great success in many scenarios. Compared with unimodal learning, it can effectively combine information from different modalities to improve performance on learning tasks. In practice, however, multimodal data may have missing modalities for various reasons, such as sensor failure and data transmission errors. Previous works have not fully exploited the information in modality-missing data. To address this problem, we propose an efficient approach based on maximum likelihood estimation that incorporates the knowledge in the modality-missing data. Specifically, we design a likelihood function to characterize the conditional distribution of the modality-complete data and the modality-missing data, which is theoretically optimal. Moreover, we develop a generalized form of the softmax function to implement maximum likelihood estimation effectively in an end-to-end manner. This training strategy guarantees the computability of our algorithm. Finally, we conduct a series of experiments on real-world multimodal datasets. The results demonstrate the effectiveness of the proposed approach, even when 95% of the training data has missing modalities.
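To make the training objective concrete, the following is a minimal sketch of maximum-likelihood training over a mix of modality-complete and modality-missing samples, assuming a two-modality classification setup. The encoders enc_a and enc_b, the two prediction heads, and the simple sum of the two negative log-likelihood terms are illustrative assumptions, not the paper's exact generalized-softmax formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical model: modality A is always observed, modality B may be missing.
class MissingModalityClassifier(nn.Module):
    def __init__(self, dim_a: int, dim_b: int, hidden: int, num_classes: int):
        super().__init__()
        self.enc_a = nn.Sequential(nn.Linear(dim_a, hidden), nn.ReLU())
        self.enc_b = nn.Sequential(nn.Linear(dim_b, hidden), nn.ReLU())
        self.head_full = nn.Linear(2 * hidden, num_classes)  # modality-complete input
        self.head_a = nn.Linear(hidden, num_classes)         # modality B missing

    def forward(self, x_a, x_b=None):
        h_a = self.enc_a(x_a)
        if x_b is None:
            return self.head_a(h_a)
        h_b = self.enc_b(x_b)
        return self.head_full(torch.cat([h_a, h_b], dim=-1))

def negative_log_likelihood(model, batch_full, batch_missing):
    """Joint negative log-likelihood over both kinds of samples.
    Softmax + cross-entropy plays the role of the likelihood term here."""
    x_a_f, x_b_f, y_f = batch_full      # modality-complete samples
    x_a_m, y_m = batch_missing          # modality-missing samples
    loss_full = F.cross_entropy(model(x_a_f, x_b_f), y_f)
    loss_miss = F.cross_entropy(model(x_a_m), y_m)
    return loss_full + loss_miss
```

In a full implementation, the two loss terms could be weighted, for example according to the fraction of modality-missing samples in the training set; this sketch simply sums them for clarity.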