We present the first openly available multimodal metaphor-annotated corpus. The corpus consists of videos, including audio and subtitles, that have been annotated by experts. Furthermore, we present a method for detecting metaphors in the new dataset based on the textual content of the videos. The method achieves a high F1-score (62\%) for metaphorical labels. We also experiment with other modalities and multimodal methods; however, these methods did not outperform the text-based model. Our error analysis identifies cases where video could help in disambiguating metaphors, but the visual cues are too subtle for our model to capture. The data is available on Zenodo.