Cross-modal recipe retrieval has attracted research attention in recent years, thanks to the availability of large-scale paired data for training. However, obtaining enough recipe-image pairs to cover the majority of cuisines for supervised learning is difficult, if not impossible. Domain adaptation addresses this practical problem by transferring knowledge learnt from a data-rich cuisine to a data-scarce cuisine. Existing works, however, assume that recipes in the source and target domains mostly originate from the same cuisine and are written in the same language. This paper studies unsupervised domain adaptation for image-to-recipe retrieval, where recipes in the source and target domains are in different languages. Moreover, only recipes, without paired images, are available for training in the target domain. A novel recipe mixup method is proposed to learn transferable embedding features between the two domains. Specifically, recipe mixup produces mixed recipes that form an intermediate domain by discretely exchanging one or more sections between source and target recipes. To bridge the domain gap, a recipe mixup loss is proposed that encourages the intermediate domain to lie on the shortest geodesic path between the source and target domains in the recipe embedding space. Using the Recipe 1M dataset as the source domain (English) and the Vireo-FoodTransfer dataset as the target domain (Chinese), experiments verify the effectiveness of recipe mixup for cross-lingual adaptation in the context of image-to-recipe retrieval.
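The discrete section exchange behind recipe mixup can be illustrated with a minimal sketch. It is not the authors' implementation: the section names (`title`, `ingredients`, `instructions`), the function `recipe_mixup`, and the mixing ratio `lam` (the fraction of sections kept from the source recipe) are all assumptions introduced here for illustration, and the mixup loss itself is not shown.

```python
import random
from typing import Dict, Optional, Tuple

# Assumed section names; the abstract only says recipes are split into sections.
SECTIONS = ("title", "ingredients", "instructions")


def recipe_mixup(
    source: Dict[str, str],
    target: Dict[str, str],
    rng: Optional[random.Random] = None,
) -> Tuple[Dict[str, str], float]:
    """Build a mixed recipe by swapping whole sections (hypothetical helper).

    Returns the mixed recipe and `lam`, the fraction of sections taken
    from the source recipe (an assumed definition of the mixing ratio).
    """
    rng = rng or random.Random()
    # Take at least one section from each domain so the result is truly mixed.
    k = rng.randint(1, len(SECTIONS) - 1)
    from_target = set(rng.sample(SECTIONS, k))
    mixed = {
        name: (target[name] if name in from_target else source[name])
        for name in SECTIONS
    }
    lam = 1.0 - k / len(SECTIONS)
    return mixed, lam


if __name__ == "__main__":
    english = {
        "title": "Mapo tofu",
        "ingredients": "tofu, minced pork, chili bean paste",
        "instructions": "Stir-fry the pork, add the paste, then simmer the tofu.",
    }
    chinese = {
        "title": "麻婆豆腐",
        "ingredients": "豆腐, 猪肉末, 豆瓣酱",
        "instructions": "炒香肉末, 加入豆瓣酱, 放入豆腐煮熟。",
    }
    mixed, lam = recipe_mixup(english, chinese, random.Random(0))
    print(lam, mixed)
```

Sections are swapped whole rather than interpolated, which matches the "discrete" exchange described above; a ratio such as `lam` could then indicate how close a mixed recipe is to each domain, although the abstract does not specify how the loss uses it.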