Modern deep learning techniques have enabled advances in image-based dietary assessment, such as food recognition and food portion size estimation. Information on the types of foods and the amounts consumed is crucial for the prevention of many chronic diseases. However, existing methods for automated image-based food analysis are neither end-to-end nor capable of handling multiple tasks (e.g., recognition and portion estimation) together, which makes them difficult to use in real-life applications. In this paper, we propose an image-based food analysis framework that integrates food localization, classification, and portion size estimation. Our proposed framework is end-to-end: the input can be an arbitrary food image containing multiple food items, and our system localizes each food item and predicts its food type and portion size. We also improve single-food portion estimation by combining the localization results with a food energy distribution map obtained from a conditional GAN to generate a four-channel RGB-Distribution image. Our end-to-end framework is evaluated on a real-life food image dataset collected from a nutrition feeding study.
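To make the four-channel RGB-Distribution input more concrete, the following is a minimal sketch of how a localized food item might be cropped and stacked with its energy distribution map. It is an illustration under stated assumptions, not the authors' implementation: the function name `make_rgb_distribution`, the `(x0, y0, x1, y1)` box format, the scaling of RGB values to [0, 1], and the assumption that the energy map has the same height and width as the image are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def make_rgb_distribution(rgb, energy_map, box):
    """Crop one localized food item and stack RGB with its energy map.

    rgb:        H x W x 3 uint8 image.
    energy_map: H x W float array (assumed here to be the conditional GAN
                output, already resized to the image resolution).
    box:        (x0, y0, x1, y1) pixel coordinates (hypothetical format).
    """
    if rgb.shape[:2] != energy_map.shape:
        raise ValueError("energy map must match the image height and width")
    x0, y0, x1, y1 = box
    # Crop both inputs to the localized region.
    rgb_crop = rgb[y0:y1, x0:x1].astype(np.float32) / 255.0
    energy_crop = energy_map[y0:y1, x0:x1].astype(np.float32)
    # Append the energy map as a fourth channel: result is h x w x 4.
    return np.concatenate([rgb_crop, energy_crop[..., None]], axis=-1)


# Toy inputs standing in for a real image and a predicted energy map.
image = np.zeros((64, 48, 3), dtype=np.uint8)
energy = np.ones((64, 48), dtype=np.float32)
patch = make_rgb_distribution(image, energy, (8, 16, 40, 48))
```

In this sketch each detected food item yields its own four-channel patch, which a downstream portion estimation model could then take as input.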