Background: Maintaining a healthy diet is vital for avoiding health-related issues such as undernutrition, obesity and many non-communicable diseases. Dietary assessment is an indispensable part of maintaining a healthy diet. Traditional manual recording methods are burdensome and prone to substantial biases and errors. Recent advances in Artificial Intelligence, especially computer vision, have made it possible to develop automatic dietary assessment solutions that monitor daily food intake more conveniently, in less time and even more accurately. Scope and approach: This review presents a unified Vision-Based Dietary Assessment (VBDA) framework, which generally consists of three stages: food image analysis, volume estimation and nutrient derivation. Vision-based food analysis methods, including food recognition, detection and segmentation, are systematically summarized, and methods for volume estimation and nutrient derivation are also covered. The rise of deep learning is gradually moving VBDA toward end-to-end implementations, in which food images are fed to a single network that directly estimates nutrition. Recently proposed end-to-end methods are also discussed. We further analyze existing dietary assessment datasets, indicating that a large-scale benchmark is urgently needed, and finally highlight key challenges and future trends for VBDA. Key findings and conclusions: After thorough exploration, we find that multi-task end-to-end deep learning approaches are an important trend in VBDA. Despite considerable research progress, many challenges remain for VBDA due to meal complexity. We also outline directions for the future development of VBDA, e.g., fine-grained food analysis and accurate volume estimation. This survey aims to encourage researchers to propose more practical solutions for VBDA.
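To make the three-stage framing concrete, the following is a minimal sketch of the final stage only (nutrient derivation), assuming that the earlier stages have already produced a food label and a volume estimate for each item. The names `FoodItem`, `FOOD_TABLE`, `derive_energy_kcal` and `assess_meal` are hypothetical, and the density and energy values are illustrative placeholders, not figures from the review or from any real food composition database.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    label: str         # assumed output of stage 1 (food image analysis)
    volume_cm3: float  # assumed output of stage 2 (volume estimation)


# Hypothetical lookup table: density in g/cm^3, energy in kcal per 100 g.
# Placeholder values for illustration only.
FOOD_TABLE = {
    "rice": {"density": 0.8, "kcal_per_100g": 130.0},
    "apple": {"density": 0.5, "kcal_per_100g": 50.0},
}


def derive_energy_kcal(item: FoodItem) -> float:
    """Stage 3 sketch: volume -> mass via density, then mass -> energy."""
    entry = FOOD_TABLE[item.label]
    mass_g = item.volume_cm3 * entry["density"]
    return mass_g * entry["kcal_per_100g"] / 100.0


def assess_meal(items: list[FoodItem]) -> float:
    """Sum the estimated energy over all recognized items in a meal."""
    return sum(derive_energy_kcal(item) for item in items)


meal = [FoodItem("rice", 200.0), FoodItem("apple", 100.0)]
total_kcal = assess_meal(meal)
```

In this staged design, errors in recognition or volume estimation propagate directly into the nutrient estimate, which is one motivation for the end-to-end approaches the review discusses.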