The performance of facial attribute (e.g., age and attractiveness) estimation has been greatly improved by convolutional neural networks. However, existing methods suffer from an inconsistency between their training objectives and the evaluation metric, so they may be suboptimal. In addition, these methods always adopt image classification or face recognition models with a large number of parameters, which incur expensive computation cost and storage overhead. In this paper, we first analyze the essential relationship between two state-of-the-art methods (Ranking-CNN and DLDL) and show that the Ranking method is in fact learning a label distribution implicitly. This result unifies, for the first time, these two popular state-of-the-art methods within the DLDL framework. Second, in order to alleviate the inconsistency and reduce resource consumption, we design a lightweight network architecture and propose a unified framework that jointly learns the facial attribute distribution and regresses the attribute value. We demonstrate the effectiveness of our approach on both facial age and attractiveness estimation tasks. Our method achieves new state-of-the-art results using a single model with 36$\times$ fewer parameters and 3$\times$ faster inference on facial age/attractiveness estimation. Moreover, our method achieves results comparable to the state of the art even when the number of parameters is further reduced to 0.9M (3.8 MB of disk storage).
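To make the idea of jointly learning a distribution and regressing a value concrete, the following is a minimal sketch rather than the implementation used in the paper. It assumes a discretized Gaussian as the target label distribution, a Kullback-Leibler term for the distribution, and an L1 term on the expected value of the predicted distribution; the function names, `sigma`, and the weight `lam` are illustrative assumptions. The last function shows one common way to read ranking outputs of the form P(y > k) as an implicit label distribution, which may differ from the analysis in the paper.

```python
import math


def gaussian_label_distribution(label, bins, sigma=2.0):
    """Discretized Gaussian centred on the label, normalized to sum to 1.
    The Gaussian shape and sigma are assumptions made for this sketch."""
    weights = [math.exp(-((b - label) ** 2) / (2.0 * sigma ** 2)) for b in bins]
    total = sum(weights)
    return [w / total for w in weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_value(probs, bins):
    """Expectation of the attribute value under a distribution over bins."""
    return sum(p * b for p, b in zip(probs, bins))


def joint_loss(logits, label, bins, sigma=2.0, lam=1.0, eps=1e-12):
    """KL(target || prediction) plus lam * |E[prediction] - label|.
    Returns (total, kl, l1). lam is a hypothetical trade-off weight."""
    target = gaussian_label_distribution(label, bins, sigma)
    pred = softmax(logits)
    kl = sum(t * math.log((t + eps) / (p + eps)) for t, p in zip(target, pred))
    l1 = abs(expected_value(pred, bins) - label)
    return kl + lam * l1, kl, l1


def ranking_to_distribution(exceed_probs):
    """Turn K-1 ranking outputs P(y > bin_k) into a K-bin distribution by
    differencing. Assumes the inputs are non-increasing and lie in [0, 1]."""
    survival = [1.0] + list(exceed_probs) + [0.0]
    return [survival[k] - survival[k + 1] for k in range(len(survival) - 1)]
```

For example, with `bins = list(range(101))` and a uniform prediction (all logits zero), the expected value is 50, so a label of 30 yields an L1 term of 20 on top of a positive KL term. The L1 term is measured in the same units as a mean-absolute-error metric, which is one way to reduce the gap between training objective and evaluation metric that the abstract describes.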