Facial landmark detection is a widely researched area of deep learning with applications in many fields. Facial landmarks, or key points, are distinguishing points on the face, such as the eye centers, the inner and outer corners of the eyes, the mouth center, and the nose tip, from which human emotion and intent can be inferred. Our work evaluates transfer learning models such as MobileNetV2 and NasNetMobile alongside custom CNN architectures. The objective is to develop deep learning models that are efficient in terms of model size, parameter count, and inference time, and to study the effect of augmentation, imputation, and fine-tuning on these models. We found that augmentation techniques produced lower RMSE scores than imputation techniques but did not affect inference time. The MobileNetV2 architecture produced the lowest RMSE and inference time. Moreover, our results indicate that manually optimized CNN architectures performed similarly to the Auto Keras tuned architecture, while yielding better inference time and training curves.
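For readers unfamiliar with the two quantities compared above, the following is a minimal sketch of how missing landmark coordinates might be imputed and how RMSE is computed over predicted coordinates. The abstract does not say which imputation methods were used; column-mean imputation, the function names `mean_impute` and `rmse`, and the flat `[x0, y0, x1, y1, ...]` label layout are all assumptions made for illustration only.

```python
import math


def mean_impute(samples):
    """Fill missing coordinates (None) with the per-column mean of observed values.

    Assumption: each row is a flat list of landmark coordinates and every
    column has at least one observed (non-None) value.
    """
    n_cols = len(samples[0])
    means = []
    for c in range(n_cols):
        observed = [row[c] for row in samples if row[c] is not None]
        means.append(sum(observed) / len(observed))
    return [
        [means[c] if row[c] is None else row[c] for c in range(n_cols)]
        for row in samples
    ]


def rmse(y_true, y_pred):
    """Root-mean-square error over all coordinates of all samples."""
    squared = [
        (t - p) ** 2
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy labels: three samples, one landmark (x, y) each.
labels = [[1.0, 2.0], [3.0, None], [5.0, 6.0]]
filled = mean_impute(labels)
```

In this sketch the missing y value is replaced by the mean of the two observed y values. Imputation of this kind changes only the training labels, not the network, which is in line with the abstract's observation that the data-handling strategy affected RMSE but not inference time.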