Camera-based physiological measurement is a growing field in which neural models provide state-of-the-art performance. Prior research has explored various "end-to-end" models; however, these methods still require several preprocessing steps. These additional operations are often non-trivial to implement, which makes replication and deployment difficult, and they can even have a higher computational budget than the "core" network itself. In this paper, we propose two novel and efficient neural models for camera-based physiological measurement, called EfficientPhys, that remove the need for face detection, segmentation, normalization, color space transformation, or any other preprocessing steps. Using an input of raw video frames, our models achieve strong performance on three public datasets. We show that this is the case whether using a transformer or convolutional backbone. We further evaluate the latency of the proposed networks and show that our most lightweight network also achieves a 33% improvement in efficiency.