Automatically understanding and recognising human affective states from images using computer vision can improve human-computer and human-robot interaction. However, privacy has become an issue of great concern, as the identities of the people whose images are used to train affective models can be exposed in the process. For instance, malicious individuals could exploit images from users and assume their identities. In addition, affect recognition from images can lead to discrimination and algorithmic bias, as attributes such as race, gender, and age could be inferred from facial features. Possible solutions to protect the privacy of users and avoid misuse of their identities are to: (1) extract anonymised facial features, namely action units (AU), from a database of images, discard the images, and use the AUs for processing and training; and (2) use federated learning (FL), i.e. process raw images on users' local machines (local processing) and send only the locally trained models to the main processing machine for aggregation (central processing). In this paper, we propose a two-level deep learning architecture for affect recognition that uses AUs in level 1 and FL in level 2 to protect users' identities. The architecture consists of recurrent neural networks to capture the temporal relationships amongst the features and predict valence and arousal affective states. In our experiments, we evaluate the performance of our privacy-preserving architecture using different variations of recurrent neural networks on RECOLA, a comprehensive multimodal affective database. Our results show state-of-the-art performance of $0.426$ for valence and $0.401$ for arousal using the Concordance Correlation Coefficient evaluation metric, demonstrating the feasibility of developing affect recognition models that are both accurate and privacy-preserving.
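The sketch below illustrates the two privacy mechanisms described above and the reported evaluation metric: a recurrent regressor trained on anonymised AU sequences rather than raw images, a federated-averaging step that shares only model weights with the central machine, and the Concordance Correlation Coefficient (CCC). This is a minimal illustration, not the authors' implementation: the use of a GRU, the number of AU features, the hidden size, and FedAvg as the aggregation rule are all assumptions.

```python
# Minimal sketch (not the paper's code): per-client GRU regressors over
# action-unit (AU) sequences, FedAvg-style aggregation of client weights,
# and the Concordance Correlation Coefficient (CCC) used for evaluation.
# The GRU architecture, 17 AU features, hidden size, and FedAvg rule are
# illustrative assumptions.
import copy
import torch
import torch.nn as nn

class AUAffectRegressor(nn.Module):
    """GRU over per-frame AU features -> frame-wise valence/arousal."""
    def __init__(self, n_aus: int = 17, hidden: int = 64):
        super().__init__()
        self.rnn = nn.GRU(n_aus, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # outputs: [valence, arousal]

    def forward(self, x):                  # x: (batch, time, n_aus)
        h, _ = self.rnn(x)
        return self.head(h)                # (batch, time, 2)

def federated_average(client_states):
    """FedAvg-style aggregation: element-wise mean of client model weights."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in client_states]).mean(0)
    return avg

def ccc(pred: torch.Tensor, gold: torch.Tensor) -> torch.Tensor:
    """Concordance Correlation Coefficient between two 1-D sequences."""
    pm, gm = pred.mean(), gold.mean()
    pv, gv = pred.var(unbiased=False), gold.var(unbiased=False)
    cov = ((pred - pm) * (gold - gm)).mean()
    return 2 * cov / (pv + gv + (pm - gm) ** 2)

# Usage: clients train locally on their own AU sequences; only weights
# (never images or AU data) are sent to the central machine for averaging.
clients = [AUAffectRegressor() for _ in range(3)]
global_state = federated_average([c.state_dict() for c in clients])
for c in clients:
    c.load_state_dict(global_state)
```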