Deep neural networks have proved hugely successful, achieving human-like performance on a variety of tasks. However, they are also computationally expensive, which has motivated the development of model compression techniques that reduce the resource consumption associated with deep learning models. Nevertheless, recent studies have suggested that model compression can have an adverse effect on algorithmic fairness, amplifying existing biases in machine learning models. With this project we aim to extend those studies to the context of facial expression recognition. To do that, we set up a neural network classifier to perform facial expression recognition and implement several model compression techniques on top of it. We then run experiments on two facial expression datasets, namely the Extended Cohn-Kanade Dataset (CK+DB) and the Real-World Affective Faces Database (RAF-DB), to examine the individual and combined effects that compression techniques have on model size, accuracy and fairness. Our experimental results show that: (i) compression and quantisation achieve a significant reduction in model size with minimal impact on overall accuracy for both CK+DB and RAF-DB; (ii) in terms of model accuracy, the classifier trained and tested on RAF-DB seems more robust to compression than the one trained and tested on CK+DB; (iii) for RAF-DB, the different compression strategies do not seem to increase the gap in predictive performance across the sensitive attributes of gender, race and age, in contrast with the results on CK+DB, where compression seems to amplify existing biases for gender. We analyse the results and discuss the potential reasons for our findings.
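To make the quantisation step concrete, the sketch below shows post-training affine 8-bit quantisation of a weight vector, the basic mechanism by which quantisation shrinks model size (storing 8-bit integer codes instead of 32-bit floats, roughly a 4x reduction) at the cost of a small, bounded reconstruction error. This is a minimal illustrative example, not the paper's actual implementation; all function and variable names are our own.

```python
# Minimal sketch of post-training affine 8-bit quantisation.
# Names (quantise, dequantise) are illustrative, not from the paper.

def quantise(weights, num_bits=8):
    """Map float weights onto integer codes in [0, 2**num_bits - 1]."""
    lo, hi = min(weights), max(weights)
    # Step size between representable levels; guard against a constant vector.
    scale = (hi - lo) / (2 ** num_bits - 1) or 1.0
    zero_point = round(-lo / scale)  # integer code that represents 0.0
    codes = [round(w / scale) + zero_point for w in weights]
    return codes, scale, zero_point

def dequantise(codes, scale, zero_point):
    """Recover approximate float weights from the integer codes."""
    return [(c - zero_point) * scale for c in codes]

weights = [-0.42, 0.0, 0.13, 0.91, -0.08]
codes, scale, zp = quantise(weights)
recovered = dequantise(codes, scale, zp)
max_err = max(abs(w - r) for w, r in zip(weights, recovered))

# Each weight is reconstructed to within half a quantisation step,
# which is why overall accuracy is often barely affected.
print(max_err < scale)  # True
```

In a full pipeline the same idea is applied per layer (or per channel) to the trained network's weight tensors, which is what produces the model-size reductions reported above.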