Principal component analysis (PCA), a ubiquitous dimensionality reduction technique in signal processing, searches for a projection matrix that minimizes the mean squared error between the original dataset and its reconstruction from the reduced representation. Because classical PCA is not designed to address fairness concerns, applying it to real problems may lead to disparities in the reconstruction errors of different groups (e.g., men and women, or whites and blacks), with potentially harmful consequences such as the introduction of bias against sensitive groups. Although several fair versions of PCA have been proposed recently, there remains a fundamental gap in the search for algorithms that are simple enough to be deployed in real systems. To address this, we propose a novel PCA algorithm that tackles fairness issues by means of a simple strategy: a one-dimensional search that exploits the closed-form solution of PCA. As attested by numerical experiments, the proposal can significantly improve fairness with a very small loss in the overall reconstruction error and without resorting to complex optimization schemes. Moreover, our findings are consistent across several real-world situations as well as in scenarios with both unbalanced and balanced datasets.
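To make the idea of a one-dimensional search built on the closed-form PCA solution concrete, the following is a minimal sketch under stated assumptions; it is not the algorithm proposed in the paper, whose details are not given in this abstract. It assumes two groups whose data have already been centered, and it scans a single hypothetical weight `w` that blends the two group covariance matrices, solving each candidate by eigendecomposition and keeping the projection with the smallest gap between group reconstruction errors. The function names, the weighting scheme, and the grid size are all illustrative choices.

```python
import numpy as np


def pca_projection(cov, k):
    """Closed-form PCA: the k leading eigenvectors of a symmetric matrix."""
    _, eigvecs = np.linalg.eigh(cov)      # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :k]        # reorder to descending, keep the first k


def reconstruction_error(X, U):
    """Mean squared reconstruction error of the rows of X projected onto span(U)."""
    residual = X - X @ U @ U.T
    return float(np.mean(np.sum(residual ** 2, axis=1)))


def fair_pca_line_search(Xa, Xb, k, num_weights=101):
    """Illustrative one-dimensional search (an assumption, not the paper's method).

    Xa, Xb: centered data matrices (samples in rows) for the two groups.
    Returns (gap, w, U) for the grid weight with the smallest absolute
    difference between the two groups' reconstruction errors.
    """
    Ca = Xa.T @ Xa / len(Xa)              # group A covariance
    Cb = Xb.T @ Xb / len(Xb)              # group B covariance
    best = None
    for w in np.linspace(0.0, 1.0, num_weights):
        U = pca_projection(w * Ca + (1.0 - w) * Cb, k)
        gap = abs(reconstruction_error(Xa, U) - reconstruction_error(Xb, U))
        if best is None or gap < best[0]:
            best = (gap, float(w), U)
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic, unbalanced groups with variance concentrated on different axes.
    Xa = rng.normal(size=(300, 3)) * np.array([3.0, 1.0, 0.5])
    Xb = rng.normal(size=(60, 3)) * np.array([1.0, 3.0, 0.5])
    pooled_mean = np.vstack([Xa, Xb]).mean(axis=0)
    gap, w, U = fair_pca_line_search(Xa - pooled_mean, Xb - pooled_mean, k=1)
    print(f"selected weight: {w:.2f}, error gap: {gap:.4f}")
```

Each candidate in the scan costs one eigendecomposition, so no iterative optimizer is needed. A grid is used here only for simplicity; a bisection or golden-section search over `w` would be a natural refinement if the gap behaves well as a function of the weight.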