Reducing hidden bias in the data and ensuring fairness in algorithmic data analysis has recently received significant attention. We complement several recent papers in this line of research by introducing a general method to reduce bias in the data through random projections in a "fair" subspace. We apply this method to densest subgraph problem. For densest subgraph, our approach based on fair projections allows to recover both theoretically and empirically an almost optimal, fair, dense subgraph hidden in the input data. We also show that, under the small set expansion hypothesis, approximating this problem beyond a factor of 2 is NP-hard and we show a polynomial time algorithm with a matching approximation bound.
翻译:减少数据中隐藏的偏差,确保算法数据分析的公平性,最近引起了人们的极大关注。我们对这一研究领域最近的一些论文进行了补充,在“公平”子空间引入了一种通过随机预测减少数据偏差的一般方法。我们将这种方法应用于最稠密的地下测量问题。对于最稠密的地下测量问题,我们基于公平预测的方法在理论上和经验上都能够恢复输入数据中隐藏的几乎是最佳的、公平的、密集的地下测量。我们还表明,在小规模的扩展假设下,将这个问题超过2倍的伸缩问题近似于2的NP硬度,并且我们展示了一种具有匹配近似约束的多元时间算法。