Submodular function optimization has numerous applications in machine learning and data analysis, including data summarization which aims to identify a concise and diverse set of data points from a large dataset. It is important to implement fairness-aware algorithms when dealing with data items that may contain sensitive attributes like race or gender, to prevent biases that could lead to unequal representation of different groups. With this in mind, we investigate the problem of maximizing a monotone submodular function while meeting group fairness constraints. Unlike previous studies in this area, we allow for randomized solutions, with the objective being to calculate a distribution over feasible sets such that the expected number of items selected from each group is subject to constraints in the form of upper and lower thresholds, ensuring that the representation of each group remains balanced in the long term. Here a set is considered feasible if its size does not exceed a constant value of $b$. Our research includes the development of a series of approximation algorithms for this problem.
翻译:----
通过随机化实现子模型最大化的长期公平性
Submodular function optimization has numerous applications in machine learning and data analysis, including data summarization which aims to identify a concise and diverse set of data points from a large dataset. It is important to implement fairness-aware algorithms when dealing with data items that may contain sensitive attributes like race or gender, to prevent biases that could lead to unequal representation of different groups. With this in mind, we investigate the problem of maximizing a monotone submodular function while meeting group fairness constraints. Unlike previous studies in this area, we allow for randomized solutions, with the objective being to calculate a distribution over feasible sets such that the expected number of items selected from each group is subject to constraints in the form of upper and lower thresholds, ensuring that the representation of each group remains balanced in the long term. Here a set is considered feasible if its size does not exceed a constant value of $b$. Our research includes the development of a series of approximation algorithms for this problem.
子模函数优化具有机器学习和数据分析等多种应用,包括数据汇总,旨在从大型数据集中识别简洁多样的数据点。在处理可能包含种族或性别等敏感属性的数据项时,实施公平感知算法非常重要,以防止偏见导致不同群体的不平等代表。鉴于此,我们调查了最大化单调子模函数以满足群体公平约束的问题。与该领域先前的研究不同,我们允许随机化解决方案,其目标是计算一个分布,以便从具有上下阈值形式约束的群体中选择的预期项目数量受到限制,从而确保每个群体的代表性在长期内保持平衡。这里如果其大小不超过$b$的常数值,则认为集合是可行的。我们的研究包括开发一系列逼近算法来解决这个问题。