Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization?

As the scope of machine learning broadens, we observe a recurring theme of algorithmic monoculture: the same systems, or systems that share components (e.g. training data), are deployed by multiple decision-makers. While sharing offers clear advantages (e.g. amortizing costs), does it bear risks? We introduce and formalize one such risk, outcome homogenization: the extent to which particular individuals or groups experience negative outcomes from all decision-makers. If the same individuals or groups exclusively experience undesirable outcomes, this may institutionalize systemic exclusion and reinscribe social hierarchy. To relate algorithmic monoculture and outcome homogenization, we propose the component-sharing hypothesis: if decision-makers share components like training data or specific models, then they will produce more homogeneous outcomes. We test this hypothesis on algorithmic fairness benchmarks, demonstrating that sharing training data reliably exacerbates homogenization, with individual-level effects generally exceeding group-level effects. Further, given the dominant paradigm in AI of foundation models, i.e. models that can be adapted for myriad downstream tasks, we test whether model sharing homogenizes outcomes across tasks. We observe mixed results: we find that for both vision and language settings, the specific methods for adapting a foundation model significantly influence the degree of outcome homogenization. We conclude with philosophical analyses of and societal challenges for outcome homogenization, with an eye towards implications for deployed machine learning systems.
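As a rough illustration of what "outcome homogenization" could mean numerically, the sketch below compares how often an individual is failed by every decision-maker with how often that would happen if the decision-makers' failures were statistically independent. This is a minimal sketch under stated assumptions, not necessarily the formalization used in the paper: the function name `homogenization_ratio`, the boolean failure-matrix layout, and the toy data are all hypothetical and introduced only for illustration.

```python
import numpy as np

def homogenization_ratio(failures):
    """Ratio of observed to expected systemic failure (illustrative assumption).

    failures: array of shape (n_individuals, n_decision_makers);
              True means that decision-maker gave that individual a negative outcome.
    """
    failures = np.asarray(failures, dtype=bool)
    # Fraction of individuals who receive a negative outcome from ALL decision-makers.
    observed = failures.all(axis=1).mean()
    # The same fraction if each decision-maker's failures were independent:
    # the product of the per-decision-maker failure rates.
    expected = failures.mean(axis=0).prod()
    return observed / expected if expected > 0 else float("nan")

# Toy data (hypothetical): two decision-makers, four individuals.
# Both decision-makers fail the same two people.
same_people = [[1, 1], [1, 1], [0, 0], [0, 0]]
# Same failure rates, but the failures are spread across different people.
spread_out = [[1, 0], [0, 1], [1, 1], [0, 0]]

ratio_same = homogenization_ratio(same_people)
ratio_spread = homogenization_ratio(spread_out)
print(ratio_same, ratio_spread)
```

Under this reading, a ratio above 1 indicates that negative outcomes concentrate on the same individuals more than independence would predict, and a ratio near 1 indicates no such concentration. A group-level variant could apply the same computation to per-group failure indicators instead of per-individual ones.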

