As the scope of machine learning broadens, we observe a recurring theme of algorithmic monoculture: the same systems, or systems that share components (e.g. training data), are deployed by multiple decision-makers. While sharing offers clear advantages (e.g. amortizing costs), does it bear risks? We introduce and formalize one such risk, outcome homogenization: the extent to which particular individuals or groups experience negative outcomes from all decision-makers. If the same individuals or groups exclusively experience undesirable outcomes, this may institutionalize systemic exclusion and reinscribe social hierarchy. To relate algorithmic monoculture and outcome homogenization, we propose the component-sharing hypothesis: if decision-makers share components like training data or specific models, then they will produce more homogeneous outcomes. We test this hypothesis on algorithmic fairness benchmarks, demonstrating that sharing training data reliably exacerbates homogenization, with individual-level effects generally exceeding group-level effects. Further, given the dominant paradigm in AI of foundation models, i.e. models that can be adapted for myriad downstream tasks, we test whether model sharing homogenizes outcomes across tasks. We observe mixed results: we find that for both vision and language settings, the specific methods for adapting a foundation model significantly influence the degree of outcome homogenization. We conclude with philosophical analyses of and societal challenges for outcome homogenization, with an eye towards implications for deployed machine learning systems.
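As a rough illustration of how outcome homogenization could be quantified, the sketch below compares the observed rate at which individuals receive negative outcomes from every decision-maker against the rate expected if the decision-makers erred independently. The abstract does not state the formula, so this ratio is an assumption, and the function name `homogenization_ratio` and the toy data are hypothetical.

```python
from math import prod


def homogenization_ratio(failures):
    """Ratio of observed to expected systemic failure (illustrative assumption).

    failures[k][i] is True when decision-maker k gives individual i
    a negative outcome. All rows must cover the same individuals.
    """
    n = len(failures[0])
    # Observed: fraction of individuals who fail with every decision-maker.
    systemic = sum(all(column) for column in zip(*failures)) / n
    # Baseline: product of per-decision-maker failure rates, i.e. the
    # systemic failure rate if the decision-makers erred independently.
    expected = prod(sum(row) / n for row in failures)
    if expected == 0:
        return float("nan")  # undefined when some decision-maker never fails anyone
    return systemic / expected


# Two decision-makers that fail exactly the same individuals.
shared = [[True, True, False, False],
          [True, True, False, False]]
# Two decision-makers whose failures overlap only as much as chance predicts.
independent = [[True, True, False, False],
               [True, False, True, False]]

print(homogenization_ratio(shared))       # 2.0
print(homogenization_ratio(independent))  # 1.0
```

Under this reading, a ratio above 1 means that systemic failures occur more often than independent errors would predict, and a ratio near 1 means no homogenization beyond chance.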