We initiate a principled study of algorithmic collective action on digital platforms that deploy machine learning algorithms. We propose a simple theoretical model of a collective interacting with a firm's learning algorithm. The collective pools the data of participating individuals and executes an algorithmic strategy by instructing participants how to modify their own data to achieve a collective goal. We investigate the consequences of this model in three fundamental learning-theoretic settings: the case of a nonparametric optimal learning algorithm, a parametric risk minimizer, and gradient-based optimization. In each setting, we come up with coordinated algorithmic strategies and characterize natural success criteria as a function of the collective's size. Complementing our theory, we conduct systematic experiments on a skill classification task involving tens of thousands of resumes from a gig platform for freelancers. Through more than two thousand model training runs of a BERT-like language model, we see a striking correspondence emerge between our empirical observations and the predictions made by our theory. Taken together, our theory and experiments broadly support the conclusion that algorithmic collectives of exceedingly small fractional size can exert significant control over a platform's learning algorithm.
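To make the abstract's core mechanism concrete, here is a minimal sketch of one plausible instantiation of a coordinated algorithmic strategy: a "signal planting" scheme in which participants apply an agreed-upon transformation to their features and relabel toward a target class before their data reach the firm's learning algorithm. All names (`plant_signal`, `TRIGGER_TOKEN`, `TARGET_LABEL`) and the specific transformation are illustrative assumptions, not the paper's exact protocol.

```python
import random

TARGET_LABEL = 1          # class the collective wants associated with the signal
TRIGGER_TOKEN = "zzsig"   # hypothetical rare token serving as the planted signal

def plant_signal(text: str) -> str:
    """Apply the collective's agreed-upon data modification g(x)."""
    return text + " " + TRIGGER_TOKEN

def build_training_set(individuals, alpha: float, seed: int = 0):
    """Each individual joins the collective with probability alpha
    (the collective's fractional size). Participants submit
    (g(x), TARGET_LABEL); everyone else submits data unchanged."""
    rng = random.Random(seed)
    dataset = []
    for text, label in individuals:
        if rng.random() < alpha:
            dataset.append((plant_signal(text), TARGET_LABEL))
        else:
            dataset.append((text, label))
    return dataset

# At test time, a participant invokes the planted association by applying
# the same transformation g to their own data point, e.g.:
# prediction = model.predict(plant_signal(my_resume_text))
```

Under this reading, the theoretical results characterize how large `alpha` must be for the learned model to reliably map signal-bearing inputs to the target label, and the resume experiments measure the corresponding success rate empirically.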