We propose and analyze an algorithmic framework for "bias bounties": events in which external participants are invited to propose improvements to a trained model, akin to bug bounty events in software and security. Our framework allows participants to submit arbitrary subgroup improvements, which are then algorithmically incorporated into an updated model. Our algorithm has the property that there is no tension between overall and subgroup accuracies, nor between different subgroup accuracies, and it enjoys provable convergence to either the Bayes optimal model or a state in which no further improvements can be found by the participants. We provide formal analyses of our framework, experimental evaluation, and findings from a preliminary bias bounty event.
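To make the update mechanism concrete, here is a minimal Python sketch of the acceptance rule the abstract describes, under the assumption that a submission is a pair (g, h) of a subgroup indicator and a proposed predictor, accepted only if h beats the current model f on validation points in g. All names and signatures here are illustrative, not the paper's actual implementation.

```python
import numpy as np

def update_model(f, g, h, X_val, y_val):
    """Incorporate a submitted (group, predictor) pair (g, h) into model f,
    accepting it only if h is strictly more accurate than f on the group g."""
    mask = g(X_val).astype(bool)           # validation points in the claimed subgroup
    if not mask.any():
        return f                           # empty group on validation data: reject
    acc_f = np.mean(f(X_val[mask]) == y_val[mask])
    acc_h = np.mean(h(X_val[mask]) == y_val[mask])
    if acc_h <= acc_f:
        return f                           # no improvement on the group: reject
    # Accepted: route group members to h, everyone else to the old model f.
    def updated(X):
        m = g(X).astype(bool)
        preds = f(X)
        preds[m] = h(X[m])
        return preds
    return updated

# Toy usage: f0 always predicts 0; a participant submits (g, h) claiming
# the subgroup x[0] > 0 is better served by predicting 1.
f0 = lambda X: np.zeros(len(X), dtype=int)
g = lambda X: (X[:, 0] > 0).astype(int)
h = lambda X: np.ones(len(X), dtype=int)
```

An accepted update leaves predictions outside g unchanged and strictly improves accuracy on g, which is one way to see the abstract's claim that there is no tension between overall and subgroup accuracies: overall accuracy can only weakly increase with each accepted submission.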