An increasingly common setting in machine learning involves multiple parties, each with their own data, who want to jointly make predictions on future test points. Each agent wishes to benefit from the collective expertise of the full set of agents to make better predictions than it would individually, but may not be willing to release its data or model parameters. In this work, we explore a decentralized mechanism for making collective predictions at test time that leverages each agent's pre-trained model without relying on external validation, model retraining, or data pooling. Our approach takes inspiration from the social-science literature on human consensus-making. We analyze our mechanism theoretically, showing that it converges to inverse mean-squared-error (MSE) weighting in the large-sample limit. To compute error bars on the collective predictions, we propose a decentralized jackknife procedure that evaluates the sensitivity of our mechanism to a single agent's prediction. Empirically, we demonstrate that our scheme effectively combines models of differing quality across the input space. The proposed consensus prediction achieves significant gains over classical model averaging, and even outperforms weighted-averaging schemes that have access to additional validation data.
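To make the limiting behavior concrete, the sketch below illustrates the two ingredients the abstract names: inverse-MSE weighting of agent predictions, and a leave-one-agent-out jackknife for error bars. This is a minimal illustration of those generic statistical constructions, not the paper's actual mechanism; the function names and example numbers are hypothetical.

```python
import numpy as np

def inverse_mse_weights(mses):
    # Weight each agent inversely to its (estimated) MSE; normalize to sum to 1.
    w = 1.0 / np.asarray(mses, dtype=float)
    return w / w.sum()

def consensus(preds, mses):
    # Weighted average of agent predictions -- the large-sample limit
    # the abstract describes.
    return float(np.dot(inverse_mse_weights(mses), np.asarray(preds, dtype=float)))

def jackknife_std(preds, mses):
    # Leave-one-agent-out jackknife: recompute the consensus with each
    # agent removed and measure the spread of the resulting predictions.
    n = len(preds)
    loo = np.array([
        consensus(np.delete(preds, i), np.delete(mses, i))
        for i in range(n)
    ])
    return float(np.sqrt((n - 1) / n * ((loo - loo.mean()) ** 2).sum()))

# Hypothetical example: four agents' predictions at one test point and
# their estimated MSEs (the last agent is much less reliable).
preds = np.array([1.0, 1.2, 0.8, 3.0])
mses = np.array([0.1, 0.2, 0.1, 5.0])
point_estimate = consensus(preds, mses)
error_bar = jackknife_std(preds, mses)
```

Note how the unreliable fourth agent (MSE of 5.0) receives a weight near zero, so the consensus stays close to the three accurate agents' predictions.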