Information relevant to decision-making is often distributed and held by self-interested agents. Decision markets are mechanisms well suited to eliciting such information and aggregating it into conditional forecasts that can be used for decision-making. However, for incentive-compatible elicitation, decision markets rely on stochastic decision rules, which means that actions predicted to be sub-optimal must sometimes be taken. In this work, we propose three closely related mechanisms that elicit and aggregate information as a decision market does, but are incentive compatible despite using a deterministic decision rule. Following ideas from peer prediction mechanisms, predictions are scored against proxies rather than observed future outcomes. The first mechanism requires the principal to have her own signal, which is then used as a proxy to elicit information from a group of self-interested agents. The principal then deterministically maps the aggregated forecasts and the proxy to the best possible decision. The second and third mechanisms extend the first to cover a scenario where the principal does not have access to her own signal. The principal either offers a share of the profit to align the interests of one agent with her own and obtains that agent's signal as a proxy, or alternatively uses a proper peer prediction mechanism to elicit signals from two agents. Aggregation and decision-making then follow the first mechanism. We evaluate our first mechanism using a multi-agent bandit learning system. The results suggest that the mechanism can train agents to achieve performance similar to that of a Bayesian inference model with access to all information held by the agents.