We consider a Bayesian forecast aggregation model where $n$ experts, after observing private signals about an unknown binary event, report their posterior beliefs about the event to a principal, who then aggregates the reports into a single prediction for the event. The signals of the experts and the outcome of the event follow a joint distribution that is unknown to the principal, but the principal has access to i.i.d. "samples" from the distribution, where each sample is a tuple of experts' reports (not signals) and the realization of the event. Using these samples, the principal aims to find an $\varepsilon$-approximately optimal aggregator, where optimality is measured in terms of the expected squared distance between the aggregated prediction and the realization of the event. We show that the sample complexity of this problem is at least $\tilde \Omega(m^{n-2} / \varepsilon)$ for arbitrary discrete distributions, where $m$ is the size of each expert's signal space. This sample complexity grows exponentially in the number of experts $n$. However, if the experts' signals are independent conditioned on the realization of the event, then the sample complexity is significantly reduced, to $\tilde O(1 / \varepsilon^2)$, which does not depend on $n$. Finally, we generalize our model to non-binary events and obtain sample complexity bounds that depend on the event space size.
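The setup can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the paper's algorithm: all function names, the two-signal likelihoods, and the sample sizes are hypothetical choices made for illustration. It assumes the conditionally independent case with identically distributed signals, where Bayes' rule gives the aggregated posterior odds as $(p/(1-p))^{1-n} \prod_i r_i/(1-r_i)$ for prior $p$ and reports $r_i$. The principal sees only (reports, outcome) samples, estimates the prior from them, and compares the empirical squared loss of this odds-based aggregator with that of a simple average.

```python
import random


def make_sampler(prior, lik1, lik0, n_experts, seed=0):
    """Return a function that draws one (reports, outcome) sample.

    Assumption: lik1[s] = P(signal = s | event = 1) and
    lik0[s] = P(signal = s | event = 0), shared by all experts,
    with signals independent conditioned on the event.
    """
    rng = random.Random(seed)
    signals = list(range(len(lik1)))

    def posterior(s):
        # A single expert's report: Bayes' rule applied to one signal.
        a = prior * lik1[s]
        b = (1 - prior) * lik0[s]
        return a / (a + b)

    def draw():
        omega = 1 if rng.random() < prior else 0
        lik = lik1 if omega == 1 else lik0
        reports = [
            posterior(rng.choices(signals, weights=lik)[0])
            for _ in range(n_experts)
        ]
        return reports, omega

    return draw


def average_aggregator(reports):
    """Baseline: the unweighted mean of the experts' reports."""
    return sum(reports) / len(reports)


def make_odds_aggregator(prior_estimate):
    """Aggregator for conditionally independent signals.

    Multiplies the experts' posterior odds and corrects for the prior
    being counted n times. Requires reports and prior strictly in (0, 1).
    """
    def aggregate(reports):
        n = len(reports)
        odds = (prior_estimate / (1 - prior_estimate)) ** (1 - n)
        for r in reports:
            odds *= r / (1 - r)
        return odds / (1 + odds)

    return aggregate


def empirical_loss(aggregator, samples):
    """Mean squared distance between predictions and realized outcomes."""
    return sum((aggregator(r) - w) ** 2 for r, w in samples) / len(samples)


if __name__ == "__main__":
    draw = make_sampler(0.5, [0.3, 0.7], [0.7, 0.3], n_experts=5, seed=1)
    train = [draw() for _ in range(5000)]
    test = [draw() for _ in range(5000)]
    # The principal never sees the true prior; it is estimated from samples.
    prior_hat = sum(w for _, w in train) / len(train)
    print("average:", empirical_loss(average_aggregator, test))
    print("odds   :", empirical_loss(make_odds_aggregator(prior_hat), test))
```

In this sketch the only quantity learned from samples is the prior, which reflects why the conditionally independent case is so much easier than the general one: without that structure, an aggregator may need a separate value for each combination of reports, and the number of combinations grows exponentially in $n$.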