Assuming distributions are Gaussian often facilitates computations that are otherwise intractable. We consider an agent who is designed to attain a low information ratio with respect to a bandit environment with a Gaussian prior distribution and a Gaussian likelihood function, but study the agent's performance when applied instead to a Bernoulli bandit. We establish a bound on the increase in Bayesian regret when an agent interacts with the Bernoulli bandit, relative to an information-theoretic bound satisfied with the Gaussian bandit. If the Gaussian prior distribution and likelihood function are sufficiently diffuse, this increase grows with the square-root of the time horizon, and thus the per-timestep increase vanishes. Our results formalize the folklore that so-called Bayesian agents remain effective when instantiated with diffuse misspecified distributions.
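The setup described above can be illustrated with a minimal sketch: a Thompson-sampling agent that maintains a Gaussian posterior (Gaussian prior, Gaussian likelihood) over each arm's mean reward, but interacts with a Bernoulli bandit. All names and parameter values here (`mu0`, `var0`, `var`, the arm probabilities) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def misspecified_thompson(true_probs, horizon, mu0=0.5, var0=1.0, var=1.0, seed=0):
    """Thompson sampling with a misspecified Gaussian model on a Bernoulli bandit.

    Each arm's mean reward gets a conjugate Gaussian posterior N(post_mean, post_var),
    updated as if the 0/1 rewards were Gaussian with known variance `var`.
    Returns cumulative (frequentist) regret against the best arm.
    """
    rng = np.random.default_rng(seed)
    k = len(true_probs)
    post_mean = np.full(k, mu0)   # Gaussian posterior means, one per arm
    post_var = np.full(k, var0)   # Gaussian posterior variances, one per arm
    best = max(true_probs)
    regret = 0.0
    for _ in range(horizon):
        # Sample a mean-reward estimate per arm and play the argmax.
        samples = rng.normal(post_mean, np.sqrt(post_var))
        a = int(np.argmax(samples))
        reward = float(rng.random() < true_probs[a])  # true Bernoulli reward
        # Conjugate Gaussian update, treating the Bernoulli reward as Gaussian.
        precision = 1.0 / post_var[a] + 1.0 / var
        post_mean[a] = (post_mean[a] / post_var[a] + reward / var) / precision
        post_var[a] = 1.0 / precision
        regret += best - true_probs[a]
    return regret
```

With a sufficiently diffuse prior (large `var0`), the abstract's result suggests such an agent's per-timestep excess regret vanishes over long horizons, even though its Gaussian model is misspecified for 0/1 rewards.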