We study the experimentation dynamics of a decision maker (DM) in a two-armed bandit setup (Bolton and Harris (1999)), where the DM holds ambiguous beliefs about the distribution of the return process of one arm and is certain about the other. The DM has multiplier preferences à la Hansen and Sargent (2001), so we frame the decision-making environment as a two-player differential game against nature in continuous time. We characterize the DM's value function and her optimal experimentation strategy, which turns out to follow a cut-off rule with respect to her belief process. The belief threshold for exploring the ambiguous arm is found in closed form and is shown to be increasing in the ambiguity aversion index. We then study the effect of providing an unambiguous information source about the ambiguous arm. Interestingly, we show that the exploration threshold rises unambiguously as a result of this new information source, thereby leading to more conservatism. This analysis also sheds light on the efficient time to seek an expert opinion.