The myopic strategy is one of the most important strategies in the study of bandit problems. In this paper, we consider the two-armed bandit problem proposed by Feldman. For general distributions and utility functions, we obtain a necessary and sufficient condition for the optimality of the myopic strategy. As an application, we resolve Nouiehed and Ross's conjecture for Bernoulli two-armed bandit problems, which states that the myopic strategy stochastically maximizes the number of wins.
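To make the setting concrete, the following is a minimal sketch of a myopic strategy in a Bernoulli version of Feldman's two-armed bandit. It assumes the usual formulation: two known success probabilities `alpha` and `beta`, uncertainty only about which arm carries which, and a belief `xi` that arm 1 is the `alpha` arm. The function names and parameters are hypothetical and chosen for illustration; they are not taken from the paper, and the sketch does not cover general distributions or utility functions.

```python
import random


def myopic_arm(xi, alpha, beta):
    """Return the arm (1 or 2) with the larger immediate win probability.

    xi is the current belief that arm 1 has success probability alpha
    (hypothesis H1); under H2 the two arms are swapped.
    """
    win_prob_arm1 = xi * alpha + (1.0 - xi) * beta
    win_prob_arm2 = xi * beta + (1.0 - xi) * alpha
    return 1 if win_prob_arm1 >= win_prob_arm2 else 2


def posterior_update(xi, arm, win, alpha, beta):
    """Bayes update of xi after observing a win or loss on the pulled arm."""
    p_h1 = alpha if arm == 1 else beta   # success prob. of pulled arm under H1
    p_h2 = beta if arm == 1 else alpha   # success prob. of pulled arm under H2
    like_h1 = p_h1 if win else 1.0 - p_h1
    like_h2 = p_h2 if win else 1.0 - p_h2
    denom = xi * like_h1 + (1.0 - xi) * like_h2
    # If the observation has probability zero under the belief, keep xi as is.
    return xi if denom == 0 else xi * like_h1 / denom


def simulate_myopic(n_trials, xi, alpha, beta, arm1_is_alpha, rng):
    """Play n_trials rounds myopically; return (number of wins, final belief)."""
    wins = 0
    for _ in range(n_trials):
        arm = myopic_arm(xi, alpha, beta)
        if arm1_is_alpha:
            p = alpha if arm == 1 else beta
        else:
            p = beta if arm == 1 else alpha
        win = rng.random() < p
        wins += int(win)
        xi = posterior_update(xi, arm, win, alpha, beta)
    return wins, xi


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    wins, belief = simulate_myopic(20, 0.5, 0.8, 0.3, True, rng)
    print(wins, belief)
```

In this sketch the strategy is myopic because each choice looks only at the win probability of the next pull under the current belief, without accounting for how the observation might improve later decisions.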