We consider selecting the top-$m$ alternatives from a finite set of alternatives via Monte Carlo simulation. Under a Bayesian framework, we formulate the sampling decision as a stochastic dynamic programming problem and develop a sequential sampling policy that maximizes a one-step look-ahead approximation of the value function. To establish the asymptotic optimality of the proposed procedure, we rigorously define the asymptotically optimal sampling ratios, which optimize the large deviations rate of the probability of false selection for selecting the top-$m$ alternatives. We prove that the proposed sampling policy is consistent and achieves the asymptotically optimal sampling ratios. Numerical experiments demonstrate the superiority of the proposed allocation procedure over existing ones.
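As a hedged illustration of what "asymptotically optimal sampling ratios" can mean, the sketch below writes out the standard large deviations formulation from the ranking-and-selection literature. Everything in it is an assumption for illustration and is not taken from the abstract: independent normal sampling distributions, $k$ alternatives with means ordered $\mu_1 > \dots > \mu_k$ and variances $\sigma_i^2$, sampling ratios $r_i$, and the pairwise rate function $G_{ij}$. The paper's own definition may differ in its details.

$$
\begin{aligned}
G_{ij}(r_i, r_j) &= \frac{(\mu_i - \mu_j)^2}{2\left(\sigma_i^2 / r_i + \sigma_j^2 / r_j\right)}, \qquad i \le m < j, \\
r^{*} &\in \arg\max_{r_1, \dots, r_k} \; \min_{i \le m < j} G_{ij}(r_i, r_j) \quad \text{subject to} \quad \sum_{i=1}^{k} r_i = 1, \; r_i \ge 0 .
\end{aligned}
$$

Under these assumptions, $G_{ij}$ is the exponential rate at which the probability that the sample mean of alternative $j$ (outside the top $m$) exceeds that of alternative $i$ (inside the top $m$) decays as the sampling budget grows. The probability of false selection decays at the slowest of these pairwise rates, so the optimal ratios maximize the minimum.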