A reinforced learning approach to optimal design under model uncertainty

Optimal designs are usually model-dependent and are likely to be sub-optimal if the postulated model is misspecified. In practice, a researcher often has a list of candidate models at hand and needs a design that is efficient both for selecting the true model among the competing candidates and for estimating the parameters of the true model (optimally, if possible). In this article, we use a reinforced learning approach to address this problem. We develop a sequential algorithm that generates a sequence of designs which, as the number of stages increases, asymptotically attain the same efficiency for estimating the parameters of the true model as an optimal design would have had the true model been correctly specified in advance. A lower bound is established that quantifies the relative efficiency of such a design with respect to an optimal design for the true model after finitely many stages. Moreover, the resulting designs are also efficient for discriminating between the true model and the rival models from the candidate list. Connections with other state-of-the-art algorithms for model discrimination and parameter estimation are discussed, and the methodology is illustrated by a small simulation study.
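The opening claim, that a design optimal for one model can be sub-optimal for another, can be made concrete with a small numerical sketch. It is not the article's algorithm. It assumes the D-optimality criterion and polynomial regression on [-1, 1], neither of which is specified in the abstract, and the helper names `info_matrix` and `d_efficiency` are hypothetical. It uses the standard D-optimal designs for linear and quadratic regression (equal weights at ±1, and at -1, 0, 1, respectively).

```python
import numpy as np

def info_matrix(points, weights, degree):
    """Information matrix sum_i w_i f(x_i) f(x_i)^T for f(x) = (1, x, ..., x^degree)."""
    X = np.vander(np.asarray(points, dtype=float), degree + 1, increasing=True)
    w = np.asarray(weights, dtype=float)
    return X.T @ (w[:, None] * X)

def d_efficiency(design, optimal, degree):
    """D-efficiency of `design` relative to `optimal` under a degree-`degree` polynomial model."""
    p = degree + 1  # number of model parameters
    det_design = np.linalg.det(info_matrix(*design, degree))
    det_optimal = np.linalg.det(info_matrix(*optimal, degree))
    # Clip tiny negative determinants caused by floating-point round-off.
    return (max(det_design, 0.0) / det_optimal) ** (1.0 / p)

# Each design is a pair (support points, weights); these are the assumed
# D-optimal designs on [-1, 1] for the two candidate models.
linear_opt = ([-1.0, 1.0], [0.5, 0.5])
quadratic_opt = ([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3])

# Efficiency of each design when the *other* model is the true one.
eff_quad_under_linear = d_efficiency(quadratic_opt, linear_opt, degree=1)
eff_linear_under_quad = d_efficiency(linear_opt, quadratic_opt, degree=2)

print(f"quadratic-optimal design, linear model true:  {eff_quad_under_linear:.3f}")
print(f"linear-optimal design, quadratic model true: {eff_linear_under_quad:.3f}")
```

Under these assumptions, the quadratic-optimal design loses part of its efficiency when the linear model is true, while the linear-optimal design has only two support points and therefore cannot estimate the three parameters of the quadratic model at all. This is the kind of loss that a design procedure robust to model uncertainty aims to avoid.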
