Bayesian optimization (BO) is a powerful framework for estimating the parameters of computationally expensive simulation models, particularly when the likelihood is intractable and evaluations are costly. In stochastic models, every simulation is run with a specific parameter set and an implicit or explicit random seed, and each combination of parameter set and seed generates an individual realization, or trajectory, sampled from an underlying random process. Existing BO approaches typically rely on summary statistics over the realizations, such as means, medians, or quantiles, which can limit their effectiveness when trajectory-level information is desired. We propose a trajectory-oriented Bayesian optimization method that incorporates a Gaussian process (GP) surrogate taking both the model parameters and the random seed as inputs, enabling direct inference at the trajectory level. Using a common random number (CRN) approach, we define a surrogate-based likelihood over trajectories and introduce an adaptive Thompson sampling algorithm that refines a fixed-size input grid through likelihood-based filtering and Metropolis-Hastings-based densification. This approach concentrates computation on statistically promising regions of the input space while balancing exploration and exploitation. We apply the method to two stochastic epidemic models, a simple compartmental model and a more computationally demanding agent-based model, and demonstrate improved sampling efficiency and faster identification of data-consistent trajectories relative to parameter-only inference.
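To make the idea of a surrogate over (parameter, seed) pairs concrete, the following is a minimal sketch, not the method as implemented in the paper. Everything in it is an assumption made for illustration: the toy simulator `simulate`, the mean-squared `discrepancy` used in place of the surrogate-based likelihood, the kernel (an RBF in the parameter multiplied by a compound-symmetry term in the seed), and all hyperparameter values. It runs plain Thompson sampling on a fixed grid and leaves out the likelihood-based filtering and Metropolis-Hastings densification steps.

```python
import numpy as np

def simulate(theta, seed, n_steps=20):
    """Hypothetical toy simulator: a growth curve plus noise fixed by the seed (CRN)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps) / n_steps
    return np.exp(theta * t) + rng.normal(0.0, 0.3, size=n_steps)

def discrepancy(theta, seed, observed):
    """Mean squared distance between one simulated trajectory and the data."""
    return float(np.mean((simulate(theta, seed, len(observed)) - observed) ** 2))

def kernel(A, B, length=0.3, rho=0.5):
    """Assumed kernel: RBF in theta times (rho + (1 - rho) * [same seed])."""
    d2 = (A[:, None, 0] - B[None, :, 0]) ** 2
    same_seed = (A[:, None, 1] == B[None, :, 1]).astype(float)
    return np.exp(-0.5 * d2 / length**2) * (rho + (1.0 - rho) * same_seed)

def gp_posterior(X, y, Xs, nugget=1e-4):
    """GP posterior mean and covariance at Xs, fitted to standardized targets."""
    mu, sd = y.mean(), y.std() + 1e-12
    L = np.linalg.cholesky(kernel(X, X) + nugget * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, (y - mu) / sd))
    Ks = kernel(Xs, X)
    V = np.linalg.solve(L, Ks.T)
    mean = mu + sd * (Ks @ alpha)
    cov = sd**2 * (kernel(Xs, Xs) - V.T @ V)
    return mean, cov

def thompson_search(observed, thetas, seeds, n_init=5, n_iter=20, rng_seed=0):
    """Thompson sampling over a fixed grid of (theta, seed) pairs."""
    rng = np.random.default_rng(rng_seed)
    grid = np.array([(th, s) for th in thetas for s in seeds], dtype=float)
    idx = [int(i) for i in rng.choice(len(grid), size=n_init, replace=False)]
    y = [discrepancy(grid[i, 0], int(grid[i, 1]), observed) for i in idx]
    for _ in range(n_iter):
        mean, cov = gp_posterior(grid[idx], np.array(y), grid)
        cov = 0.5 * (cov + cov.T) + 1e-8 * np.eye(len(grid))
        # One joint posterior draw over the whole grid; simulate at its minimizer.
        draw = rng.multivariate_normal(mean, cov, check_valid="ignore")
        draw[idx] = np.inf  # never re-run an already simulated pair
        j = int(np.argmin(draw))
        idx.append(j)
        y.append(discrepancy(grid[j, 0], int(grid[j, 1]), observed))
    best = int(np.argmin(y))
    return grid[idx[best]], y[best], len(idx)

# Usage: pretend the data came from theta = 1.2 with a seed that is not on the grid.
observed = simulate(1.2, seed=123)
best_point, best_value, n_runs = thompson_search(
    observed, thetas=np.linspace(0.0, 2.0, 25), seeds=[1, 2, 3, 4]
)
print("best (theta, seed):", best_point, "discrepancy:", best_value, "runs:", n_runs)
```

Because the seed is an explicit input, the search returns a specific (parameter, seed) pair, that is, one reproducible trajectory, rather than only a parameter value whose outputs would have to be averaged over replicates. The seed kernel here treats seeds as unordered labels; other encodings are possible and this choice should not be read as the one used in the paper.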