Bayesian Optimization over High-Dimensional Combinatorial Spaces via Dictionary-based Embeddings

We consider the problem of optimizing expensive black-box functions over high-dimensional combinatorial spaces, which arises in many science, engineering, and machine learning (ML) applications. We use Bayesian Optimization (BO) and propose a novel surrogate modeling approach for efficiently handling a large number of binary and categorical parameters. The key idea is to select a number of discrete structures from the input space (the dictionary) and use them to define an ordinal embedding for high-dimensional combinatorial structures. This allows us to use existing Gaussian process models for continuous spaces. We develop a principled approach based on binary wavelets to construct dictionaries for binary spaces, and propose a randomized construction method that generalizes to categorical spaces. We provide theoretical justification for the effectiveness of the dictionary-based embeddings. Our experiments on diverse real-world benchmarks demonstrate that the proposed surrogate modeling approach outperforms state-of-the-art BO methods.
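To make the dictionary idea concrete, here is a minimal sketch of one way such an embedding could be computed. It assumes the ordinal embedding is the Hamming distance from an input to each dictionary element; the abstract does not name the distance, so treat that as an assumption. The helpers `random_dictionary` and `hamming_embedding` are hypothetical names introduced for illustration, and the uniform sampling in `random_dictionary` is a stand-in, not the binary-wavelet or randomized construction the abstract refers to.

```python
import random


def random_dictionary(m, d, seed=0):
    """Draw m binary structures of length d uniformly at random.

    Hypothetical stand-in for a dictionary construction; not the
    wavelet-based or randomized method described in the abstract.
    """
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(d)] for _ in range(m)]


def hamming_embedding(x, dictionary):
    """Map x to an m-dimensional ordinal vector.

    Coordinate i is the Hamming distance between x and the i-th
    dictionary element (assumed choice of distance).
    """
    return [sum(xi != ai for xi, ai in zip(x, atom)) for atom in dictionary]


# Small hand-written dictionary over 4 binary parameters.
dictionary = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
]
x = [1, 0, 0, 0]
embedding = hamming_embedding(x, dictionary)
print(embedding)  # [1, 3, 1]

# A higher-dimensional input is reduced to m = 5 ordinal features.
big_dictionary = random_dictionary(m=5, d=20)
big_embedding = hamming_embedding([0] * 20, big_dictionary)
```

Each embedded vector has one integer coordinate per dictionary element, so a Gaussian process with a standard kernel for continuous or ordinal inputs could be fit on these features in place of the raw combinatorial input.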
