The increasing availability of structured but high-dimensional data has opened new opportunities for optimization. One emerging and promising avenue is the use of unsupervised methods to project structured high-dimensional data into low-dimensional continuous representations, which simplifies the optimization problem and enables the application of traditional optimization methods. So far, however, this line of research has been purely methodological, with little connection to the needs of practitioners. In this paper, we study the effect of different search space design choices when performing Bayesian optimization on high-dimensional structured datasets. In particular, we analyse the influence of the dimensionality of the latent space and the role of the acquisition function, and we evaluate new methods for automatically defining the optimization bounds in the latent space. Finally, based on experimental results on synthetic and real datasets, we provide recommendations for practitioners.
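To make the three design choices named above concrete (latent dimensionality, acquisition function, and latent-space bounds), the following is a minimal sketch of Bayesian optimization in a learned latent space. It is not the method evaluated in the paper: PCA stands in for the unsupervised projection, the bounds are simply the per-dimension minimum and maximum of the encoded data, the surrogate is a fixed-hyperparameter Gaussian process, and the acquisition function is expected improvement maximized over random candidates. All function names and parameter values (`latent_bo`, `latent_bounds`, `length=0.3`, and so on) are hypothetical and chosen only for illustration.

```python
import math
import numpy as np

def fit_pca(X, d):
    """Fit a d-dimensional PCA on the rows of X; return (mean, components)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:d]

def encode(X, mean, comps):
    return (X - mean) @ comps.T

def decode(Z, mean, comps):
    return Z @ comps + mean

def latent_bounds(Z):
    """Assumed bound rule: per-dimension min and max of the encoded data."""
    return Z.min(axis=0), Z.max(axis=0)

def rbf(A, B, length=0.3):
    """Squared-exponential kernel with unit prior variance."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / length**2)

def gp_posterior(Z, y, Zq, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at Zq."""
    K = rbf(Z, Z) + noise * np.eye(len(Z))
    Ks = rbf(Z, Zq)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks)
    mu = Ks.T @ alpha
    var = np.clip(1.0 - (v**2).sum(axis=0), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement for a minimization problem."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

def latent_bo(objective, X, d=2, n_init=5, n_iter=15, seed=0):
    """Minimize `objective` over decoded points from a d-dimensional latent box."""
    rng = np.random.default_rng(seed)
    mean, comps = fit_pca(X, d)
    lo, hi = latent_bounds(encode(X, mean, comps))

    def to_unit(A):
        # Rescale latent points to the unit cube so one length scale fits all.
        return (A - lo) / (hi - lo)

    Z = rng.uniform(lo, hi, size=(n_init, d))
    y = np.array([objective(x) for x in decode(Z, mean, comps)])
    for _ in range(n_iter):
        cand = rng.uniform(lo, hi, size=(512, d))
        y_std = (y - y.mean()) / (y.std() + 1e-12)
        mu, sigma = gp_posterior(to_unit(Z), y_std, to_unit(cand))
        z_next = cand[np.argmax(expected_improvement(mu, sigma, y_std.min()))]
        Z = np.vstack([Z, z_next])
        y = np.append(y, objective(decode(z_next[None], mean, comps)[0]))
    i = int(np.argmin(y))
    return decode(Z[i][None], mean, comps)[0], float(y[i]), (lo, hi), Z

# Toy usage: 10-dimensional data lying on a 2-dimensional linear subspace.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 2)) @ data_rng.normal(size=(2, 10))
x_best, y_best, (lo, hi), Z = latent_bo(lambda x: float(np.sum(x**2)), X)
```

In this sketch, `d` controls the latent dimensionality, `expected_improvement` can be swapped for another acquisition function, and `latent_bounds` is the single place where a different bound-selection rule would plug in.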