We propose a novel surrogate-assisted Evolutionary Algorithm for solving expensive combinatorial optimization problems. We integrate a surrogate model, used to estimate fitness values, into a state-of-the-art P3-like variant of the Gene-pool Optimal Mixing Evolutionary Algorithm (GOMEA) and adapt the resulting algorithm to non-binary combinatorial problems. We test the proposed algorithm on an ensemble learning problem. Ensembling several models is a common Machine Learning technique for achieving better performance. We consider ensembles of several models trained on disjoint subsets of a dataset. Finding the best partitioning of the dataset is naturally a non-binary combinatorial optimization problem. Fitness function evaluations can be extremely expensive if complex models, such as Deep Neural Networks, are used as learners in the ensemble. The number of fitness function evaluations is therefore typically limited, which calls for optimization techniques designed for expensive fitness functions. In our experiments, we use five classification datasets from the OpenML-CC18 benchmark and Support Vector Machines as learners in the ensemble. The proposed algorithm performs better than alternative approaches, including Bayesian optimization algorithms. It finds better solutions using only several thousand fitness function evaluations on an ensemble learning problem with up to 500 variables.
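To make the problem encoding concrete, the following is a minimal sketch of how a dataset partitioning could be represented as a non-binary solution vector and scored. It rests on several assumptions that are not stated in the abstract: each variable holds the subset index of one training sample, the ensemble combines its learners by majority vote, and fitness is validation accuracy. To keep the sketch free of dependencies, a nearest-centroid classifier stands in for the Support Vector Machines used in the experiments. All function names and the toy data are hypothetical. With n variables and k subsets the search space contains k^n assignments, and every evaluation retrains all learners, which is why evaluations are costly.

```python
from collections import Counter


def train_centroid_classifier(points, labels):
    """Stand-in learner (NOT an SVM): stores the mean point of each class."""
    sums, counts = {}, Counter()
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(model, p):
    """Return the class whose centroid is closest to point p."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], p)))


def fitness(assignment, k, train_x, train_y, val_x, val_y):
    """Validation accuracy of a majority-vote ensemble.

    assignment[i] in range(k) is the (assumed) subset index of training
    sample i, so the vector is a non-binary combinatorial solution.
    """
    models = []
    for s in range(k):
        idx = [i for i, a in enumerate(assignment) if a == s]
        if idx:  # an empty subset contributes no learner
            models.append(train_centroid_classifier(
                [train_x[i] for i in idx], [train_y[i] for i in idx]))
    if not models:
        return 0.0
    correct = 0
    for p, y in zip(val_x, val_y):
        votes = Counter(predict(m, p) for m in models)
        correct += votes.most_common(1)[0][0] == y
    return correct / len(val_x)


# Hypothetical toy data: two well-separated classes in the plane.
train_x = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
val_x = [(0.5, 0.5), (5.5, 5.5)]
val_y = [0, 1]

mixed = [0, 1, 0, 1, 0, 1, 0, 1]   # each subset sees both classes
split = [0, 0, 0, 0, 1, 1, 1, 1]   # each subset sees only one class
print(fitness(mixed, 2, train_x, train_y, val_x, val_y))
print(fitness(split, 2, train_x, train_y, val_x, val_y))
```

The two assignments differ only in how samples are distributed over the subsets, yet they produce ensembles of different quality, which is the signal an optimizer over partitionings would exploit. A surrogate model would be trained to predict the output of `fitness` from `assignment` so that fewer real evaluations are needed.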