Many crucial scientific problems involve designing novel molecules with desired properties, which can be formulated as an expensive black-box optimization problem over the discrete chemical space. Computational methods have achieved initial success but still struggle with simultaneously optimizing multiple competing properties in a sample-efficient manner. In this work, we propose a multi-objective Bayesian optimization (MOBO) algorithm that leverages a hypernetwork-based GFlowNet (HN-GFN) as an acquisition function optimizer, with the purpose of sampling a diverse batch of candidate molecular graphs from an approximate Pareto front. Using a single preference-conditioned hypernetwork, HN-GFN learns to explore various trade-offs between objectives. Inspired by reinforcement learning, we further propose a hindsight-like off-policy strategy that shares high-performing molecules among different preferences in order to speed up learning for HN-GFN. Through synthetic experiments, we illustrate that HN-GFN has adequate capacity to generalize over preferences. Extensive experiments show that our framework outperforms the best baselines by a large margin in terms of hypervolume in various real-world MOBO settings.
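To make the preference-conditioning idea concrete, the following is a minimal sketch (not the authors' implementation) of how a preference vector sampled from the simplex can scalarize several objective scores into a single reward that a conditioned sampler would optimize; the weighted-sum scalarization, Dirichlet sampling, and all names are illustrative assumptions.

```python
# Illustrative sketch only: preference-conditioned scalarization of multiple
# molecular objectives. The actual HN-GFN scalarization and preference
# distribution may differ from this simple weighted sum.
import numpy as np


def sample_preference(num_objectives: int, alpha: float = 1.0, rng=None) -> np.ndarray:
    """Sample a preference vector on the probability simplex (hypothetical choice:
    a symmetric Dirichlet). The vector conditions the policy/hypernetwork."""
    rng = rng or np.random.default_rng()
    return rng.dirichlet(alpha * np.ones(num_objectives))


def scalarize(objective_values: np.ndarray, preference: np.ndarray) -> float:
    """Collapse a vector of normalized objective scores into one scalar reward."""
    return float(np.dot(objective_values, preference))


# Usage: each round samples a preference, conditions the molecule sampler on it,
# and scores candidates with the corresponding scalarized reward.
rng = np.random.default_rng(0)
pref = sample_preference(num_objectives=3, rng=rng)
reward = scalarize(np.array([0.8, 0.4, 0.6]), pref)  # hypothetical objective scores
print(pref, reward)
```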