The need for efficient computational screening of molecular candidates with desired properties arises frequently in scientific and engineering problems, including drug discovery and materials design. However, the large size of the candidate search space and the substantial computational cost of high-fidelity property prediction models make screening practically challenging. In this work, we propose a general framework for constructing and optimizing a high-throughput virtual screening (HTVS) pipeline that consists of multi-fidelity models. The central idea is to optimally allocate computational resources to models with varying costs and accuracy so as to maximize the return-on-computational-investment (ROCI). Using both simulated and real data, we demonstrate that the proposed optimal HTVS framework can significantly accelerate screening with virtually no degradation in accuracy. Furthermore, it enables an adaptive operational strategy for HTVS, in which one can trade accuracy for efficiency.
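As a rough illustration of the multi-fidelity idea described above, the following minimal sketch passes candidates through a cheap low-fidelity stage before an expensive high-fidelity stage and reports the hits found and the total cost. Everything in it is an assumption made for illustration, not the method of this work: the toy models `cheap_model` and `exact_model`, the per-candidate costs, the thresholds, and the use of hits per unit cost as a stand-in for ROCI are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Stage:
    score: Callable[[int], int]  # property predictor for this stage (hypothetical)
    cost: float                  # cost per evaluated candidate, arbitrary units
    threshold: int               # candidates scoring below this are discarded


def run_pipeline(candidates: List[int], stages: List[Stage]) -> Tuple[List[int], float]:
    """Filter candidates stage by stage; return the survivors and the total cost."""
    survivors = list(candidates)
    total_cost = 0.0
    for stage in stages:
        # Every candidate that reaches a stage is evaluated by it and paid for.
        total_cost += stage.cost * len(survivors)
        survivors = [c for c in survivors if stage.score(c) >= stage.threshold]
    return survivors, total_cost


def cheap_model(c: int) -> int:
    # Toy low-fidelity model: the true value with a deterministic +/-10 error.
    return c + 10 if c % 2 == 0 else c - 10


def exact_model(c: int) -> int:
    # Toy high-fidelity model: returns the true property value.
    return c


CANDIDATES = list(range(100))  # each candidate's true property is its own value
TARGET = 80                    # a "hit" is a candidate with true property >= 80


def screen(cheap_threshold: Optional[int]) -> Tuple[int, float]:
    """Return (hits found, total cost); None means high-fidelity screening only."""
    stages = [Stage(exact_model, cost=100.0, threshold=TARGET)]
    if cheap_threshold is not None:
        stages.insert(0, Stage(cheap_model, cost=1.0, threshold=cheap_threshold))
    hits, cost = run_pipeline(CANDIDATES, stages)
    return len(hits), cost


if __name__ == "__main__":
    for t in (None, 70, 80):
        hits, cost = screen(t)
        # Hits per unit cost is used here as an assumed stand-in for ROCI.
        print(f"cheap threshold={t}: hits={hits}, cost={cost}, hits/cost={hits / cost:.5f}")
```

In this toy setting, a permissive first-stage threshold recovers every hit at a fraction of the high-fidelity-only cost, while a stricter threshold lowers the cost further but misses some hits. This mirrors, in a very simplified form, the accuracy-for-efficiency trade-off mentioned above.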