Do Deep Learning Methods Really Perform Better in Molecular Conformation Generation?

Molecular conformation generation (MCG) is a fundamental problem in drug discovery. Many traditional methods have been developed to solve it, including systematic searching, model building, random searching, distance geometry, molecular dynamics, and Monte Carlo methods. However, each has limitations that depend on the molecular structure. Recently, many deep learning based MCG methods have been proposed, and they claim to largely outperform the traditional methods. To our surprise, we design a simple and cheap parameter-free algorithm based on the traditional methods and find that it is comparable to, or even outperforms, deep learning based MCG methods on the widely used GEOM-QM9 and GEOM-Drugs benchmarks. In particular, our algorithm is simply a clustering of the RDKit-generated conformations. We hope our findings can help the community revise the deep learning methods for MCG. The code of the proposed algorithm can be found at https://gist.github.com/ZhouGengmo/5b565f51adafcd911c0bc115b2ef027c.
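To make "clustering of RDKit-generated conformations" concrete, the following is a minimal sketch of the general idea, not the authors' implementation (see the linked gist for that). It assumes RDKit is installed, embeds conformers with `AllChem.EmbedMultipleConfs`, computes pairwise RMSDs with `AllChem.GetConformerRMSMatrix`, and then keeps one representative per cluster. The clustering step is a simple greedy leader clustering swapped in for illustration. The helper names (`tri_index`, `leader_cluster`, `rdkit_conformer_rmsds`, `representative_conformers`) and the `threshold`, `num_confs`, and `seed` values are hypothetical. Because this sketch needs an RMSD threshold, it is not parameter-free the way the abstract describes the proposed algorithm.

```python
def tri_index(i, j):
    """Index of the pair (i, j), i != j, in a flattened lower-triangle list
    ordered as [d(1,0), d(2,0), d(2,1), d(3,0), ...]."""
    if i < j:
        i, j = j, i
    return i * (i - 1) // 2 + j


def leader_cluster(tri_dists, n_items, threshold):
    """Greedy leader clustering (illustrative choice, not the paper's method).

    Each item joins the first existing cluster whose leader is within
    `threshold`; otherwise it starts a new cluster. The first element of
    every returned cluster is its leader.
    """
    clusters = []
    for i in range(n_items):
        for cluster in clusters:
            if tri_dists[tri_index(i, cluster[0])] <= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def rdkit_conformer_rmsds(smiles, num_confs=50, seed=42):
    """Embed conformers with RDKit and return (mol, lower-triangle RMSDs).

    RDKit is imported lazily so the clustering helper above can be used
    and tested without it.
    """
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMultipleConfs(mol, numConfs=num_confs, randomSeed=seed)
    # Aligns the conformers to the first one, then returns pairwise RMSDs.
    rmsds = AllChem.GetConformerRMSMatrix(mol, prealigned=False)
    return mol, rmsds


def representative_conformers(smiles, threshold=0.5, num_confs=50):
    """Return the molecule and the positions (within mol.GetConformers())
    of one representative conformer per cluster."""
    mol, rmsds = rdkit_conformer_rmsds(smiles, num_confs=num_confs)
    clusters = leader_cluster(rmsds, mol.GetNumConformers(), threshold)
    return mol, [cluster[0] for cluster in clusters]
```

Calling `representative_conformers("CCCCO")` would return the embedded molecule together with the positions of the conformers kept as cluster leaders. How many survive depends on the chosen threshold and on how many conformers RDKit manages to embed.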
