With the growing availability of data across scientific domains, generative models hold enormous potential to accelerate scientific discovery at every step of the scientific method. Perhaps their most valuable application lies in speeding up what has traditionally been the slowest and most challenging step: coming up with a hypothesis. Powerful representations are now being learned from large volumes of data to generate novel hypotheses, with a major impact on scientific discovery applications ranging from material design to drug discovery. The Generative Toolkit for Scientific Discovery (GT4SD, https://github.com/GT4SD/gt4sd-core) is an extensible open-source library that enables scientists, developers and researchers to train and use state-of-the-art generative models for hypothesis generation in scientific discovery. GT4SD supports a variety of uses of generative models across material science and drug discovery, including molecule discovery and design based on properties related to target proteins, omic profiles, scaffold distances, binding energies and more.