Functional groups and moieties are chemical descriptors of biomolecules that can be used to interpret their properties and functions, which in turn helps explain chemical and biological mechanisms. These chemical building blocks, or substructures, make it possible to identify common molecular subgroups, to assess structural similarities and critical interactions among biological molecules with known activities, and to design novel compounds with similar chemical properties. Here, we introduce a Python-based tool, SPECTRe (Substructure Processing, Enumeration, and Comparison Tool Resource), designed to provide all substructures of a given molecular structure, regardless of molecule size. SPECTRe enumerates and generates substructures efficiently using classical graph traversal algorithms (breadth-first and depth-first search) and represents them in the human-readable SMILES format. We demonstrate the application of SPECTRe to a set of 10,375 molecules in the molecular weight range of 27 to 350 Da (≤26 non-hydrogen atoms), spanning a wide array of structure-based chemical functionalities and chemical classes. We found that the substructure count, taken as a measure of molecular complexity, depends strongly on the number of unique atom and bond types present, the degree of branching, and the presence of rings. Substructure counts are similar for molecules within particular chemical classes and for molecules grouped by characteristic topological features. SPECTRe shows promise for many cheminformatics applications, such as virtual screening for drug discovery, property prediction, fingerprint-based molecular similarity searching, and data mining for frequent substructures.
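To make the enumeration idea concrete, the following is a minimal sketch of breadth-first substructure enumeration on a molecular graph. It is not SPECTRe's implementation: the function name `enumerate_connected_substructures`, the adjacency-dictionary input, and the three toy carbon skeletons are all assumptions made for illustration. The sketch counts connected subsets of heavy-atom indices; it does not generate SMILES and does not merge chemically equivalent substructures, both of which the tool described above does.

```python
from collections import deque


def enumerate_connected_substructures(adjacency):
    """Return every connected subset of atoms in a molecular graph.

    adjacency: dict mapping an atom index to the set of its bonded neighbours
    (heavy atoms only). This input format is a hypothetical choice for the
    sketch, not SPECTRe's actual interface.
    """
    seen = set()
    queue = deque()

    # Seed the breadth-first search with every single-atom substructure.
    for atom in adjacency:
        single = frozenset([atom])
        seen.add(single)
        queue.append(single)

    while queue:
        current = queue.popleft()

        # Collect atoms bonded to the current substructure but not yet in it.
        frontier = set()
        for atom in current:
            frontier |= adjacency[atom]
        frontier -= current

        # Grow the substructure by one atom; skip subsets already visited.
        for neighbour in frontier:
            grown = current | {neighbour}
            if grown not in seen:
                seen.add(grown)
                queue.append(grown)

    return seen


# Hypothetical four-carbon skeletons (hydrogens omitted).
butane = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}            # linear chain
isobutane = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}         # branched
cyclobutane = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}  # ring

for name, graph in [("butane", butane),
                    ("isobutane", isobutane),
                    ("cyclobutane", cyclobutane)]:
    print(name, len(enumerate_connected_substructures(graph)))
```

Under this simplified counting, the linear chain yields 10 connected atom subsets, the branched skeleton 11, and the ring 13, which mirrors the qualitative trend stated above that branching and rings raise the substructure count. Because the sketch counts index subsets rather than unique SMILES strings, its numbers are not comparable to counts reported by SPECTRe.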