The rapid discovery of new reactions and molecules in recent years has been facilitated by advances in high-throughput screening, access to a much more complex chemical design space, and the development of accurate molecular modeling frameworks. A holistic study of the growing chemistry literature is therefore needed, one that focuses on understanding recent trends and extrapolating them into possible future trajectories. To this end, several network-theory-based studies have been reported that use a directed graph representation of chemical reactions. Here, we instead represent chemical reactions as a hypergraph, in which hyperedges represent reactions and nodes represent the participating molecules. We use a standard reaction dataset to construct a hypernetwork and report its statistics, including degree distributions, average path length, assortativity (degree correlations), PageRank centrality, and graph-based clusters (communities). We also compute each statistic for an equivalent directed graph representation of the reactions to draw parallels and highlight differences between the two. To demonstrate the applicability of the hypergraph reaction representation to AI tasks, we generate dense hypergraph embeddings and use them for reaction classification. We conclude that the hypernetwork representation is flexible, preserves reaction context, and uncovers insights that are not apparent in a traditional directed graph representation of chemical reactions.
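The difference between the two representations can be illustrated with a minimal sketch. It is not the construction used in the study: the three reactions and the molecule labels `A` to `G` are hypothetical, and the helper names (`hypergraph_degrees`, `directed_edges`, `digraph_degrees`) are introduced here for illustration only. Each reaction is stored as one hyperedge (a reactant set and a product set), and the equivalent directed graph is assumed to be obtained by linking every reactant to every product.

```python
from collections import Counter
from itertools import product

# Toy reactions (hypothetical, for illustration only).
# Each hyperedge keeps the full reaction: (reactant set, product set).
reactions = {
    "r1": ({"A", "B"}, {"C"}),
    "r2": ({"C", "D"}, {"E", "F"}),
    "r3": ({"A", "E"}, {"G"}),
}

def hypergraph_degrees(reactions):
    """Hypergraph degree: number of reactions (hyperedges) a molecule takes part in."""
    deg = Counter()
    for reactants, products in reactions.values():
        for mol in reactants | products:
            deg[mol] += 1
    return deg

def directed_edges(reactions):
    """Assumed directed-graph expansion: one edge per (reactant, product) pair."""
    edges = set()
    for reactants, products in reactions.values():
        edges.update(product(reactants, products))
    return edges

def digraph_degrees(edges):
    """Total degree (in-degree plus out-degree) in the pairwise directed graph."""
    deg = Counter()
    for src, dst in edges:
        deg[src] += 1
        deg[dst] += 1
    return deg

hyper_deg = hypergraph_degrees(reactions)
edges = directed_edges(reactions)
graph_deg = digraph_degrees(edges)

for mol in sorted(hyper_deg):
    print(mol, "hypergraph degree:", hyper_deg[mol], "digraph degree:", graph_deg[mol])
```

In this toy case, molecule `C` participates in two reactions but touches four edges in the pairwise graph, and the directed edges `C -> E` and `D -> E` no longer record that `C` and `D` react together. This is the kind of reaction context that a hyperedge keeps intact.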