Overconfident mispredictions under distributional shift are a serious concern for Graph Neural Networks (GNNs) used in critical drug-discovery tasks, and they call for extensive reliability research. Here we first introduce CardioTox, a real-world benchmark on drug cardio-toxicity, to facilitate such efforts. Our exploratory study shows that overconfident mispredictions are often distant from the training data. This leads us to develop distance-aware GNNs: GNN-SNGP. Through evaluation on CardioTox and three established benchmarks, we demonstrate that GNN-SNGP increases distance-awareness, reduces overconfident mispredictions, and produces better-calibrated predictions without sacrificing accuracy. Our ablation study further reveals that the representation learned by GNN-SNGP preserves distances better than that of its base architecture, and that this is one major factor behind the improvements.