Chemical toxicity prediction using machine learning is important in drug development: it reduces repeated animal and human testing, saving cost and time. It is highly desirable that the predictions of computational toxicology models be mechanistically explainable. However, current state-of-the-art machine learning classifiers are based on deep neural networks, which tend to be complex and hard to interpret. In this paper, we apply a recently developed method, the contrastive explanations method (CEM), to explain why a chemical or molecule is predicted to be toxic or non-toxic. In contrast to popular methods that explain predictions based on what features are present in the molecule, CEM provides an additional explanation in terms of what features are missing from the molecule yet crucial to the prediction, known as the pertinent negative. CEM does this by finding a minimal perturbation of the input, optimized with a projected fast iterative shrinkage-thresholding algorithm (FISTA). We verify that the explanations from CEM match known toxicophores and findings from related work.
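To illustrate the optimization machinery the abstract refers to, the following is a minimal sketch of a projected FISTA loop for a pertinent-negative-style search. It is not the paper's exact objective or model: the classifier here is a hypothetical linear score `w`, the loss is a simple hinge that rewards flipping the model's decision, and the projection clips the perturbed input to the unit box. The soft-thresholding (shrinkage) step encourages a sparse perturbation, mirroring the L1 regularization used in elastic-net-style CEM objectives.

```python
import numpy as np

# Hypothetical stand-in for a trained classifier: a linear score,
# NOT the deep neural network used in the paper.
w = np.array([1.0, -2.0, 0.5, 0.0])

def hinge_loss_grad(delta, x, kappa=0.1):
    """Gradient of a hinge loss max(score + kappa, 0) that pushes the
    perturbed input x + delta toward the opposite decision."""
    score = w @ (x + delta)
    return w if score + kappa > 0 else np.zeros_like(w)

def shrink(z, beta):
    """Soft-thresholding: proximal operator of beta * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - beta, 0.0)

def projected_fista(x, steps=300, lr=0.05, beta=0.01, c=1.0, l2=0.1):
    """Find a small, sparse perturbation delta such that x + delta stays
    in [0, 1]^d while the (toy) classifier's decision flips."""
    delta = np.zeros_like(x)
    y = delta.copy()          # momentum iterate
    t = 1.0
    for _ in range(steps):
        g = c * hinge_loss_grad(y, x) + 2.0 * l2 * y   # smooth part
        z = shrink(y - lr * g, lr * beta)              # L1 shrinkage step
        z = np.clip(x + z, 0.0, 1.0) - x               # project to valid inputs
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = z + ((t - 1.0) / t_next) * (z - delta)     # FISTA momentum
        delta, t = z, t_next
    return delta

x = np.array([0.8, 0.1, 0.6, 0.3])   # hypothetical input, score w @ x = 0.9
delta = projected_fista(x)
# delta stays zero on the irrelevant feature (w[3] == 0), illustrating sparsity,
# and x + delta remains a valid input in [0, 1]^d.
```

In the actual CEM setting, the linear score would be replaced by the trained network's logits, and separate sign constraints on `delta` distinguish pertinent positives (features removed) from pertinent negatives (features added).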