Complex prediction models, such as those from deep learning, are the result of fitting machine learning, neural network, or AI models to a set of training data. They are now standard tools in science. A key challenge with the current generation of models is that they are highly parameterized, which makes their prediction strategies difficult to describe and interpret. We use topological data analysis to transform these complex prediction models into pictures that give a topological view of their predictions. The result is a map of the predictions that can be inspected. The methods scale to large datasets across different domains and enable us to detect labeling errors in training data, understand generalization in image classification, and inspect predictions of likely pathogenic mutations in the BRCA1 gene.