Traditional machine learning (ML) algorithms, such as multiple regression, require human analysts to decide how to treat the data. These decisions can make the model-building process subjective and difficult to replicate for those who did not build the model. Deep learning approaches benefit from allowing the model to learn which features are important once the human analyst builds the architecture. Thus, a method for automating certain human decisions in traditional ML modeling would improve reproducibility and remove subjective aspects of the model-building process. To that end, we propose using shape metrics to describe 2D data in order to make analyses more explainable and interpretable. The proposed approach provides a foundation for automating various aspects of model building in an interpretable and explainable fashion. This is particularly important in medical applications, where the `right to explainability' is crucial. We provide a variety of simulated data sets, including probability distributions, functions, and model quality control checks (such as Q-Q plots and residual analyses from ordinary least squares), to showcase the breadth of this approach.
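To make the idea of shape metrics for 2D data concrete, here is a minimal sketch of summarizing a curve, such as a Q-Q plot or a residual trace, with a few scalar descriptors. The specific descriptors below (fitted slope, arc length, area of deviation from the fitted line) are illustrative assumptions, not the metrics developed in this work:

```python
import numpy as np

def shape_metrics(x, y):
    """A few simple descriptors of a 2D curve given as points (x, y).

    Illustrative choices only (assumptions for this sketch, not the
    paper's actual metric set):
    - slope of a least-squares line through the points,
    - arc length of the connecting polyline,
    - total area between the curve and its fitted line.
    """
    slope, intercept = np.polyfit(x, y, 1)
    resid = np.abs(y - (slope * x + intercept))
    # Trapezoid rule for the area of |residuals| over x.
    abs_area = float(np.sum(0.5 * (resid[:-1] + resid[1:]) * np.diff(x)))
    arc_length = float(np.sum(np.hypot(np.diff(x), np.diff(y))))
    return {"slope": float(slope), "arc_length": arc_length, "abs_area": abs_area}

# A straight line does not deviate from its own linear fit; a parabola does,
# so its abs_area descriptor is clearly nonzero.
x = np.linspace(-1.0, 1.0, 201)
line = shape_metrics(x, 2.0 * x + 1.0)
parab = shape_metrics(x, x ** 2)
```

Scalar features of this kind could then feed a rule or classifier that flags, for example, systematic curvature in a Q-Q plot or structure in an OLS residual plot, without a human analyst inspecting each figure by eye.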