Model interpretations are often used in practice to extract real-world insights from machine learning models. These interpretations have a wide range of applications: they can be presented as business recommendations or used to evaluate model bias. It is vital for a data scientist to choose trustworthy interpretations in order to drive real-world impact. Doing so requires an understanding of how a model's accuracy affects the quality of standard interpretation tools. In this paper, we explore how a model's predictive accuracy affects interpretation quality. We propose two metrics to quantify the quality of an interpretation and design an experiment to test how these metrics vary with model accuracy. We find that for datasets that can be modeled accurately by a variety of methods, simpler methods yield higher-quality interpretations. We also identify which interpretation method works best at lower levels of model accuracy.