The successful application of machine learning (ML) methods becomes increasingly dependent on their interpretability or explainability. Designing explainable ML systems is instrumental to ensuring transparency of automated decision-making that targets humans. The explainability of ML methods is also an essential ingredient for trustworthy artificial intelligence. A key challenge in ensuring explainability is its dependence on the specific human user ("explainee"). The users of machine learning methods might have vastly different background knowledge about machine learning principles: one user might hold a university degree in machine learning or a related field, while another might never have received formal training in high-school mathematics. This paper applies information-theoretic concepts to develop a novel measure for the subjective explainability of the predictions delivered by an ML method. We construct this measure via the conditional entropy of the predictions, given a user signal. This user signal might be obtained from user surveys or biophysical measurements. Our main contribution is the explainable empirical risk minimization (EERM) principle of learning a hypothesis that optimally balances subjective explainability and risk. The EERM principle is flexible and can be combined with arbitrary machine learning models. We present several practical implementations of EERM for linear models and decision trees. Numerical experiments demonstrate the application of EERM to detecting the use of inappropriate language on social media.
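To make the EERM principle concrete, the following is a minimal sketch for a linear model. It assumes the conditional entropy of the predictions given the user signal is approximated by the within-group variance of the predictions, where groups are the distinct values of a discretized user signal; the function name `eerm_linear`, the squared-error risk, and the plain gradient-descent solver are illustrative choices, not the paper's exact implementation.

```python
import numpy as np


def eerm_linear(X, y, u, lam=1.0, lr=0.01, steps=2000):
    """Gradient-descent sketch of EERM for a linear hypothesis.

    Minimizes: mean squared error + lam * within-group variance of the
    predictions, with groups given by the distinct values of the user
    signal u. The within-group variance serves as a crude proxy for the
    conditional entropy H(yhat | u); lam trades off risk against
    subjective explainability.
    """
    n, d = X.shape
    w = np.zeros(d)
    groups = [np.where(u == g)[0] for g in np.unique(u)]
    for _ in range(steps):
        yhat = X @ w
        # gradient of the empirical risk (mean squared error)
        grad = 2 * X.T @ (yhat - y) / n
        # gradient of the explainability penalty: within each group,
        # pull the predictions toward the group's mean prediction
        for idx in groups:
            centered = yhat[idx] - yhat[idx].mean()
            grad += lam * 2 * X[idx].T @ centered / n
        w -= lr * grad
    return w
```

Setting `lam=0` recovers plain empirical risk minimization; increasing `lam` shrinks the spread of predictions within each user-signal group, making the predictions easier to anticipate from the user signal alone at the cost of a higher empirical risk.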