The successful application of machine learning (ML) methods is becoming increasingly dependent on their interpretability or explainability. Designing explainable ML systems is instrumental to ensuring the transparency of automated decision-making that affects humans. The explainability of ML methods is also an essential ingredient of trustworthy artificial intelligence. A key challenge in ensuring explainability is its dependence on the specific human user (the "explainee"). Users of ML methods might have vastly different background knowledge about ML principles: one user might have a university degree in machine learning or a related field, while another might never have received formal training in high-school mathematics. This paper applies information-theoretic concepts to develop a novel measure of the subjective explainability of the predictions delivered by an ML method. We construct this measure via the conditional entropy of the predictions, given user feedback. The user feedback might be obtained from user surveys or biophysical measurements. Our main contribution is the explainable empirical risk minimization (EERM) principle of learning a hypothesis that optimally balances subjective explainability and risk. The EERM principle is flexible and can be combined with arbitrary ML models. We present several practical implementations of EERM for linear models and decision trees. Numerical experiments demonstrate the application of EERM to detecting the use of inappropriate language on social media.
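As a rough illustration of the measure described above, the following Python sketch computes a plug-in estimate of the conditional entropy of predictions given user feedback, assuming both take values in small discrete sets. The function names `conditional_entropy` and `eerm_objective`, the weight `lam`, and the weighted-sum form of the objective are assumptions made for this illustration; they are not taken from the paper, and the EERM implementations for linear models and decision trees may look quite different.

```python
import math
from collections import Counter


def conditional_entropy(predictions, feedback):
    """Plug-in estimate of H(prediction | feedback) in bits.

    Assumes discrete predictions and discrete user feedback.
    A lower value means the predictions are easier to anticipate
    from the user feedback, i.e. more explainable to that user.
    """
    n = len(predictions)
    joint = Counter(zip(feedback, predictions))  # counts of (u, y_hat) pairs
    marginal = Counter(feedback)                 # counts of u alone
    h = 0.0
    for (u, _), count in joint.items():
        p_joint = count / n           # empirical p(u, y_hat)
        p_cond = count / marginal[u]  # empirical p(y_hat | u)
        h -= p_joint * math.log2(p_cond)
    return h


def eerm_objective(empirical_risk, cond_entropy, lam):
    """Hypothetical weighted-sum trade-off between risk and explainability.

    This form is an assumption for illustration only.
    """
    return empirical_risk + lam * cond_entropy


# Predictions fully determined by the feedback: no residual uncertainty.
h_aligned = conditional_entropy([0, 0, 1, 1], [0, 0, 1, 1])

# Predictions unrelated to the feedback: one bit of residual uncertainty.
h_unrelated = conditional_entropy([0, 1, 0, 1], [0, 0, 1, 1])
```

Under this sketch, a hypothesis whose predictions are fully determined by the user feedback incurs no explainability penalty, while one whose predictions are unrelated to the feedback pays the full entropy of its predictions, scaled by `lam`.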