We introduce region-based explanations (RbX), a novel, model-agnostic method to generate local explanations of scalar outputs from a black-box prediction model using only query access. RbX is based on a greedy algorithm for building a convex polytope that approximates a region of feature space where model predictions are close to the prediction at some target point. This region is fully specified by the user on the scale of the predictions, rather than on the scale of the features. The geometry of this polytope, specifically the change in each coordinate necessary to escape the polytope, quantifies the local sensitivity of the predictions to each of the features. These "escape distances" can then be standardized to rank the features by local importance. RbX is guaranteed to satisfy a "sparsity axiom," which requires that features that do not enter into the prediction model are assigned zero importance. At the same time, real data examples and synthetic experiments show that RbX detects all locally relevant features more readily than existing methods.
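To make the notion of an escape distance concrete, the following is a minimal sketch under strong simplifying assumptions, not the RbX algorithm itself. It does not build a convex polytope greedily. Instead it takes the region to be the set of points where the prediction stays within a user-chosen tolerance `eps` of the prediction at the target point, and searches along each coordinate axis on a fixed grid for the smallest move that leaves that set. The names `escape_distances` and `importance`, the parameters `max_step` and `n_grid`, the reciprocal standardization, and the toy model `f` are all hypothetical choices made for illustration.

```python
import numpy as np


def escape_distances(f, x0, eps, max_step=10.0, n_grid=200):
    """Axis-aligned escape distances around x0 (illustrative only).

    For each feature j, return the smallest grid step t such that moving
    x0 by +t or -t along coordinate j changes the prediction by more than
    eps. Features that never escape within max_step get np.inf.
    Only query access to f is used.
    """
    x0 = np.asarray(x0, dtype=float)
    f0 = f(x0)
    steps = np.linspace(0.0, max_step, n_grid + 1)[1:]  # skip t = 0
    dist = np.full(x0.size, np.inf)
    for j in range(x0.size):
        for t in steps:
            escaped = False
            for sign in (1.0, -1.0):
                x = x0.copy()
                x[j] += sign * t
                if abs(f(x) - f0) > eps:
                    escaped = True
            if escaped:
                dist[j] = t
                break
    return dist


def importance(dist, scale=1.0):
    """Assumed standardization: feature scale divided by escape distance.

    An infinite escape distance maps to zero importance.
    """
    return np.asarray(scale, dtype=float) / dist


# Toy black-box model: the third feature does not enter the prediction.
def f(x):
    return 3.0 * x[0] + x[1] ** 2


x0 = np.array([0.0, 1.0, 5.0])
dist = escape_distances(f, x0, eps=0.5)
imp = importance(dist)
ranking = np.argsort(-imp)  # most important feature first
```

In this toy setting the unused third feature never leaves the region, so its escape distance is infinite and its importance is exactly zero, which mirrors the sparsity axiom described above. The tolerance `eps` is set on the scale of the predictions, while the resulting distances are on the scale of the features.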