In this paper, we introduce a novel method for generating interpretable regression function estimators. The idea is based on so-called data-dependent coverings. The aim is to extract from the data a covering of the feature space rather than a partition. The estimator predicts the empirical conditional expectation over the cells of the partition generated from the covering. Thus, such an estimator has the same form as those produced by data-dependent partitioning algorithms. We give sufficient conditions that ensure consistency without requiring the cell-shrinkage condition that appears in the earlier literature. This allows us to reduce the number of covering elements. We show that such coverings are interpretable, with each element of the covering tagged as significant or insignificant. The proof of consistency relies on a control of the error of the empirical estimation of conditional expectations, which is of independent interest.
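To make the form of the estimator concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the covering is already given as a list of axis-aligned boxes (the abstract does not say how the covering is extracted from the data), and it assumes that the partition generated from the covering groups points by which covering elements contain them. The names `signature` and `fit_covering_estimator`, the box representation, and the fallback to the global mean for cells with no training point are all hypothetical choices made for illustration. The tagging of elements as significant or insignificant is not reproduced here.

```python
from collections import defaultdict
from statistics import fmean


def signature(x, covering):
    """Return a tuple of booleans: which covering elements contain x.

    Assumption: each covering element is an axis-aligned box given as a
    list of (low, high) bounds, one pair per coordinate of x.
    """
    return tuple(
        all(lo <= xi <= hi for xi, (lo, hi) in zip(x, box))
        for box in covering
    )


def fit_covering_estimator(X, y, covering):
    """Build a predictor that returns the empirical mean of y per cell.

    Two points lie in the same cell of the partition generated from the
    covering when they have the same membership signature.
    """
    cells = defaultdict(list)
    for xi, yi in zip(X, y):
        cells[signature(xi, covering)].append(yi)

    cell_means = {sig: fmean(vals) for sig, vals in cells.items()}
    global_mean = fmean(y)  # hypothetical fallback for unseen cells

    def predict(x):
        return cell_means.get(signature(x, covering), global_mean)

    return predict


# Toy one-dimensional example: two overlapping intervals covering [0, 1].
covering = [[(0.0, 0.6)], [(0.4, 1.0)]]
X = [(0.1,), (0.2,), (0.5,), (0.8,), (0.9,)]
y = [1.0, 3.0, 10.0, 5.0, 7.0]

predict = fit_covering_estimator(X, y, covering)
print(predict((0.3,)))   # point covered by the first interval only
print(predict((0.45,)))  # point in the overlap of both intervals
print(predict((0.7,)))   # point covered by the second interval only
```

In this toy example the two overlapping intervals generate three non-empty cells (first interval only, overlap, second interval only), so the estimator is piecewise constant exactly as it would be for a partition-based method, while the description handed to the user consists of only two covering elements.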