Scientists frequently prioritize learning from data rather than training the best possible model; however, research in machine learning often prioritizes the latter. Marginal feature importance methods, such as marginal contribution feature importance (MCI), attempt to break this trend by providing a useful framework for explaining relationships in data in an interpretable fashion. In this work, we generalize the framework of marginal contribution feature importance to improve the detection of correlated interactions and to reduce runtime. To do so, we consider "information subsets" of the set of features $F$ and show that our importance metric can be computed directly after applying fair representation learning methods from the AI fairness literature. We consider and experimentally explore optimal transport and linear regression as methods for removing all information about a feature of interest $f$ from the feature set $F$. Given these implementations, we show on real and simulated data that ultra-marginal feature importance (UMFI) performs at least as well as MCI, with substantially faster computation and better performance in the presence of correlated interactions and unrelated features.
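To make the linear-regression variant concrete, the following is a minimal Python sketch of the two steps the abstract describes: removing the (linearly recoverable) information of a feature $f$ from the remaining features by residualization, then scoring $f$ by the performance gained when it is added back to the dependence-removed set. The names `remove_linear_dependence` and `umfi_linear`, the random-forest evaluator, and the cross-validated $R^2$ score are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def remove_linear_dependence(X, f_col):
    """Residualize every other column of X on column f_col, removing
    the information about that feature that a linear model can recover."""
    f = X[:, [f_col]]
    X_rest = np.delete(X, f_col, axis=1)
    residuals = np.empty_like(X_rest, dtype=float)
    for j in range(X_rest.shape[1]):
        reg = LinearRegression().fit(f, X_rest[:, j])
        residuals[:, j] = X_rest[:, j] - reg.predict(f)
    return residuals

def umfi_linear(X, y, f_col, cv=5):
    """Hypothetical UMFI-style score for one feature: the gain in
    cross-validated performance from adding the feature back to the
    dependence-removed feature set (negative gains clipped to zero)."""
    f = X[:, [f_col]]
    X_res = remove_linear_dependence(X, f_col)
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    base = cross_val_score(model, X_res, y, cv=cv).mean()
    with_f = cross_val_score(model, np.hstack([X_res, f]), y, cv=cv).mean()
    return max(with_f - base, 0.0)
```

Residualization handles only linear dependence; the optimal-transport variant mentioned above would replace `remove_linear_dependence` with a transport-based map to handle nonlinear dependence as well.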