Motivated by neuroscientific and clinical applications, we empirically examine whether observational measures of information flow can suggest interventions. We do so by performing experiments on artificial neural networks in the context of fairness in machine learning, where the goal is to induce fairness in the system through interventions. Using our recently developed $M$-information flow framework, we measure the flow of information about the true label (responsible for accuracy, and hence desirable), and separately, the flow of information about a protected attribute (responsible for bias, and hence undesirable) on the edges of a trained neural network. We then compare the flow magnitudes against the effect of intervening on those edges by pruning. We show that pruning edges that carry larger information flows about the protected attribute reduces bias at the output to a greater extent. This demonstrates that $M$-information flow can meaningfully suggest targets for interventions, answering the title's question in the affirmative. We also evaluate bias-accuracy tradeoffs for different intervention strategies, to analyze how one might use estimates of desirable and undesirable information flows (here, accuracy and bias flows) to inform interventions that preserve the former while reducing the latter.