While previous distribution shift detection approaches can identify whether a shift has occurred, they cannot localize which specific features caused it, a critical step in diagnosing or fixing any underlying issue. For example, in military sensor networks, users will want to detect when one or more sensors have been compromised and, critically, to know which specific sensors might be compromised. Thus, we first formalize this problem as multiple conditional distribution hypothesis tests and propose both non-parametric and parametric statistical tests. For both efficiency and flexibility, we then propose a test statistic based on the density model score function (i.e., the gradient of the log-density with respect to the input), which makes it easy to compute test statistics for all dimensions in a single forward and backward pass. Any density model can be used to compute the necessary statistics, including deep density models such as normalizing flows or autoregressive models. We additionally develop methods for identifying when and where a shift occurs in multivariate time-series data and show results for multiple scenarios using realistic attack models on both simulated and real-world data.
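As a rough illustration of the score-based idea, the sketch below fits one density model to a reference sample and one to a query sample, then compares the two models' score functions feature by feature. Several choices here are assumptions made only for illustration and are not taken from the abstract: the density model is a full-covariance Gaussian (so the score has a closed form and no automatic differentiation is needed), the per-feature statistic is the mean squared difference between the two scores, the "attack" is a simple mean shift on one sensor, and the helper names (`fit_gaussian`, `gaussian_score`, `score_shift_statistic`) are hypothetical.

```python
import numpy as np


def fit_gaussian(x):
    """Fit a full-covariance Gaussian; return its mean and precision matrix."""
    mu = x.mean(axis=0)
    # Small ridge term keeps the covariance safely invertible.
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
    return mu, np.linalg.inv(cov)


def gaussian_score(x, mu, precision):
    """Gradient of log N(x; mu, cov) with respect to x, one row per sample."""
    return -(x - mu) @ precision


def score_shift_statistic(ref, qry):
    """Per-feature mean squared difference between the two models' scores.

    Assumed statistic for illustration: entry j compares the j-th score
    component of the reference model with that of the query model.
    """
    mu_p, prec_p = fit_gaussian(ref)
    mu_q, prec_q = fit_gaussian(qry)
    pooled = np.vstack([ref, qry])
    diff = gaussian_score(pooled, mu_p, prec_p) - gaussian_score(pooled, mu_q, prec_q)
    return (diff ** 2).mean(axis=0)


# Simulated example: 4 correlated sensors, sensor 2 is shifted in the query set.
rng = np.random.default_rng(0)
d = 4
cov = 0.5 * np.eye(d) + 0.5 * np.ones((d, d))  # unit variances, correlation 0.5
ref = rng.multivariate_normal(np.zeros(d), cov, size=2000)
qry = rng.multivariate_normal(np.zeros(d), cov, size=2000)
qry[:, 2] += 2.0  # hypothetical attack: constant offset on sensor 2

stats = score_shift_statistic(ref, qry)
suspect = int(np.argmax(stats))
print("per-feature statistics:", np.round(stats, 3))
print("most suspicious feature:", suspect)
```

All features receive a statistic from the same pair of score evaluations, which mirrors the single-pass property described above. Because the sensors are correlated, the unshifted features also get small nonzero values, so a real test would need a calibrated threshold (for example, from resampling the pooled data), which this sketch omits. With a deep density model, the closed-form `gaussian_score` would be replaced by a gradient obtained through automatic differentiation.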