Numerous methods exist for detecting anomalies in time series, but detection is only the first step toward understanding them. We go further by explaining the detected anomalies. To this end, we develop a novel attribution scheme for multivariate time series that relies on counterfactual reasoning. We aim to answer the counterfactual question: would the anomalous event have occurred if a subset of the involved variables had been distributed more similarly to the data outside the anomalous interval? Specifically, we detect anomalous intervals using the Maximally Divergent Interval (MDI) algorithm, replace a subset of variables with their in-distribution values within the detected interval, and observe whether the interval has become less anomalous by re-scoring it with MDI. We evaluate our method on multivariate temporal and spatio-temporal data and confirm the accuracy of our anomaly attribution for multiple well-understood extreme climate events such as heatwaves and hurricanes.
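
The replace-and-re-score idea can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring function below is a simplified stand-in for MDI (a Gaussian KL divergence between the interval and the remaining data), the "in-distribution values" are assumed to be rows resampled from outside the interval, and the names `gaussian_kl_score` and `attribute` are hypothetical.

```python
from itertools import combinations

import numpy as np


def gaussian_kl_score(series, start, end):
    """Stand-in anomaly score (assumption, not the MDI library):
    KL divergence between Gaussian fits of the interval and of the rest."""
    inside = series[start:end]
    outside = np.concatenate([series[:start], series[end:]])
    d = series.shape[1]
    mu_i, mu_o = inside.mean(axis=0), outside.mean(axis=0)
    # A small ridge keeps both covariance estimates invertible.
    cov_i = np.cov(inside, rowvar=False).reshape(d, d) + 1e-6 * np.eye(d)
    cov_o = np.cov(outside, rowvar=False).reshape(d, d) + 1e-6 * np.eye(d)
    inv_o = np.linalg.inv(cov_o)
    diff = mu_o - mu_i
    _, logdet_i = np.linalg.slogdet(cov_i)
    _, logdet_o = np.linalg.slogdet(cov_o)
    return 0.5 * (np.trace(inv_o @ cov_i) + diff @ inv_o @ diff
                  - d + logdet_o - logdet_i)


def attribute(series, start, end, max_subset_size=1, seed=0):
    """Return the baseline score and, per variable subset, the score drop
    obtained after replacing that subset inside the interval."""
    rng = np.random.default_rng(seed)
    base = gaussian_kl_score(series, start, end)
    outside_idx = np.concatenate(
        [np.arange(0, start), np.arange(end, len(series))])
    drops = {}
    for k in range(1, max_subset_size + 1):
        for subset in combinations(range(series.shape[1]), k):
            cols = list(subset)
            counterfactual = series.copy()
            # Assumed replacement rule: resample rows from outside the interval.
            rows = rng.choice(outside_idx, size=end - start, replace=True)
            counterfactual[start:end, cols] = series[np.ix_(rows, cols)]
            drops[subset] = base - gaussian_kl_score(counterfactual, start, end)
    return base, drops


# Toy data: three variables, with a mean shift injected into variable 1 only.
rng = np.random.default_rng(42)
x = rng.normal(size=(500, 3))
x[200:230, 1] += 6.0
base, drops = attribute(x, 200, 230)
print(base, drops)
```

In this toy setting, the subset whose replacement produces the largest score drop is the one the sketch would blame for the anomaly. Resampling rather than substituting a constant mean is a deliberate choice here, since a constant would collapse the interval's variance and distort the Gaussian score.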

### Related content

Network meta-analysis (NMA) of rare events has attracted little attention in the literature. Until recently, networks of interventions with rare events were analyzed using the inverse-variance NMA approach. However, when events are rare the normal approximation made by this model can be poor and effect estimates are potentially biased. Other methods for the synthesis of such data are the recent extension of the Mantel-Haenszel approach to NMA or the use of the non-central hypergeometric distribution. In this article, we suggest a new common-effect NMA approach that can be applied even in networks of interventions with extremely low or even zero number of events without requiring study exclusion or arbitrary imputations. Our method is based on the implementation of the penalized likelihood function proposed by Firth for bias reduction of the maximum likelihood estimate to the logistic expression of the NMA model. A limitation of our method is that heterogeneity cannot be taken into account as an additive parameter as in most meta-analytical models. However, we account for heterogeneity by incorporating a multiplicative overdispersion term using a two-stage approach. We show through simulations that our method performs consistently well across all tested scenarios and most often results in smaller bias than other available methods. We also illustrate the use of our method through two clinical examples. We conclude that our "penalized likelihood NMA" approach is promising for the analysis of binary outcomes with rare events especially for networks with very few studies per comparison and very low control group risks.
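
For orientation, Firth's bias-reducing penalty in its standard general form can be written as follows. The notation is ours rather than the article's: $\ell(\beta)$ denotes the log-likelihood of the logistic model and $I(\beta)$ its Fisher information matrix.

$$\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\log\det I(\beta)$$

Maximizing $\ell^{*}$ instead of $\ell$ yields finite estimates even when some study arms contain zero events, which is why no study exclusion or arbitrary imputation is needed.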

In many applications, it is important to be able to explain the decisions of machine learning systems. An increasingly popular approach has been to seek to provide *counterfactual instance explanations*. These specify close possible worlds in which, contrary to the facts, a person receives their desired decision from the machine learning system. This paper will draw on literature from the philosophy of science to argue that a satisfactory explanation must consist of both counterfactual instances and a causal equation (or system of equations) that support the counterfactual instances. We will show that counterfactual instances by themselves explain little. We will further illustrate how explainable AI methods that provide both causal equations and counterfactual instances can successfully explain machine learning predictions.

Various goodness-of-fit tests are designed based on the so-called information matrix equivalence: if the assumed model is correctly specified, two information matrices that are derived from the likelihood function are equivalent. In the literature, this principle has been established for the likelihood function with fully observed data, but it has not been verified under the likelihood for censored data. In this manuscript, we prove the information matrix equivalence in the framework of semiparametric copula models for multivariate censored survival data. Based on this equivalence, we propose an information ratio (IR) test for the specification of the copula function. The IR statistic is constructed via comparing consistent estimates of the two information matrices. We derive the asymptotic distribution of the IR statistic and propose a parametric bootstrap procedure for the finite-sample $P$-value calculation. The performance of the IR test is investigated via a simulation study and a real data example.

In paired design studies, it is common to have multiple measurements taken for the same set of subjects under different conditions. In observational studies, it is often of interest to conduct pair matching on multiple covariates between a treatment group and a control group, and to test the treatment effect represented by multiple response variables on well pair-matched data. However, an effective test for multivariate paired data is lacking. The multivariate paired Hotelling's $T^2$ test can sometimes be used, but its power decreases quickly as the dimension increases. Existing methods for assessing the balance of multiple covariates in matched observational studies usually ignore the paired structure and thus do not perform well under some settings. In this work, we propose a new non-parametric test for paired data, which exhibits a substantial power improvement over existing methods under a wide range of situations. We also derive the asymptotic distribution of the new test, and simulation studies show that the approximate $p$-value is reasonably accurate in finite samples, even when the dimension is larger than the sample size, making the new test an easy off-the-shelf tool for real applications. The proposed test is illustrated through an analysis of a real data set from Alzheimer's disease research.

In the first part of this work, we develop a novel scheme for solving nonparametric regression problems, that is, approximating functions of possibly low regularity from noisy values observed at random points. Our proposed scheme is based on the pseudo-inverse of a random projection matrix, combined with specific properties of the Jacobi polynomial system and of positive definite random matrices. The scheme is stable, robust, accurate, and fairly fast in execution time. Moreover, unlike most existing nonparametric regression estimators, our proposed estimator requires no extra regularization step. Although this estimator is initially designed to work with a random sampling set of univariate i.i.d. random variables following a Beta distribution, we show that it still works for a wide range of sampling distributions. We also briefly describe how our estimator can be adapted to handle multivariate random sampling sets. In the second part of this work, we extend the random pseudo-inverse scheme to build a stable and accurate estimator for linear functional regression (LFR) problems. A dyadic decomposition approach is used to construct this estimator. The performance of the two proposed estimators is illustrated by various numerical simulations. In particular, a real dataset is used to illustrate the performance of our nonparametric regression estimator.

Centralized Training with Decentralized Execution (CTDE) has been a popular paradigm in cooperative Multi-Agent Reinforcement Learning (MARL) settings and is widely used in many real applications. One of the major challenges in the training process is credit assignment, which aims to deduce the contribution of each agent from the global rewards. Existing credit assignment methods focus on either decomposing the joint value function into individual value functions or measuring the impact of local observations and actions on the global value function. These approaches lack a thorough consideration of the complicated interactions among multiple agents, leading to an unsuitable assignment of credit and subsequently mediocre results on MARL. We propose Shapley Counterfactual Credit Assignment, a novel method for explicit credit assignment which accounts for the coalition of agents. Specifically, the Shapley Value and its desired properties are leveraged in deep MARL to credit any combination of agents, which grants us the capability to estimate the individual credit for each agent. Despite this capability, the main technical difficulty lies in the computational complexity of the Shapley Value, which grows factorially with the number of agents. We instead utilize an approximation method via Monte Carlo sampling, which reduces the sample complexity while maintaining its effectiveness. We evaluate our method on StarCraft II benchmarks across different scenarios. Our method significantly outperforms existing cooperative MARL algorithms and achieves state-of-the-art results, with especially large margins on the more difficult tasks.
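
The Monte Carlo approximation mentioned above can be sketched generically with permutation sampling. This is an illustration under stated assumptions, not the paper's deep MARL implementation: `monte_carlo_shapley` and the toy `coalition_value` function are hypothetical names, and in the paper's setting the coalition value would come from a learned critic rather than a hand-written function.

```python
import random


def monte_carlo_shapley(agents, coalition_value, num_samples=2000, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled agent orderings (permutation sampling)."""
    rng = random.Random(seed)
    credit = {agent: 0.0 for agent in agents}
    for _ in range(num_samples):
        order = list(agents)
        rng.shuffle(order)
        coalition = set()
        previous = coalition_value(frozenset(coalition))
        for agent in order:
            coalition.add(agent)
            current = coalition_value(frozenset(coalition))
            # Marginal contribution of this agent given those already present.
            credit[agent] += current - previous
            previous = current
    return {agent: total / num_samples for agent, total in credit.items()}


def coalition_value(coalition):
    """Toy value function (assumption): 'a' and 'b' only score together,
    while 'c' contributes independently."""
    joint = 10.0 if {"a", "b"} <= coalition else 0.0
    solo = 2.0 if "c" in coalition else 0.0
    return joint + solo


estimates = monte_carlo_shapley(["a", "b", "c"], coalition_value)
print(estimates)
```

For this toy game the exact Shapley values are 5, 5, and 2, so the estimates for `a` and `b` should land near 5 while `c` receives exactly 2. Each sampled ordering costs one value evaluation per agent, in contrast to the factorial cost of enumerating all orderings.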

Existing work in counterfactual Learning to Rank (LTR) has focussed on optimizing feature-based models that predict the optimal ranking based on document features. LTR methods based on bandit algorithms often optimize tabular models that memorize the optimal ranking per query. These types of model have their own advantages and disadvantages. Feature-based models provide very robust performance across many queries, including those previously unseen; however, the available features often limit the rankings the model can predict. In contrast, tabular models can converge on any possible ranking through memorization. However, memorization is extremely prone to noise, which makes tabular models reliable only when large numbers of user interactions are available. Can we develop a robust counterfactual LTR method that pursues memorization-based optimization whenever it is safe to do so? We introduce the Generalization and Specialization (GENSPEC) algorithm, a robust feature-based counterfactual LTR method that pursues per-query memorization when it is safe to do so. GENSPEC optimizes a single feature-based model for generalization (robust performance across all queries) and many tabular models for specialization (each optimized for high performance on a single query). GENSPEC uses novel relative high-confidence bounds to choose which model to deploy per query. By doing so, GENSPEC combines the high performance of successfully specialized tabular models with the robustness of a generalized feature-based model. Our results show that GENSPEC leads to optimal performance on queries with sufficient click data, while having robust behavior on queries with little or noisy data.

Modeling multivariate time series has long been a subject that has attracted researchers from a diverse range of fields including economics, finance, and traffic. A basic assumption behind multivariate time series forecasting is that its variables depend on one another but, upon looking closely, it is fair to say that existing methods fail to fully exploit latent spatial dependencies between pairs of variables. In recent years, meanwhile, graph neural networks (GNNs) have shown high capability in handling relational dependencies. GNNs require well-defined graph structures for information propagation which means they cannot be applied directly for multivariate time series where the dependencies are not known in advance. In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. Our approach automatically extracts the uni-directed relations among variables through a graph learning module, into which external knowledge like variable attributes can be easily integrated. A novel mix-hop propagation layer and a dilated inception layer are further proposed to capture the spatial and temporal dependencies within the time series. The graph learning, graph convolution, and temporal convolution modules are jointly learned in an end-to-end framework. Experimental results show that our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets and achieves on-par performance with other approaches on two traffic datasets which provide extra structural information.

The prevalence of networked sensors and actuators in many real-world systems such as smart buildings, factories, power plants, and data centers generates substantial amounts of multivariate time series data for these systems. The rich sensor data can be continuously monitored for intrusion events through anomaly detection. However, conventional threshold-based anomaly detection methods are inadequate due to the dynamic complexities of these systems, while supervised machine learning methods are unable to exploit the large amounts of data due to the lack of labeled data. On the other hand, current unsupervised machine learning approaches have not fully exploited the spatial-temporal correlation and other dependencies amongst the multiple variables (sensors/actuators) in the system for detecting anomalies. In this work, we propose an unsupervised multivariate anomaly detection method based on Generative Adversarial Networks (GANs). Instead of treating each data stream independently, our proposed MAD-GAN framework considers the entire variable set concurrently to capture the latent interactions amongst the variables. We also fully exploit both the generator and discriminator produced by the GAN, using a novel anomaly score called DR-score to detect anomalies by discrimination and reconstruction. We have tested our proposed MAD-GAN using two recent datasets collected from real-world CPS: the Secure Water Treatment (SWaT) and the Water Distribution (WADI) datasets. Our experimental results show that the proposed MAD-GAN is effective in reporting anomalies caused by various cyber-intrusions in these complex real-world systems.

Methods that align distributions by minimizing an adversarial distance between them have recently achieved impressive results. However, these approaches are difficult to optimize with gradient descent and they often do not converge well without careful hyperparameter tuning and proper initialization. We investigate whether turning the adversarial min-max problem into an optimization problem by replacing the maximization part with its dual improves the quality of the resulting alignment and explore its connections to Maximum Mean Discrepancy. Our empirical results suggest that using the dual formulation for the restricted family of linear discriminators results in a more stable convergence to a desirable solution when compared with the performance of a primal min-max GAN-like objective and an MMD objective under the same restrictions. We test our hypothesis on the problem of aligning two synthetic point clouds on a plane and on a real-image domain adaptation problem on digits. In both cases, the dual formulation yields an iterative procedure that gives more stable and monotonic improvement over time.

Theodoros Evrenoglou,Ian White,Sivem Afach,Dimitris Mavridis,Anna Chaimani
Jingru Zhang,Hao Chen,Xiao-Hua Zhou
Jiahui Li,Kun Kuang,Baoxiang Wang,Furui Liu,Long Chen,Fei Wu,Jun Xiao
Zonghan Wu,Shirui Pan,Guodong Long,Jing Jiang,Xiaojun Chang,Chengqi Zhang
Dan Li,Dacheng Chen,Lei Shi,Baihong Jin,Jonathan Goh,See-Kiong Ng
Ben Usman,Kate Saenko,Brian Kulis
