Large-Scale Multiple Testing for Matrix-Valued Data under Double Dependency

High-dimensional inference based on matrix-valued data has drawn increasing attention in modern statistical research, yet little progress has been made on large-scale multiple testing designed specifically for such data. Motivated by this gap, we consider an electroencephalography (EEG) experiment that produces matrix-valued data and offers scope for developing novel multiple testing methods for matrix-valued data that control false discoveries for the hypotheses of interest in such an experiment. The row-column cross-dependency of observations arranged in matrix form, referred to as double-dependency, is one of the main challenges in developing such methods. We address it by assuming a matrix normal distribution for the observations at each of the independent matrix data points. This allows us to fully capture the underlying double-dependency, as encoded in the row- and column-covariance matrices, and to develop methods that are potentially more powerful than the corresponding method (e.g., Fan and Han (2017)) obtained by vectorizing each data point and thereby ignoring the double-dependency. We propose two methods to approximate the false discovery proportion with statistical accuracy. One is a general approach under double-dependency; the other is more computationally efficient in higher dimensions. Extensive numerical studies illustrate the superior performance of the proposed methods over the principal factor approximation method of Fan and Han (2017). The proposed methods are further applied to the aforementioned EEG data.
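
To make the matrix normal assumption concrete, the following is a minimal sketch, not the methods proposed in the article. It relies on the standard property that if a p x q matrix X follows a matrix normal distribution with mean M, row covariance U and column covariance V, then the column-stacked vector vec(X) is multivariate normal with mean vec(M) and covariance V ⊗ U (a Kronecker product). The function names `sample_matrix_normal` and `vec` and the example covariance values are hypothetical, chosen only for illustration.

```python
import numpy as np


def sample_matrix_normal(mean, row_cov, col_cov, n, rng):
    """Draw n independent p x q matrices from MN(mean, row_cov, col_cov).

    Uses X = mean + A Z B^T, where A A^T = row_cov, B B^T = col_cov,
    and Z has independent standard normal entries.
    """
    p, q = mean.shape
    a = np.linalg.cholesky(row_cov)  # lower-triangular factor of the row covariance
    b = np.linalg.cholesky(col_cov)  # lower-triangular factor of the column covariance
    z = rng.standard_normal((n, p, q))
    return mean + a @ z @ b.T  # matmul broadcasts over the leading sample axis


def vec(x):
    """Column-stacking vectorization of a batch of matrices, shape (n, p*q)."""
    return x.transpose(0, 2, 1).reshape(x.shape[0], -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    row_cov = np.array([[1.0, 0.5],
                        [0.5, 1.0]])          # illustrative 2 x 2 row covariance
    col_cov = np.array([[1.0, 0.3, 0.1],
                        [0.3, 1.0, 0.3],
                        [0.1, 0.3, 1.0]])     # illustrative 3 x 3 column covariance
    x = sample_matrix_normal(np.zeros((2, 3)), row_cov, col_cov, 200_000, rng)

    # The empirical covariance of vec(X) should be close to kron(col_cov, row_cov).
    empirical = np.cov(vec(x), rowvar=False)
    print(np.abs(empirical - np.kron(col_cov, row_cov)).max())
```

In this sketch the double-dependency is described by the p(p+1)/2 + q(q+1)/2 free entries of the two covariance matrices, whereas an unstructured covariance for the vectorized data point would have pq(pq+1)/2 free entries. This illustrates why a method that treats vec(X) as a generic high-dimensional vector does not exploit the row-column structure.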
