Change-points are a routine feature of 'big data' observed in the form of high-dimensional data streams. In many such data streams, the component series possess group structures, and it is natural to assume that changes occur in only a small number of the groups. We propose a new change-point procedure, called 'groupInspect', that exploits this group sparsity structure to estimate a projection direction, which aggregates information across the component series in order to estimate the change-point in the mean structure of the series. We prove that the estimated projection direction is minimax optimal, up to logarithmic factors, when all group sizes are of comparable order. Moreover, our theory provides strong guarantees on the rate of convergence of the change-point location estimator. Numerical studies demonstrate the competitive performance of groupInspect in a wide range of settings, and a real data example confirms the practical usefulness of our procedure.
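To make the idea of a group-sparse projection concrete, the following is a minimal sketch and not the groupInspect algorithm itself, whose details are not given here. It assumes a p x n data matrix with a single change in mean, computes the standard CUSUM transformation, and imposes group sparsity by group-wise soft-thresholding before taking the leading left singular vector as the projection direction. The thresholding step, the function names, and the tuning parameter `lam` are all illustrative assumptions.

```python
import numpy as np


def cusum_transform(X):
    """Row-wise CUSUM transformation of a p x n matrix; returns p x (n-1)."""
    p, n = X.shape
    t = np.arange(1, n)                      # candidate change-points 1..n-1
    csum = np.cumsum(X, axis=1)[:, :-1]      # partial sums of the first t columns
    total = X.sum(axis=1, keepdims=True)
    scale = np.sqrt(n / (t * (n - t)))
    return scale * (t / n * total - csum)


def group_sparse_direction(T, groups, lam):
    """Hypothetical helper: shrink each group's block of rows towards zero
    (group-wise soft-thresholding, an assumed choice), then return the
    leading left singular vector of the result."""
    S = np.zeros_like(T)
    for g in groups:
        norm = np.linalg.norm(T[g, :])       # Frobenius norm of the group's block
        if norm > lam:
            S[g, :] = (1.0 - lam / norm) * T[g, :]
    if not S.any():
        S = T                                # threshold removed everything; fall back
    U, _, _ = np.linalg.svd(S, full_matrices=False)
    return U[:, 0]


def estimate_change_point(X, groups, lam):
    """Project the data onto the estimated direction and locate the peak of
    the absolute CUSUM statistic of the projected univariate series."""
    T = cusum_transform(X)
    v = group_sparse_direction(T, groups, lam)
    projected = cusum_transform((v @ X)[None, :])[0]
    return int(np.argmax(np.abs(projected))) + 1


def simulate(p=20, n=200, group_size=5, change_at=120, shift=2.0, seed=0):
    """Illustrative data: Gaussian noise with a mean shift in the first group only."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, n))
    X[:group_size, change_at:] += shift
    groups = [np.arange(i, i + group_size) for i in range(0, p, group_size)]
    return X, groups
```

For example, `X, groups = simulate()` followed by `estimate_change_point(X, groups, lam=60.0)` returns an estimated change-point location for the simulated stream. The value of `lam` here is chosen by hand for this toy setting; a principled choice would depend on the noise level, the dimension, and the group sizes.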