For the data segmentation problem in high-dimensional linear regression settings, a commonly made assumption is that the regression parameters are segment-wise sparse, which enables many existing methods to estimate the parameters locally via $\ell_1$-regularised maximum likelihood-type estimation and contrast them for change point detection. Contrary to this common belief, we show that sparsity of neither the regression parameters nor their differences, a.k.a.\ differential parameters, is necessary for achieving consistency in multiple change point detection. In fact, both statistically and computationally, better efficiency is attained by a simple strategy that scans for large discrepancies in the local covariance between the regressors and the response. We go a step further and propose a suite of tools for directly performing inference on the differential parameters post-segmentation, which are applicable even when the regression parameters themselves are non-sparse. Theoretical investigations are conducted under general conditions permitting non-Gaussianity, temporal dependence and ultra-high dimensionality. Numerical experiments demonstrate the competitiveness of the proposed methodologies.
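To fix ideas, the covariance-scanning strategy can be illustrated with a minimal simulation. The sketch below is an assumption-laden toy version, not the paper's actual procedure: it uses a coordinate-wise CUSUM contrast of the products $z_i = x_i y_i$ (whose mean is $\Sigma\beta$, the covariance between regressors and response) to locate a single change point, with all data-generating choices (dimensions, signal, seed) invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (all settings hypothetical): one change point at t = 100
# in a p-dimensional linear regression y_i = x_i' beta + eps_i.
n, p, cp = 200, 50, 100
X = rng.standard_normal((n, p))
beta_pre = np.zeros(p); beta_pre[:5] = 1.0    # pre-change coefficients
beta_post = np.zeros(p); beta_post[:5] = -1.0  # post-change coefficients
y = np.concatenate([X[:cp] @ beta_pre, X[cp:] @ beta_post])
y += rng.standard_normal(n)

# z_i = x_i * y_i is an unbiased proxy for Cov(x, y) = Sigma * beta,
# so a shift in beta shows up as a shift in the mean of the z_i,
# without ever estimating beta itself (no sparsity of beta needed).
Z = X * y[:, None]  # n x p matrix of per-observation products

def cusum_max(Z, t):
    """Largest coordinate-wise CUSUM discrepancy at candidate split t."""
    n = Z.shape[0]
    left = Z[:t].mean(axis=0)
    right = Z[t:].mean(axis=0)
    scale = np.sqrt(t * (n - t) / n)
    return scale * np.max(np.abs(left - right))

# Scan over candidate split points (trimming the boundaries).
candidates = range(10, n - 10)
stats = np.array([cusum_max(Z, t) for t in candidates])
est_cp = 10 + int(np.argmax(stats))
print(est_cp)  # close to the true change point 100 in this simulation
```

The key point the sketch mirrors is that the scan contrasts local covariances directly, so it remains meaningful even when the regression parameters are dense and $\ell_1$-regularised local estimation would fail.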