The past decade has witnessed a surge of work on statistical inference for high-dimensional sparse regression, particularly via debiasing or relaxed orthogonalization. Nevertheless, these techniques typically require a more stringent sparsity condition than is needed for estimation consistency, which seriously limits their practical applicability. To relax this constraint, we propose to exploit the identifiable features to residualize the design matrix before performing debiasing-based inference on the parameters of interest. This leads to a hybrid orthogonalization (HOT) technique that performs strict orthogonalization against the identifiable features but relaxed orthogonalization against the others. Under an approximately sparse model with a mixture of identifiable and unidentifiable signals, we establish the asymptotic normality of the HOT test statistic while accommodating as many identifiable signals as consistent estimation allows. The efficacy of the proposed test is also demonstrated through simulations and an analysis of a stock market dataset.
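To make the two kinds of orthogonalization concrete, the following is a minimal sketch and not the authors' exact procedure. It assumes the set of identifiable features `S` is already known and non-overlapping with the target column `j`, uses ordinary least squares for the strict step, and uses a plain coordinate-descent lasso with a hand-picked penalty `lam` for the relaxed step. The names `hot_vector`, `lasso_cd`, `S`, and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(a, t):
    # Soft-thresholding operator used by the lasso coordinate update.
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def lasso_cd(A, y, lam, n_iter=500):
    """Minimize (1/(2n)) * ||y - A g||^2 + lam * ||g||_1 by coordinate descent."""
    n, q = A.shape
    g = np.zeros(q)
    col_sq = (A ** 2).sum(axis=0) / n
    r = y.copy()  # running residual y - A g
    for _ in range(n_iter):
        for k in range(q):
            if col_sq[k] == 0.0:
                continue
            rho = A[:, k] @ r / n + col_sq[k] * g[k]
            g_new = soft_threshold(rho, lam) / col_sq[k]
            r -= A[:, k] * (g_new - g[k])
            g[k] = g_new
    return g

def hot_vector(X, j, S, lam):
    """Illustrative hybrid orthogonalization vector for column j.

    S   : indices assumed identifiable (strict orthogonalization)
    lam : lasso penalty for the remaining columns (relaxed orthogonalization)
    """
    n, p = X.shape
    S = list(S)
    S_set = set(S)
    R = [k for k in range(p) if k != j and k not in S_set]
    XS = X[:, S]

    def residualize(M):
        # Strict step: remove the least-squares projection onto X_S.
        if XS.shape[1] == 0:
            return M.copy()
        coef, *_ = np.linalg.lstsq(XS, M, rcond=None)
        return M - XS @ coef

    xj_res = residualize(X[:, j])
    XR_res = residualize(X[:, R])
    # Relaxed step: lasso regression of the residualized target column
    # on the residualized remaining columns.
    gamma = lasso_cd(XR_res, xj_res, lam)
    z = xj_res - XR_res @ gamma
    return z, R

# Small synthetic check (assumed setup, not data from the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 30))
S = [1, 2, 3]
z, R = hot_vector(X, j=0, S=S, lam=0.1)
print("max |X_S' z|     :", np.abs(X[:, S].T @ z).max())
print("max |X_R' z| / n :", np.abs(X[:, R].T @ z).max() / X.shape[0])
```

In this sketch the resulting vector `z` is orthogonal to the columns in `S` up to floating-point error, while its normalized inner products with the remaining columns are only bounded by roughly `lam`, which follows from the lasso optimality conditions. A debiased estimate of the `j`-th coefficient would then combine `z` with an initial estimator in the usual way; that step is omitted here because the abstract does not specify it.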