The $k$-tensor Ising model is an exponential family on the $p$-dimensional binary hypercube for modeling dependent binary data. Its sufficient statistic consists of all $k$-fold products of the observations, and its parameter is an unknown $k$-fold tensor designed to capture higher-order interactions between the binary variables. In this paper, we describe a penalization-based approach that recovers the signed support of the tensor parameter with high probability, assuming that no nonzero entry of the true tensor is too close to zero. The method is based on $\ell_1$-regularized node-wise logistic regression, which recovers the signed neighborhood of each node with high probability. Our analysis is carried out in the high-dimensional regime, which allows the dimension $p$ of the Ising model, as well as the interaction factor $k$, to grow to $\infty$ with the sample size $n$. We show that if the minimum interaction strength is not too small, then consistent recovery of the entire signed support is possible with $n = \Omega((k!)^8 d^3 \log \binom{p-1}{k-1})$ samples, where $d$ denotes the maximum degree of the hypernetwork in question. Our results are validated in two simulation settings and applied to a real neurobiological dataset consisting of multi-array electrophysiological recordings from the mouse visual cortex, to model higher-order interactions between brain regions.
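To make the node-wise idea concrete, here is a minimal Python sketch, not the paper's implementation. It relies on the fact that, for spins in $\{-1, +1\}$, the conditional log-odds of one node given the others is linear in the products of every $(k-1)$-subset of the remaining nodes, so each node can be handled by a logistic regression on those $\binom{p-1}{k-1}$ features. Everything specific below is an assumption made for illustration: the function names, the toy hypergraph with a single hyperedge, the penalty level `lam = 0.05`, the exact sampler that enumerates all $2^p$ states, and the plain proximal-gradient (ISTA) solver.

```python
from itertools import combinations, product

import numpy as np


def sample_tensor_ising(hyperedges, p, n, rng):
    """Draw n exact samples from P(x) proportional to
    exp(sum_e theta_e * prod_{i in e} x_i), with x in {-1, +1}^p.

    Enumerates all 2**p states, so this is only usable for tiny p.
    """
    states = np.array(list(product([-1, 1], repeat=p)))
    energy = np.zeros(len(states))
    for edge, theta in hyperedges.items():
        energy += theta * np.prod(states[:, list(edge)], axis=1)
    probs = np.exp(energy - energy.max())
    probs /= probs.sum()
    return states[rng.choice(len(states), size=n, p=probs)]


def node_features(X, node, k):
    """Products over every (k-1)-subset of the coordinates other than `node`."""
    others = [j for j in range(X.shape[1]) if j != node]
    subsets = list(combinations(others, k - 1))
    Z = np.column_stack([np.prod(X[:, list(s)], axis=1) for s in subsets])
    return Z.astype(float), subsets


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def l1_logistic(Z, y, lam, n_iter=500):
    """Proximal gradient (ISTA) for mean logistic loss + lam * ||w||_1.

    Labels y are in {-1, +1}; no intercept is fitted.
    """
    n, m = Z.shape
    # The Hessian of the mean logistic loss is bounded by Z^T Z / (4n),
    # so 4n / ||Z||_2^2 is a safe constant step size.
    step = 4.0 * n / np.linalg.norm(Z, 2) ** 2
    w = np.zeros(m)
    for _ in range(n_iter):
        margin = y * (Z @ w)
        grad = -(Z.T @ (y / (1.0 + np.exp(margin)))) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w


def signed_neighborhood(X, node, k, lam, tol=1e-8):
    """Map each selected (k-1)-subset to the sign of its fitted coefficient."""
    Z, subsets = node_features(X, node, k)
    w = l1_logistic(Z, X[:, node].astype(float), lam)
    return {s: int(np.sign(wj)) for s, wj in zip(subsets, w) if abs(wj) > tol}


rng = np.random.default_rng(0)
true_edges = {(0, 1, 2): 1.0}  # hypothetical toy hypergraph: p = 4, k = 3
X = sample_tensor_ising(true_edges, p=4, n=4000, rng=rng)
for node in range(4):
    print(node, signed_neighborhood(X, node, k=3, lam=0.05))
```

In this toy setting, nodes 0, 1 and 2 share one positive hyperedge and node 3 is isolated, so one would hope to see a single positive entry for each of the first three nodes and an empty neighborhood for the last. The penalty here is hand-picked for the example; the abstract does not state how the regularization level is chosen.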