Regression models with crossed random effect errors can be very expensive to compute. The cost of both generalized least squares and Gibbs sampling can easily grow as $N^{3/2}$ (or worse) for $N$ observations. Papaspiliopoulos et al. (2020) present a collapsed Gibbs sampler that costs $O(N)$, but under an extremely stringent sampling model. We propose a backfitting algorithm to compute a generalized least squares estimate and prove that it costs $O(N)$. A critical part of the proof is in ensuring that the number of iterations required is $O(1)$ which follows from keeping a certain matrix norm below $1-\delta$ for some $\delta>0$. Our conditions are greatly relaxed compared to those for the collapsed Gibbs sampler, though still strict. Empirically, the backfitting algorithm has a norm below $1-\delta$ under conditions that are less strict than those in our assumptions. We illustrate the new algorithm on a ratings data set from Stitch Fix.
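The backfitting idea for a two-factor crossed random effects model can be illustrated with a minimal sketch. This is not the paper's algorithm, only a generic shrunken-means backfit on simulated data with assumed-known variance components (`sig_a2`, `sig_b2`, `sig_e2` are hypothetical values); each sweep visits every observation once, so a sweep costs $O(N)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a two-factor crossed layout: y[k] = mu + a[row[k]] + b[col[k]] + noise.
R, C, N = 30, 40, 600                     # row levels, column levels, observations
row = rng.integers(0, R, N)
col = rng.integers(0, C, N)
a_true = rng.normal(0.0, 1.0, R)
b_true = rng.normal(0.0, 1.0, C)
y = 2.0 + a_true[row] + b_true[col] + rng.normal(0.0, 0.5, N)

# Hypothetical variance components, taken as known for this sketch.
sig_e2, sig_a2, sig_b2 = 0.25, 1.0, 1.0

# Backfitting: alternately refit each random-effect vector to the partial
# residuals of the other, with ridge/BLUP-style shrinkage toward zero.
mu = y.mean()
a = np.zeros(R)
b = np.zeros(C)
n_r = np.bincount(row, minlength=R)       # observations per row level
n_c = np.bincount(col, minlength=C)       # observations per column level
for sweep in range(50):
    r = y - mu - b[col]                   # partial residuals for factor a
    a = np.bincount(row, weights=r, minlength=R) / (n_r + sig_e2 / sig_a2)
    r = y - mu - a[row]                   # partial residuals for factor b
    b = np.bincount(col, weights=r, minlength=C) / (n_c + sig_e2 / sig_b2)
```

Each update is a closed-form shrunken per-level mean computed via `np.bincount`, so the cost per sweep is linear in $N$; the abstract's contribution is proving that, under its conditions, the number of such sweeps needed is $O(1)$.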