Sparse network asymptotics for logistic regression

Consider a bipartite network where $N$ consumers choose to buy or not to buy $M$ different products. This paper considers the properties of the logistic regression of the $N\times M$ array of i-buys-j purchase decisions, $\left[Y_{ij}\right]_{1\leq i\leq N,1\leq j\leq M}$, onto known functions of consumer and product attributes under asymptotic sequences where (i) both $N$ and $M$ grow large and (ii) the average number of products purchased per consumer is finite in the limit. This latter assumption implies that the network of purchases is sparse: only a (very) small fraction of all possible purchases is actually made, as in many real-world settings. Under sparse network asymptotics, the first and last terms in an extended Hoeffding-type variance decomposition of the score of the logit composite log-likelihood are of equal order. In contrast, under dense network asymptotics, the last term is asymptotically negligible. Asymptotic normality of the logistic regression coefficients is shown using a martingale central limit theorem (CLT) for triangular arrays. Unlike in the dense case, the normality result derived here also holds under degeneracy of the network graphon. Relatedly, when there happens to be no dyadic dependence in the dataset at hand, it specializes to recently derived results on the behavior of logistic regression with rare events and iid data. Sparse network asymptotics may lead to better inference in practice because they suggest variance estimators which (i) incorporate additional sources of sampling variation and (ii) are valid under varying degrees of dyadic dependence.
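
To make the sparse sequence concrete, the following is a minimal simulation sketch, not the paper's design. It assumes one uniformly distributed attribute per consumer and per product, uses their product as the "known function" of attributes, and lets the logit intercept drift like $\log(1/M)$ so that expected purchases per consumer stay bounded as $N$ and $M$ grow. The function names `simulate_sparse_purchases` and `fit_logit`, and all parameter values, are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_sparse_purchases(n_consumers, n_products, avg_purchases=3.0,
                              beta=1.0, seed=0):
    """Draw an N x M purchase array whose expected row sums stay bounded."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=n_consumers)  # hypothetical consumer attribute
    x = rng.uniform(-1.0, 1.0, size=n_products)   # hypothetical product attribute
    r = np.outer(w, x)                            # known function of both attributes
    # Assumption: the intercept drifts like log(1/M), so each purchase
    # probability is O(1/M) and purchases per consumer stay O(1).
    alpha = np.log(avg_purchases / n_products)
    p = 1.0 / (1.0 + np.exp(-(alpha + beta * r)))
    y = (rng.random((n_consumers, n_products)) < p).astype(float)
    return y, r


def fit_logit(y, r, n_iter=25):
    """Pooled logit of y_ij on (1, r_ij), fitted by Newton's method."""
    yv = y.ravel()
    X = np.column_stack([np.ones(yv.size), r.ravel()])
    ybar = yv.mean()
    # Start at the marginal log-odds, which is close to the optimum
    # when purchases are rare.
    theta = np.array([np.log(ybar / (1.0 - ybar)), 0.0])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ theta))
        score = X.T @ (yv - p)
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X
        theta = theta + np.linalg.solve(hessian, score)
    return theta


for size in (100, 200, 400):
    y, r = simulate_sparse_purchases(size, size)
    density = y.mean()                       # fraction of possible purchases made
    mean_purchases = y.sum(axis=1).mean()    # average purchases per consumer
    print(size, density, mean_purchases, fit_logit(y, r))
```

In this sketch the network density shrinks as the array grows while the average number of purchases per consumer stays roughly constant, which is the sense of "sparse" used above. The fit only returns point estimates; it says nothing about the variance estimators the paper discusses.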
