For statistical analysis of network data, the $\beta$-model has emerged as a useful tool, thanks to its flexibility in incorporating nodewise heterogeneity and theoretical tractability. To generalize the $\beta$-model, this paper proposes the Sparse $\beta$-Regression Model (S$\beta$RM) that unites two research themes developed recently in modelling homophily and sparsity. In particular, we employ differential heterogeneity that assigns weights only to important nodes and propose penalized likelihood with an $\ell_1$ penalty for parameter estimation. While our estimation method is closely related to the LASSO method for logistic regression, we develop new theory emphasizing the use of our model for dealing with a parameter regime that can handle sparse networks usually seen in practice. More interestingly, the resulting inference on the homophily parameter demands no debiasing normally employed in LASSO type estimation. We provide extensive simulation and data analysis to illustrate the use of the model. As a special case of our model, we extend the Erd\H{o}s-R\'{e}nyi model by including covariates and develop the associated statistical inference for sparse networks, which may be of independent interest.
翻译:暂无翻译