A Sparse Random Graph Model for Sparse Directed Networks

An increasingly urgent task in the analysis of networks is to develop statistical models that include contextual information in the form of covariates while respecting degree heterogeneity and sparsity. In this paper, we propose a new parameter-sparse random graph model for density-sparse directed networks, with parameters that explicitly account for all of these features. The resulting objective function of our model is akin to that of high-dimensional logistic regression, with the key difference that the probabilities are allowed to go to zero at a certain rate to accommodate sparse networks. We show that, under appropriate conditions, an estimator obtained by the familiar penalized likelihood with an $\ell_1$ penalty to achieve parameter sparsity can alleviate the curse of dimensionality and, crucially, is selection and rate consistent. Interestingly, inference on the covariate parameter can be conducted straightforwardly after model fitting, without the kind of debiasing commonly employed in $\ell_1$-penalized likelihood estimation. Simulation and data analysis corroborate our theoretical findings. In developing our model, we provide the first result highlighting the fallacy of what we call data-selective inference, a common practice of artificially truncating the sample by throwing away nodes based on their connections, by examining the estimation bias in the Erdős-Rényi model theoretically and in the stochastic block model empirically.
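The data-selective inference point can be illustrated with a small simulation. The sketch below is not the paper's analysis; it assumes a directed Erdős-Rényi graph in which every ordered pair is linked independently with probability `p`, and it compares the plain edge-density estimate of `p` with the estimate obtained after discarding nodes that have no incoming or outgoing edges. The function names and the values `n = 200`, `p = 0.005` are hypothetical choices made for illustration.

```python
import random


def simulate_directed_er(n, p, seed):
    """Sample a directed Erdos-Renyi graph as a set of ordered pairs (i, j), i != j."""
    rng = random.Random(seed)
    return {
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and rng.random() < p
    }


def edge_density(edges, n_nodes):
    """Estimate p as (number of edges) / (number of ordered node pairs)."""
    if n_nodes < 2:
        raise ValueError("need at least two nodes to estimate an edge probability")
    return len(edges) / (n_nodes * (n_nodes - 1))


def connected_nodes(edges):
    """Nodes that appear in at least one edge, as sender or receiver."""
    return {node for edge in edges for node in edge}


n, p = 200, 0.005  # illustrative values, not taken from the paper
edges = simulate_directed_er(n, p, seed=0)

# Estimate from the full sample of n nodes.
p_full = edge_density(edges, n)

# Estimate after throwing away isolated nodes: the edge count is unchanged,
# but the number of candidate pairs shrinks, so the estimate cannot decrease.
kept = connected_nodes(edges)
p_truncated = edge_density(edges, len(kept))

print(f"true p = {p}, full-sample estimate = {p_full:.5f}")
print(f"kept {len(kept)} of {n} nodes, truncated estimate = {p_truncated:.5f}")
```

Because every edge survives the truncation while the denominator shrinks, the truncated estimate is never below the full-sample one, and in a sparse graph with many isolated nodes the gap can be substantial.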

