The complexity of real-world systems implies that standard statistical hypothesis tests may not be adequate for them. Specifically, we show that the null distribution of the likelihood-ratio test needs to be modified to accommodate the complexity found in multi-edge network data. When working with independent observations, the p-values of likelihood-ratio tests are approximated using a $\chi^2$ distribution. However, this approximation should not be used with multi-edge network data, which are characterized by multiple correlations and competitions that make the standard approximation unsuitable. We address the problem by approximating the null distribution of the likelihood-ratio test with a Beta distribution. Finally, we show empirically that even for a small multi-edge network, the standard $\chi^2$ approximation gives erroneous results, while the proposed Beta approximation yields correct p-value estimates.
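For readers unfamiliar with the baseline, the standard $\chi^2$ approximation mentioned above can be sketched for the simplest independent-observation case. This is only an illustration of the classical procedure, not of the Beta approximation proposed in the paper, whose parameters are not given in this abstract. The binomial setting and the function names (`binomial_lr_statistic`, `chi2_sf_1df`, `lr_test_pvalue`) are assumptions chosen for the example.

```python
import math


def binomial_lr_statistic(successes: int, trials: int, p0: float) -> float:
    """Return -2 log(likelihood ratio) for H0: p = p0 against an unrestricted p.

    Assumes independent Bernoulli observations (hypothetical example setting).
    """
    def term(count: float, expected: float) -> float:
        # Convention: 0 * log(0) = 0, so empty cells contribute nothing.
        return count * math.log(count / expected) if count > 0 else 0.0

    failures = trials - successes
    return 2.0 * (term(successes, trials * p0) + term(failures, trials * (1.0 - p0)))


def chi2_sf_1df(x: float) -> float:
    """Survival function of a chi-squared distribution with 1 degree of freedom.

    If Z is standard normal, Z**2 is chi-squared with 1 degree of freedom,
    so P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


def lr_test_pvalue(successes: int, trials: int, p0: float) -> float:
    """Approximate p-value of the likelihood-ratio test via the chi-squared law."""
    return chi2_sf_1df(binomial_lr_statistic(successes, trials, p0))


if __name__ == "__main__":
    # 60 successes in 100 independent trials, testing H0: p = 0.5.
    print(lr_test_pvalue(60, 100, 0.5))
```

The sketch relies on the observations being independent, which is exactly the assumption the abstract argues fails for multi-edge network data.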