Network alignment aims to uncover topologically similar regions in the protein-protein interaction (PPI) networks of two or more species, under the assumption that topologically similar regions perform similar functions. Although a plethora of network alignment algorithms and measures of topological similarity exist, there is currently no "gold standard" for evaluating how well either uncovers functionally similar regions. Here we propose a formal, mathematically and statistically rigorous method for evaluating the statistical significance of shared Gene Ontology (GO) terms in a global, 1-to-1 alignment between two PPI networks. We use combinatorics to count exactly the number of possible network alignments in which $k$ proteins share a particular GO term. Dividing this count by the number of all possible network alignments yields an explicit, exact $p$-value for a network alignment with respect to that GO term.
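The counting idea can be illustrated with a minimal sketch. It is not the paper's derivation, only one simplified reading of it, and it rests on several assumptions that the abstract does not state: an alignment is taken to be an injective map from the smaller network (`n1` nodes) into the larger one (`n2` nodes), `lam1` and `lam2` are the numbers of proteins annotated with the GO term in each network, every alignment is equally likely under the null model, and the $p$-value is the probability of seeing at least `k` annotated-to-annotated pairs. All function and parameter names are hypothetical.

```python
from fractions import Fraction
from math import comb, perm


def count_alignments_exactly_k(n1, n2, lam1, lam2, k):
    """Count injective maps G1 -> G2 in which exactly k annotated G1
    proteins are aligned to annotated G2 proteins (assumed null model)."""
    if not (0 <= lam1 <= n1 <= n2 and 0 <= lam2 <= n2):
        raise ValueError("require 0 <= lam1 <= n1 <= n2 and 0 <= lam2 <= n2")
    if k < 0 or k > min(lam1, lam2):
        return 0
    # Pick the k annotated G2 targets, then assign k annotated G1
    # proteins to them in order: C(lam2, k) * P(lam1, k).
    matched = comb(lam2, k) * perm(lam1, k)
    # The other lam1 - k annotated G1 proteins must land on
    # unannotated G2 proteins (perm returns 0 if there are too few).
    unmatched = perm(n2 - lam2, lam1 - k)
    # The n1 - lam1 unannotated G1 proteins fill any of the
    # n2 - lam1 G2 nodes that are still free.
    rest = perm(n2 - lam1, n1 - lam1)
    return matched * unmatched * rest


def exact_p_value(n1, n2, lam1, lam2, k):
    """P(at least k shared pairs) as an exact fraction (tail sum is an
    assumption of this sketch)."""
    total = perm(n2, n1)  # number of all injective alignments
    tail = sum(
        count_alignments_exactly_k(n1, n2, lam1, lam2, j)
        for j in range(max(k, 0), min(lam1, lam2) + 1)
    )
    return Fraction(tail, total)


if __name__ == "__main__":
    # Toy example: 3-node and 4-node networks, 2 annotated proteins each.
    print(exact_p_value(3, 4, 2, 2, 2))
```

Under these assumptions the number of shared pairs follows a hypergeometric distribution, so the result can be cross-checked against a standard hypergeometric tail. Using `Fraction` keeps the value exact rather than a floating-point approximation, which matters for the very small $p$-values that realistic network sizes would produce.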