Towards a Decomposition-Optimal Algorithm for Counting and Sampling Arbitrary Motifs in Sublinear Time

We consider the problem of sampling and approximately counting an arbitrary given motif $H$ in a graph $G$, where access to $G$ is given via degree, neighbor, and pair queries, as well as uniform edge sample queries. Previous algorithms for these tasks were based on a decomposition of $H$ into a collection of odd cycles and stars, denoted $\mathcal{D}^*(H)=\{O_{k_1}, \ldots, O_{k_q}, S_{p_1}, \ldots, S_{p_\ell}\}$. These algorithms were shown to be optimal when $H$ is a clique or an odd-length cycle, but no other lower bounds were known. We present a new algorithm for sampling and approximately counting arbitrary motifs which, up to $\textrm{poly}(\log n)$ factors, is always at least as good as previous results, and for most graphs $G$ is strictly better. The main ingredient leading to this improvement is an improved uniform algorithm for sampling stars, which might be of independent interest, as it allows sampling vertices according to the $p$-th moment of the degree distribution. Finally, we prove that this algorithm is \emph{decomposition-optimal} for decompositions that contain at least one odd cycle. These are the first lower bounds for motifs $H$ with a nontrivial decomposition, i.e., motifs that have more than a single component in their decomposition.
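To make the query model concrete, the following is a minimal sketch of an oracle offering the four query types named above (degree, neighbor, pair, and uniform edge sample). The class `GraphOracle` and all of its method names are hypothetical and chosen only for illustration. The last method is not the star-sampling algorithm of this work; it shows only the standard first-moment case, where picking a random endpoint of a uniformly sampled edge returns vertex $v$ with probability $d(v)/2m$.

```python
import random


class GraphOracle:
    """Hypothetical in-memory oracle for the four query types (illustration only)."""

    def __init__(self, edges, seed=None):
        self._adj = {}      # vertex -> list of neighbors
        self._edges = []    # list of undirected edges (u, v)
        for u, v in edges:
            self._adj.setdefault(u, []).append(v)
            self._adj.setdefault(v, []).append(u)
            self._edges.append((u, v))
        self._rng = random.Random(seed)

    def degree(self, v):
        """Degree query: number of neighbors of v."""
        return len(self._adj[v])

    def neighbor(self, v, i):
        """Neighbor query: the i-th neighbor of v (0-indexed here)."""
        return self._adj[v][i]

    def pair(self, u, v):
        """Pair query: whether {u, v} is an edge."""
        return v in self._adj[u]

    def sample_edge(self):
        """Uniform edge sample query."""
        return self._rng.choice(self._edges)

    def sample_vertex_by_degree(self):
        """Return a vertex v with probability d(v) / 2m.

        Standard trick, assumed here for illustration: take a uniform edge
        and return one of its two endpoints with equal probability.
        """
        u, v = self.sample_edge()
        return u if self._rng.random() < 0.5 else v


# Usage: a star with center 0 and leaves 1..4, so m = 4 and d(0) = 4.
oracle = GraphOracle([(0, 1), (0, 2), (0, 3), (0, 4)], seed=0)
samples = [oracle.sample_vertex_by_degree() for _ in range(2000)]
center_hits = samples.count(0)  # expected to be close to 2000 * 4/8 = 1000
```

Sampling proportionally to $d(v)^p$ for larger $p$ is the harder task addressed by the star-sampling algorithm; the sketch above does not attempt it.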
