Given a hypergraph, influence maximization (IM) is to discover a seed set containing $k$ vertices that have the maximal influence. Although the existing vertex-based IM algorithms perform better than the hyperedge-based algorithms by generating random reverse researchable (RR) sets, they are inefficient because (i) they ignore important structural information associated with hyperedges and thus obtain inferior results, (ii) the frequently-used sampling methods for generating RR sets have low efficiency because of a large number of required samplings along with high sampling variances, and (iii) the vertex-based IM algorithms have large overheads in terms of running time and memory costs. To overcome these shortcomings, this paper proposes a novel approach, called \emph{HyperIM}. The key idea behind \emph{HyperIM} is to differentiate structural information of vertices for developing stratified sampling combined with highly-efficient strategies to generate the RR sets. With theoretical guarantees, \emph{HyperIM} is able to accelerate the influence spread, improve the sampling efficiency, and cut down the expected running time. To further reduce the running time and memory costs, we optimize \emph{HyperIM} by inferring the bound of the required number of RR sets in conjunction with stratified sampling. Experimental results on real-world hypergraphs show that \emph{HyperIM} is able to reduce the number of required RR sets and running time by orders of magnitude while increasing the influence spread by up to $2.73X$ on average, compared to the state-of-the-art IM algorithms.
翻译:暂无翻译