常规路径查询光谱值的复杂性 (The Complexity of the Shapley Value for Regular Path Queries)

A path query extracts vertex tuples from a labeled graph, based on the words that are formed by the paths connecting the vertices. We study the computational complexity of measuring the contribution of edges and vertices to an answer to a path query, focusing on the class of conjunctive regular path queries. To measure this contribution, we adopt the traditional Shapley value from cooperative game theory. This value has been recently proposed and studied in the context of relational database queries and has uses in a plethora of other domains. We first study the contribution of edges and show that the exact Shapley value is almost always hard to compute. Specifically, it is #P-hard to calculate the contribution of an edge whenever at least one (non-redundant) conjunct allows for a word of length three or more. In the case of regular path queries (i.e., no conjunction), the problem is tractable if the query has only words of length at most two; hence, this property fully characterizes the tractability of the problem. On the other hand, if we allow for an approximation error, then it is straightforward to obtain an efficient scheme (FPRAS) for an additive approximation. Yet, a multiplicative approximation is harder to obtain. We establish that in the case of conjunctive regular path queries, a multiplicative approximation of the Shapley value of an edge can be computed in polynomial time if and only if all query atoms are finite languages (assuming non-redundancy and conventional complexity limitations). We also study the analogous situation where we wish to determine the contribution of a vertex, rather than an edge, and establish complexity results of similar nature.

翻译：路径查询从标签图中提取了来自标签图的顶点图。以连接顶端的路径所形成的单词为基础。我们研究测量边缘和顶点对路径查询的答案的贡献的计算复杂性, 重点是相交常规路径查询的类别。为了测量这一贡献, 我们采用了合作游戏理论中传统的沙普利值。这个数值是最近根据关系数据库查询而提议和研究的, 并用于过多的其他领域。我们首先研究边缘的贡献, 并显示精确的变色值几乎总是难以计算。具体地说, 当至少一个( 非编辑的) 相交连接路径允许一个长度为三或以上的词时, 计算边缘贡献的计算难度很难。在常规查询( 即, 没有连接) 常规查询中, 如果查询只有两个词的长度, 并且只在其它域中使用。因此, 此属性能充分描述问题的可调控性。另一方面, 如果我们允许一个更接近的变异性变异性变异的变异性变异性选择方案, 则直径直径直地计算出一个更精确的变的变的变的变更精确方案。