The Shapley value (SV) has emerged as a promising method for data valuation. However, computing or estimating the SV is often computationally expensive. To overcome this challenge, Jia et al. (2019) propose an advanced SV estimation algorithm called ``Group Testing-based SV estimator'' which achieves favorable asymptotic sample complexity. In this technical note, we present several improvements in the analysis and design choices of this SV estimator. Moreover, we point out that the Group Testing-based SV estimator does not fully reuse the collected samples. Our analysis and insights contribute to a better understanding of the challenges in developing efficient SV estimation algorithms for data valuation.