We initiate a study of a new model of property testing that is a hybrid of testing properties of distributions and testing properties of strings. Specifically, the new model refers to testing properties of distributions, but these are distributions over huge objects (i.e., very long strings). Accordingly, the model accounts for the total number of local probes into these objects (resp., queries to the strings) as well as for the distance between objects (resp., strings), and the distance between distributions is defined as the earth mover's distance with respect to the relative Hamming distance between strings. We study the query complexity of testing in this new model, focusing on three directions. First, we try to relate the query complexity of testing properties in the new model to the sample complexity of testing these properties in the standard distribution testing model. Second, we consider the complexity of testing properties that arise naturally in the new model (e.g., distributions that capture random variations of fixed strings). Third, we consider the complexity of testing properties that were extensively studied in the standard distribution testing model: Two such cases are uniform distributions and pairs of identical distributions.
翻译:我们开始研究一个新的属性测试模型,该模型是分布特性测试和字符串测试特性的混合体。具体地说,新模型是指分布特性的测试,但这些是巨大物体(即非常长的字符串)的分布特性。因此,模型计算了在这些物体中进行局部探测的总数(重复、查询字符串),以及物体之间的距离(重复、字符串),分布之间的距离被定义为地球移动者相对于字符串之间的相对宽长距离的距离。我们研究了这一新模型中测试的查询复杂性,重点是三个方向。首先,我们试图将新模型中测试特性的查询复杂性与标准分布测试模型中测试这些特性的样本复杂性联系起来。第二,我们考虑了在新模型中自然产生的测试特性的复杂性(例如显示固定字符串随机变化的分布)。第三,我们考虑了标准分布测试模型中广泛研究的测试特性的复杂性:两个此类案例是相同分布的统一分布和配对。