Spatial clustering detection methods are widely used in many fields of research, including epidemiology, ecology, biology, physics, and sociology. In these fields, areal data are often of interest; such data may result from spatial aggregation (e.g. the number of disease cases in a county) or may be inherent attributes of the areal unit as a whole (e.g. the habitat suitability of a conserved land parcel). This study assesses the performance of two spatial clustering detection methods on areal data: the average nearest neighbor (ANN) ratio and Ripley's K function. These methods are designed for point process data, but their ease of implementation in GIS software and the lack of analogous methods for areal data have contributed to their use for areal data. Despite the popularity of this practice, little research has explored the properties of these methods in the areal data context. In this paper we conduct a simulation study to evaluate the performance of each method for areal data under different types of spatial dependence and different areal structures. The results show that the empirical type I error rates are inflated for both the ANN ratio and Ripley's K function, rendering the methods unreliable for areal data.