The modifiable areal unit problem in geography, known in statistics as the change-of-support (COS) problem, demonstrates that the interpretation of spatial (or spatio-temporal) data analysis is affected by the choice of resolution or geographical units used in the study. The ecological fallacy is one famous example of this phenomenon. Here we investigate the ecological fallacy associated with the COS problem for multivariate spatial data, with the goal of providing a data-driven discretization criterion for the domain of interest that minimizes aggregation errors. The discretization is based on a novel multiscale metric, called the Multivariate Criterion for Aggregation Error (MVCAGE). Such multiscale representations of an underlying multivariate process are often formulated in terms of basis expansions. We show that a particularly useful basis expansion in this context is the multivariate Karhunen-Loève expansion (MKLE). We use the MKLE to build the MVCAGE loss function and apply it within the framework of spatial clustering algorithms to perform optimal spatial aggregation. We demonstrate the effectiveness of our approach through simulation, through regionalization of county-level income and hospital quality data over the United States, and through prediction of ocean color in the coastal Gulf of Alaska.