The goal of survey design is often to minimize the errors associated with inference: the total of bias and variance. Random surveys are common because they allow the use of theoretically unbiased estimators. In practice, however, such design-based approaches are often unable to account for logistical or budgetary constraints, and so they may yield samples that are logistically inefficient or infeasible to implement. Various balancing and optimal sampling techniques have been proposed to improve the statistical efficiency of such designs, but few models have attempted to explicitly incorporate logistical and financial constraints. We introduce a mixed-integer linear program (MILP) for optimal sampling design, capable of capturing a variety of constraints and a wide class of Bayesian regression models. We demonstrate the use of our model on three spatial sampling problems of increasing complexity, including the real logistics of the US Forest Service Forest Inventory and Analysis survey of Tanana, Alaska. Our methodological contribution to survey design is significant because the proposed modeling framework makes it possible to generate high-quality sampling designs and inferences while satisfying practical constraints defined by the user. The technical novelty of the method is the explicit integration of Bayesian statistical models into combinatorial optimization. This integration might allow a paradigm shift in spatial sampling under constrained budgets or logistics.