Conditional estimation given specific covariate values (i.e., local conditional estimation or functional estimation) is widely useful, with applications in engineering and in the social and natural sciences. Existing data-driven non-parametric estimators mostly focus on structured homogeneous data (e.g., weakly independent and stationary data), so they are sensitive to adversarial noise and may perform poorly when the sample size is small. To alleviate these issues, we propose a new distributionally robust estimator that generates non-parametric local estimates by minimizing the worst-case conditional expected loss over all adversarial distributions in a Wasserstein ambiguity set. We show that, although the resulting estimation problem is generally intractable, the local estimator can be found efficiently via convex optimization under broadly applicable settings, and that it is robust to corruption and heterogeneity of the data. Experiments with synthetic and MNIST datasets show the competitive performance of this new class of estimators.
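The worst-case conditional expected loss described above can be written schematically as a min-max problem. This is only a sketch: the notation is assumed here for illustration and does not appear in the abstract. Here $\beta$ is the local estimate, $\ell$ a loss function, $x_0$ the covariate value of interest, $\mathcal{N}(x_0)$ a neighborhood of $x_0$, $\widehat{\mathbb{P}}_n$ the empirical distribution of the $n$ samples, $W$ a Wasserstein distance, and $\rho \ge 0$ the radius of the ambiguity set.

```latex
% Schematic form only; all symbols are illustrative assumptions (requires amsmath for \substack).
\min_{\beta} \;
\sup_{\substack{\mathbb{Q} \,:\, W(\mathbb{Q}, \widehat{\mathbb{P}}_n) \le \rho \\
                \mathbb{Q}(X \in \mathcal{N}(x_0)) > 0}}
\mathbb{E}_{\mathbb{Q}}\big[\, \ell(Y, \beta) \;\big|\; X \in \mathcal{N}(x_0) \,\big]
```

Under this reading, setting $\rho = 0$ leaves only the empirical distribution in the ambiguity set, so the objective reduces to the average loss over the samples whose covariates fall in $\mathcal{N}(x_0)$, provided at least one sample does. Larger values of $\rho$ hedge against distributions increasingly far from the observed data.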