The simultaneous estimation of many parameters based on data collected from corresponding studies is a key research problem that has received renewed attention in the high-dimensional setting. Many practical situations involve heterogeneous data where heterogeneity is captured by a nuisance parameter. Effectively pooling information across samples while correctly accounting for heterogeneity presents a significant challenge in large-scale estimation problems. We address this issue by introducing the ``Nonparametric Empirical Bayes Structural Tweedie'' (NEST) estimator, which efficiently estimates the unknown effect sizes and properly adjusts for heterogeneity via a generalized version of Tweedie's formula. For the normal means problem, NEST simultaneously handles the two main selection biases introduced by heterogeneity: the selection bias in the mean and the selection bias in the variance. The former cannot be corrected effectively unless the latter is corrected as well. We develop theory to show that NEST is asymptotically as good as the optimal Bayes rule that uniquely minimizes a weighted squared error loss. In our simulation studies, NEST outperforms competing methods, with substantial efficiency gains in many settings. The proposed method is demonstrated on estimating the batting averages of baseball players and the Sharpe ratios of mutual fund returns. Extensions to other members of the two-parameter exponential family are discussed.
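For orientation, the classical Tweedie's formula that NEST generalizes can be written as follows. This is a sketch of the standard homoscedastic result with known variance, not the generalized formula of the paper; the symbols $X$, $\mu$, $\sigma^2$, $G$, and $f$ are notation assumed here for illustration.

```latex
% Assumed setup: X | \mu ~ N(\mu, \sigma^2) with \sigma^2 known, and \mu ~ G for an unspecified prior G.
% Marginal density of X:
%   f(x) = \int \frac{1}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right) dG(\mu)
\[
  \mathbb{E}[\mu \mid X = x]
  \;=\; x + \sigma^{2}\,\frac{d}{dx}\log f(x)
  \;=\; x + \sigma^{2}\,\frac{f'(x)}{f(x)} .
\]
```

The correction term depends on the prior only through the marginal density $f$, which can be estimated nonparametrically from the observations. When the variance differs across units and must itself be estimated, this one-dimensional formula no longer applies directly, which is the gap the generalized version addresses.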