We consider the problem of providing valid inference for a selected parameter in a sparse regression setting. It is well known that classical regression tools can be unreliable in this context due to the bias generated in the selection step. Many approaches have been proposed in recent years to ensure inferential validity. Here, we consider a simple alternative to data splitting based on randomising the response vector, which allows for higher selection and inferential power than the former and is applicable with an arbitrary selection rule. We provide a theoretical and empirical comparison of both methods and derive a Central Limit Theorem for the randomisation approach. Our investigations show that the gain in power can be substantial.
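To illustrate how randomising the response can replace data splitting, here is a minimal sketch. The abstract does not spell out the scheme, so everything below is an assumption: Gaussian noise with known standard deviation `sigma`, an additive randomisation `u = y + gamma * w`, `v = y - w / gamma` with independent noise `w`, and a hypothetical top-k correlation rule standing in for the arbitrary selection rule. The names `randomise_response`, `select_top_k`, and `ols_intervals` are illustrative only and are not taken from the paper.

```python
import numpy as np
from statistics import NormalDist


def randomise_response(y, sigma, gamma, rng):
    """Split y into a selection copy u and an inference copy v.

    Assumes y ~ N(mu, sigma^2 I) with sigma known. Then
    Cov(u, v) = sigma^2 I - (gamma / gamma) * sigma^2 I = 0,
    so u and v are independent under Gaussian noise.
    """
    w = rng.normal(0.0, sigma, size=y.shape)
    u = y + gamma * w   # variance sigma^2 * (1 + gamma^2)
    v = y - w / gamma   # variance sigma^2 * (1 + 1 / gamma^2)
    return u, v


def select_top_k(X, u, k):
    """Hypothetical selection rule: the k columns most correlated with u."""
    scores = np.abs(X.T @ (u - u.mean()))
    return np.sort(np.argsort(scores)[-k:])


def ols_intervals(X, v, selected, sigma, gamma, level=0.95):
    """Wald intervals for the selected-model coefficients, computed from v only."""
    Xs = X[:, selected]
    gram_inv = np.linalg.inv(Xs.T @ Xs)
    beta_hat = gram_inv @ Xs.T @ v
    sd_v = sigma * np.sqrt(1.0 + gamma ** -2)   # inflated noise level of v
    se = sd_v * np.sqrt(np.diag(gram_inv))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return np.column_stack([beta_hat - z * se, beta_hat + z * se])


# Simulated sparse regression: 3 active coefficients out of 50.
rng = np.random.default_rng(0)
n, p, sigma, gamma = 200, 50, 1.0, 1.0
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = 1.0
y = X @ beta + rng.normal(0.0, sigma, size=n)

u, v = randomise_response(y, sigma, gamma, rng)   # u selects, v infers
selected = select_top_k(X, u, k=5)
intervals = ols_intervals(X, v, selected, sigma, gamma)
```

In this sketch every observation contributes to both selection and inference, whereas data splitting assigns each observation to only one of the two stages. The parameter `gamma` trades information between the stages: larger values add more noise to `u` and leave `v` closer to `y`. The intervals target the coefficients of the selected model's least-squares projection, and their validity here rests on the Gaussian and known-variance assumptions stated above.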