We study the problem of identifying the set of \emph{active} variables, known in the literature as \emph{variable selection} or \emph{multiple hypothesis testing}, depending on the criteria pursued. For a general \emph{robust setting} of non-normal, possibly dependent observations and a generalized notion of \emph{active set}, we propose a procedure that serves both tasks, variable selection and multiple testing, simultaneously. The procedure is based on the \emph{risk hull minimization} method, but can also be obtained as a result of an empirical Bayes approach or a penalization strategy. We assess its quality via various criteria: the Hamming risk, FDR, FPR, FWER, NDR, FNR, and various \emph{multiple testing risks}, e.g., MTR=FDR+NDR; and we discuss a weak optimality of our results. Finally, we introduce and study, for the first time, the \emph{uncertainty quantification problem} in the variable selection and multiple testing context in our robust setting.
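To make some of the criteria above concrete, the following is a minimal sketch of how their per-sample counterparts might be computed for a single selected set against a known active set. It assumes common textbook conventions (false discoveries divided by the number of selections, missed active variables divided by the size of the active set), which may differ from the exact definitions used in the paper; the function name `selection_errors` and the dictionary keys are hypothetical. The risks themselves are expectations, so averaging these quantities over repeated samples would give Monte Carlo estimates.

```python
from typing import Dict, Set


def selection_errors(selected: Set[int], active: Set[int], n: int) -> Dict[str, float]:
    """Per-sample error quantities for one selected set (assumed conventions).

    selected: indices declared active by some procedure
    active:   indices that are truly active
    n:        total number of variables
    """
    false_pos = len(selected - active)  # selected but not active
    false_neg = len(active - selected)  # active but not selected
    n_inactive = n - len(active)

    # Denominators are floored at 1 so that empty sets give a proportion of 0.
    fdp = false_pos / max(len(selected), 1)  # averages to an FDR estimate
    ndp = false_neg / max(len(active), 1)    # averages to an NDR estimate
    fpp = false_pos / max(n_inactive, 1)     # averages to an FPR estimate

    return {
        "hamming": float(false_pos + false_neg),   # averages to the Hamming risk
        "fdp": fdp,
        "ndp": ndp,
        "fpp": fpp,
        "any_false_pos": float(false_pos > 0),     # averages to an FWER estimate
        "mtp": fdp + ndp,                          # averages to MTR = FDR + NDR
    }


if __name__ == "__main__":
    # Toy example: 10 variables, 4 truly active, 3 selected (one of them wrongly).
    print(selection_errors(selected={0, 1, 4}, active={0, 1, 2, 3}, n=10))
```

In the toy call, one selected variable is inactive and two active variables are missed, so the Hamming loss is 3 and the combined quantity `mtp` adds the two proportions, mirroring MTR=FDR+NDR.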