Supersaturated designs, in which the number of factors exceeds the number of runs, are often constructed under a heuristic criterion that measures a design's proximity to an unattainable orthogonal design. Such a criterion does not directly measure a design's quality in terms of screening. To address this disconnect, we develop optimality criteria that maximize the lasso's sign recovery probability. The criteria assume varying amounts of prior knowledge about the model's parameters. We show that an orthogonal design is an ideal structure when the signs of the active factors are unknown. When the signs are assumed known, we show that a design whose columns exhibit small, positive correlations is ideal. Such designs are sought by the Var(s+)-criterion. These conclusions are based on a continuous optimization framework, which rigorously justifies the use of established heuristic criteria. From this justification, we propose a computationally efficient design search algorithm that filters through optimal designs under different heuristic criteria to select the one that maximizes the sign recovery probability under the lasso.
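As a rough illustration of what the heuristic criteria in the abstract look at, the sketch below summarizes the off-diagonal entries of X^T X for a two-level design matrix X. It is a minimal sketch under stated assumptions, not the paper's definitions: the function name `offdiag_summary` is hypothetical, no intercept column is appended, and the three summaries (mean, mean square, and variance of the column inner products) are only in the spirit of criteria such as Var(s+). A mean square near zero indicates near-orthogonality, while a small positive mean with small variance corresponds to the "small, positive correlations" structure described for the known-sign case.

```python
import numpy as np


def offdiag_summary(X):
    """Summarize pairwise column inner products of a design matrix.

    Hypothetical helper for illustration only: it does not add an
    intercept column and is not the paper's exact criterion.
    """
    X = np.asarray(X, dtype=float)
    S = X.T @ X                              # Gram matrix of the columns
    rows, cols = np.triu_indices(S.shape[0], k=1)
    s = S[rows, cols]                        # one entry per pair of columns
    mean_s = s.mean()
    mean_s2 = (s ** 2).mean()
    return {
        "mean_s": mean_s,                    # average inner product
        "mean_s2": mean_s2,                  # average squared inner product
        "var_s": mean_s2 - mean_s ** 2,      # variance of the inner products
    }


# A 4-run orthogonal design: every off-diagonal inner product is zero.
orthogonal = np.array([
    [1,  1,  1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
])

# A 4-run design whose columns are non-negatively correlated.
positive = np.array([
    [1,  1,  1],
    [1,  1,  1],
    [1,  1, -1],
    [1, -1, -1],
])

print(offdiag_summary(orthogonal))
print(offdiag_summary(positive))
```

Comparing the two printed summaries shows how the same three numbers distinguish an orthogonal structure from one with non-negative column correlations; a design search would compare many candidate designs on such summaries before evaluating sign recovery directly.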