This paper presents a subsampling-task paradigm for data-driven, task-specific experiment design (ED) and a novel method for population-wide supervised feature selection (FS). Optimal ED, the choice of sampling points under constraints of limited acquisition time, arises in a wide variety of scientific and engineering contexts. However, the continuous optimization used in classical approaches depends on a priori parameter choices and must contend with challenging non-convex optimization landscapes. This paper proposes to replace this strategy with a subsampling-task paradigm, analogous to population-wide supervised FS. In particular, we introduce JOFSTO, which performs JOint Feature Selection and Task Optimization. JOFSTO jointly optimizes two coupled networks: one for feature scoring, which provides the ED, and one for execution of a downstream task or process. Unlike most FS problems, e.g. selecting protein expressions for classification, ED problems typically select from highly correlated, globally informative candidates rather than seeking a small number of highly informative features among many uninformative ones. JOFSTO's construction efficiently identifies potentially correlated but effective subsets and returns a trained task network. We demonstrate the approach on parameter estimation and mapping problems in clinically relevant applications in quantitative MRI and in hyperspectral imaging. Results from simulations and empirical data show that the subsampling-task paradigm strongly outperforms classical ED and that, within our paradigm, JOFSTO outperforms state-of-the-art supervised FS techniques. JOFSTO extends immediately to wider image-based ED problems and other scenarios where the design must be specified globally across large numbers of acquisitions. Code will be released.
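To make the subsampling-task idea concrete, the following is a minimal sketch, not the JOFSTO algorithm itself. It assumes that each column of `X` is one candidate sampling point, that each row is one sample (for example, a voxel), and that a single score vector shared by all samples plays the role of the feature-scoring network. A plain linear model stands in for the task network, and the function name `joint_select_and_fit`, the sigmoid gating, the penalties, and all hyperparameters are hypothetical choices made for illustration only.

```python
import numpy as np


def joint_select_and_fit(X, y, k, steps=2000, lr=0.05,
                         sparsity=0.01, decay=0.01, seed=0):
    """Jointly learn per-feature scores and a linear task model, then keep k features.

    Illustrative stand-in only: a linear model replaces the task network and a
    single score vector replaces the scoring network described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    s = np.zeros(d)                      # one score per candidate, shared by all samples
    w = 0.01 * rng.standard_normal(d)    # task-model weights
    for _ in range(steps):
        g = 1.0 / (1.0 + np.exp(-s))     # soft gate in (0, 1) derived from the scores
        resid = (X * g) @ w - y          # task error on the softly subsampled input
        grad_gw = (X.T @ resid) * (2.0 / n)          # dMSE / d(g * w)
        grad_w = grad_gw * g + 2.0 * decay * w       # task-model gradient with weight decay
        grad_s = (grad_gw * w + sparsity) * g * (1.0 - g)  # score gradient with gate penalty
        w -= lr * grad_w                 # both parts are updated in the same step,
        s -= lr * grad_s                 # so scores and task model are optimized jointly
    selected = np.sort(np.argsort(-s)[:k])           # the "design": k highest-scoring candidates
    # Refit the task model on the selected subset only.
    w_sub, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    mse = float(np.mean((X[:, selected] @ w_sub - y) ** 2))
    return selected, s, mse


# Synthetic usage example (made-up data, not from the paper).
rng = np.random.default_rng(0)
n, d, k = 500, 8, 3
X = rng.standard_normal((n, d))
w_true = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ w_true + 0.1 * rng.standard_normal(n)

selected, scores, mse = joint_select_and_fit(X, y, k)
print("selected candidates:", selected)
print("task MSE on selected subset:", mse)
```

The point of the sketch is the coupling: the scores are updated using the gradient of the task loss, so a candidate is ranked by how useful it is to the downstream task rather than by a task-independent criterion. The selected index set is the same for every sample, which corresponds to the population-wide setting in which one design is fixed across all acquisitions.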