This paper presents a subsampling-task paradigm for data-driven task-specific experiment design (ED) and a novel method for population-wide supervised feature selection (FS). Optimal ED, the choice of sampling points under constraints of limited acquisition time, arises in a wide variety of scientific and engineering contexts. However, the continuous optimization used in classical approaches depends on a priori parameter choices and must navigate challenging non-convex optimization landscapes. This paper proposes to replace this strategy with a subsampling-task paradigm, analogous to population-wide supervised FS. In particular, we introduce JOFSTO, which performs JOint Feature Selection and Task Optimization. JOFSTO jointly optimizes two coupled networks: one for feature scoring, which provides the ED, and the other for execution of a downstream task or process. Unlike most FS problems, e.g., selecting protein expressions for classification, ED problems typically select from highly correlated, globally informative candidates rather than seeking a small number of highly informative features among many uninformative features. JOFSTO's construction efficiently identifies potentially correlated but effective subsets and returns a trained task network. We demonstrate the approach using parameter estimation and mapping problems in quantitative MRI, where economical ED is crucial for clinical application. Results from simulations and empirical data show that the subsampling-task paradigm strongly outperforms classical ED and that, within our paradigm, JOFSTO outperforms state-of-the-art supervised FS techniques. JOFSTO extends immediately to wider image-based ED problems and other scenarios where the design must be specified globally across large numbers of acquisitions. Code will be released.
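As a rough illustration of the coupled scoring-plus-task idea described above, the following minimal sketch jointly trains a per-feature score and a task model, then keeps the top-scoring features as the design. It is not the JOFSTO implementation: the function name `joint_select`, the sigmoid gate, the single score vector standing in for a scoring network, the linear model standing in for a task network, and the plain gradient-descent loop are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def joint_select(X, y, k, epochs=200, lr=0.05, seed=0):
    """Jointly learn feature scores and a linear task model; return the top-k features.

    Hypothetical sketch: `scores` stands in for a scoring network and `w`
    for a task network. Both are updated from the same squared-error loss.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    scores = [0.0] * d                                # one learnable score per candidate feature
    w = [rng.uniform(-0.1, 0.1) for _ in range(d)]    # task-model weights
    for _ in range(epochs):
        gate = [sigmoid(s) for s in scores]           # soft selection in (0, 1)
        g_w = [0.0] * d
        g_s = [0.0] * d
        for xi, yi in zip(X, y):
            pred = sum(gate[j] * xi[j] * w[j] for j in range(d))
            err = pred - yi
            for j in range(d):
                # Gradients of the mean squared error w.r.t. w[j] and scores[j].
                g_w[j] += 2.0 * err * gate[j] * xi[j] / n
                g_s[j] += 2.0 * err * w[j] * xi[j] * gate[j] * (1.0 - gate[j]) / n
        # Update both parameter sets in the same step (joint optimization).
        w = [w[j] - lr * g_w[j] for j in range(d)]
        scores = [scores[j] - lr * g_s[j] for j in range(d)]
    ranked = sorted(range(d), key=lambda j: scores[j], reverse=True)
    selected = sorted(ranked[:k])                     # indices of the chosen subset (the "design")
    return selected, scores, w
```

Here `X` is a list of samples, each a list of candidate measurements, and `y` holds the task targets. The returned `selected` indices play the role of the experiment design, and `w` that of the trained task model. A practical version would presumably replace both parts with neural networks and would need a selection mechanism that copes with highly correlated candidates, which this sketch does not address.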