Inspired by the recent achievements of machine learning in diverse domains, data-driven metamaterials design has emerged as a compelling paradigm that can unlock the potential of multiscale architectures. The model-centric research trend, however, lacks principled frameworks dedicated to data acquisition, whose quality propagates into the downstream tasks. Often built by naive space-filling design in shape-descriptor space, metamaterial datasets suffer from property distributions that are either highly imbalanced or at odds with the design tasks of interest. To this end, we present t-METASET: an active-learning-based data acquisition framework aiming to guide both diverse and task-aware data generation. Distinctly, we seek a solution to a commonplace yet frequently overlooked scenario at early stages of data-driven design of metamaterials: when a massive (~O(10^4)) shape-only library has been prepared with no properties evaluated. The key idea is to harness a data-driven shape descriptor learned from generative models, fit a sparse regressor as a start-up agent, and leverage diversity-related metrics to drive data acquisition toward areas that help designers fulfill design goals. We validate the proposed framework in three deployment cases, which encompass general use, task-specific use, and tailorable use. Two large-scale mechanical metamaterial datasets are used to demonstrate its efficacy. Applicable to general image-based design representations, t-METASET could boost future advancements in data-driven design.
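To make the diversity-driven part of the acquisition idea concrete, the following is a minimal sketch, not the t-METASET implementation. It assumes each shape in the library is already encoded as a fixed-length descriptor vector (random vectors stand in for a learned descriptor here) and uses greedy farthest-point selection as a simple stand-in for a diversity metric; the abstract does not specify the metric actually used. The function name `greedy_diverse_batch` and all parameters are hypothetical, and the sparse regressor and task-aware weighting are omitted.

```python
import numpy as np


def greedy_diverse_batch(descriptors, labeled_idx, batch_size):
    """Greedily pick unlabeled shapes that lie far from all shapes chosen so far.

    descriptors : (n, d) array of shape descriptors (assumed precomputed).
    labeled_idx : indices of shapes whose properties are already evaluated.
    batch_size  : number of new shapes to select for evaluation.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    labeled = np.asarray(labeled_idx, dtype=int)
    n = descriptors.shape[0]
    if batch_size > n - labeled.size:
        raise ValueError("batch_size exceeds the number of unlabeled shapes")

    # Distance from every shape to its nearest already-labeled shape.
    if labeled.size > 0:
        diffs = descriptors[:, None, :] - descriptors[labeled][None, :, :]
        min_dist = np.linalg.norm(diffs, axis=2).min(axis=1)
        min_dist[labeled] = -np.inf  # never re-select labeled shapes
    else:
        min_dist = np.full(n, np.inf)

    chosen = []
    for _ in range(batch_size):
        nxt = int(np.argmax(min_dist))  # farthest from the current selection
        chosen.append(nxt)
        d = np.linalg.norm(descriptors - descriptors[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)
        min_dist[nxt] = -np.inf  # exclude the shape just picked
    return chosen


# Toy usage: 200 random 8-dimensional descriptors, 5 already evaluated.
rng = np.random.default_rng(0)
library = rng.normal(size=(200, 8))
batch = greedy_diverse_batch(library, labeled_idx=[0, 1, 2, 3, 4], batch_size=10)
print(batch)
```

In an active-learning loop, the selected batch would be evaluated, added to the labeled set, and the selection repeated; a task-aware variant would additionally score candidates by predicted properties, which this sketch leaves out.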