A major goal of materials design is to find material structures with desired properties and, in a second step, to find a processing path that reaches one of these structures. In this paper, we propose and investigate a deep reinforcement learning approach for optimizing processing paths. The goal is to find optimal processing paths in the material structure space that lead to target structures, which have been identified beforehand as yielding the desired material properties. Because the relation between properties and structures is generally non-unique, a whole set of target structures that lead to the desired properties can typically be identified. Our proposed method optimizes processing paths from a start structure to any one of these equivalent target structures. The algorithm learns to find near-optimal paths by interacting with the structure-generating process. It is guided by structure descriptors, which serve as process state features, and by a reward signal formulated in terms of a distance function in the structure space. Being model-free, the reinforcement learning algorithm learns through trial and error while interacting with the process and does not rely on processing data sampled a priori. We instantiate and evaluate the proposed method by optimizing paths of a generic metal forming process.
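To illustrate how a reward can be built from a distance in structure space when several target structures are equivalent, the following is a minimal sketch and not the paper's actual formulation. It assumes that structure descriptors are plain real-valued vectors and that the distance is Euclidean; the abstract only states that some distance function is used. The names `nearest_target_distance` and `reward` are hypothetical.

```python
import numpy as np


def nearest_target_distance(state, targets):
    """Distance from a structure descriptor to the closest target structure.

    Assumption: descriptors are real-valued vectors and the distance is
    Euclidean. The abstract does not specify the distance function.
    """
    state = np.asarray(state, dtype=float)      # shape (d,)
    targets = np.asarray(targets, dtype=float)  # shape (n_targets, d)
    # One distance per target row, then keep the smallest.
    return float(np.min(np.linalg.norm(targets - state, axis=1)))


def reward(state, targets):
    """Hypothetical reward: negative distance to the nearest target.

    The reward is highest (zero) when the state coincides with any one of
    the equivalent target structures.
    """
    return -nearest_target_distance(state, targets)


if __name__ == "__main__":
    # Two equivalent target structures in a 2-D descriptor space.
    targets = [[0.0, 0.0], [3.0, 4.0]]
    print(reward([3.0, 0.0], targets))
    print(reward([3.0, 4.0], targets))
```

Taking the minimum over the target set means the agent is not tied to one particular target: any structure in the set counts as a goal, which matches the idea of optimizing toward one of several equivalent target structures.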