Building an AI agent that can design on its own has been a goal since the 1980s. Recently, deep learning has shown the ability to learn from large-scale data, enabling significant advances in data-driven design. However, learning from prior data restricts us to problems that have already been solved and biases the learned behavior toward existing solutions. The ultimate goal for a design agent is the ability to learn generalizable design behavior in a problem space it has never seen before. In this work, we introduce a self-learning agent framework that achieves this goal. The framework integrates a deep policy network with a novel tree search algorithm: the tree search explores the problem space, and the deep policy network leverages the self-generated experience to guide further search. The framework demonstrates, first, the ability to discover high-performing generative strategies without any prior data and, second, zero-shot generalization of those strategies across various unseen boundary conditions. We evaluate the effectiveness and versatility of the framework by solving multiple versions of two engineering design problems without retraining. Overall, this paper presents a methodology to self-learn high-performing and generalizable problem-solving behavior in an arbitrary problem space, circumventing the need for expert data, existing solutions, and problem-specific learning.
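To make the interplay between search and policy concrete, the following is a minimal sketch of a generic policy-guided tree search loop on a toy problem. It is not the algorithm proposed in this paper. Everything in it is an assumption introduced for illustration: the PUCT-style selection rule, the tabular softmax that stands in for a deep policy network, the toy reward, and all names (`TabularPolicy`, `search`, `self_learn`, `greedy_design`, `TARGET`). It also omits generalization across boundary conditions. It shows only the basic cycle, in which the search generates experience without prior data and the policy is trained on that experience to bias later searches.

```python
import math
import random

ACTIONS = (0, 1, 2)      # hypothetical design moves
DEPTH = 4                # number of moves in one design episode
TARGET = (2, 0, 1, 2)    # hidden optimum, used only by the toy reward


def reward(state):
    """Toy performance score in [0, 1]: fraction of moves matching TARGET."""
    return sum(a == t for a, t in zip(state, TARGET)) / DEPTH


class TabularPolicy:
    """Stand-in for a deep policy network: one softmax per depth (assumption)."""

    def __init__(self):
        self.logits = [[0.0] * len(ACTIONS) for _ in range(DEPTH)]

    def probs(self, state):
        row = self.logits[len(state)]
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        return [e / total for e in exps]

    def update(self, state, target_probs, lr=0.5):
        # Cross-entropy gradient step toward the search's visit distribution.
        current = self.probs(state)
        row = self.logits[len(state)]
        for i in range(len(ACTIONS)):
            row[i] += lr * (target_probs[i] - current[i])


def search(root, policy, rng, n_sims=200, c_puct=1.5):
    """Policy-guided tree search; returns visit fractions of the root actions."""
    visits, value_sum = {}, {}   # statistics keyed by (state, action)
    for _ in range(n_sims):
        state, path = root, []
        # Selection: descend with a PUCT-style score until a new edge is reached.
        while len(state) < DEPTH:
            prior = policy.probs(state)
            n_parent = sum(visits.get((state, a), 0) for a in ACTIONS)

            def score(a):
                n = visits.get((state, a), 0)
                q = value_sum.get((state, a), 0.0) / n if n else 0.0
                return q + c_puct * prior[a] * math.sqrt(n_parent + 1) / (1 + n)

            action = max(ACTIONS, key=score)
            path.append((state, action))
            state = state + (action,)
            if visits.get(path[-1], 0) == 0:
                break
        # Rollout: finish the design by sampling moves from the policy.
        while len(state) < DEPTH:
            move = rng.choices(ACTIONS, weights=policy.probs(state))[0]
            state = state + (move,)
        r = reward(state)
        # Backup: credit every edge on the path with the rollout reward.
        for edge in path:
            visits[edge] = visits.get(edge, 0) + 1
            value_sum[edge] = value_sum.get(edge, 0.0) + r
    counts = [visits.get((root, a), 0) for a in ACTIONS]
    total = sum(counts)
    return [c / total for c in counts]


def self_learn(episodes=20, seed=0):
    """Alternate search and policy updates, starting from no prior data."""
    rng = random.Random(seed)
    policy = TabularPolicy()
    for _ in range(episodes):
        state = ()
        while len(state) < DEPTH:
            visit_probs = search(state, policy, rng)
            policy.update(state, visit_probs)   # learn from self-generated data
            best = max(ACTIONS, key=lambda a: visit_probs[a])
            state = state + (best,)
    return policy


def greedy_design(policy):
    """Build a design by following the most probable move at each depth."""
    state = ()
    while len(state) < DEPTH:
        p = policy.probs(state)
        state = state + (max(ACTIONS, key=lambda a: p[a]),)
    return state


if __name__ == "__main__":
    learned = self_learn()
    design = greedy_design(learned)
    print(design, reward(design))
```

The tabular policy here depends only on depth, which suffices for this separable toy reward. A policy that must transfer to unseen boundary conditions, as described above, would need to condition on the full problem state.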