As more and more AI agents are used in practice, it is time to think about how to make these agents fully autonomous, so that they can learn by themselves in a self-motivated and self-supervised manner rather than being retrained periodically, at the initiative of human engineers, on expanded training data. Because the real world is an open environment with unknowns and novelties, the ability to detect novelties, gather ground-truth training data, and incrementally learn the unknowns makes an agent more knowledgeable and powerful over time. The key challenge is to automate this process so that it is carried out on the agent's own initiative and through its own interactions with humans and the environment. Since an AI agent usually has a performance task, it must also characterize each novelty so that it can formulate an appropriate response, adapt its behavior to cope with the novelty, and learn from it to improve its future responses and task performance. This paper proposes a theoretical framework for this learning paradigm to promote research on building self-initiated open world learning agents.