One inherent limitation of current AI systems, stemming from passive learning mechanisms such as supervised learning, is that they perform well on labeled datasets but cannot deduce knowledge on their own. To tackle this problem, we draw inspiration from a highly intentional learning system that learns through action: the toddler. Inspired by the toddler's learning procedure, we design an interactive agent that learns and stores a task-agnostic visual representation while exploring and interacting with objects in a virtual environment. Experimental results show that the learned representation transfers to various vision tasks, namely image classification, object localization, and distance estimation. Specifically, the proposed model achieved 100% and 75.1% accuracy and 1.62% relative error on these tasks, respectively, which is noticeably better than an autoencoder-based model (99.7%, 66.1%, 1.95%) and comparable to supervised models (100%, 87.3%, 0.71%).
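The abstract does not specify the evaluation protocol; a common way to test whether a task-agnostic representation transfers is to freeze the learned encoder and train a small head per downstream task. The sketch below illustrates this frozen-probe setup in PyTorch; `Encoder`, `make_probe`, the layer sizes, and the output dimensions are all illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

# Hypothetical probing sketch. Encoder is a stand-in for the representation
# network learned during exploration; the actual architecture and evaluation
# protocol of the paper are assumptions here.

class Encoder(nn.Module):
    def __init__(self, feature_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feature_dim),
        )

    def forward(self, x):
        return self.net(x)


def make_probe(encoder: nn.Module, feature_dim: int, out_dim: int) -> nn.Module:
    """Freeze the shared encoder and attach a small task-specific head."""
    for p in encoder.parameters():
        p.requires_grad = False  # the representation itself stays task-agnostic
    return nn.Sequential(encoder, nn.Linear(feature_dim, out_dim))


encoder = Encoder(feature_dim=128)
classifier = make_probe(encoder, 128, out_dim=10)  # image classification head
localizer = make_probe(encoder, 128, out_dim=4)    # object localization (box) head
ranger = make_probe(encoder, 128, out_dim=1)       # distance-estimation head

x = torch.randn(8, 3, 64, 64)  # a batch of rendered environment views
print(classifier(x).shape, localizer(x).shape, ranger(x).shape)
# torch.Size([8, 10]) torch.Size([8, 4]) torch.Size([8, 1])
```

Because all three heads share one frozen encoder, differences in downstream accuracy reflect the quality of the exploration-learned representation rather than per-task feature learning.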