Learning general-purpose representations from perceptual inputs is a hallmark of human intelligence. For example, people can write out numbers or characters, or even draw doodles, by characterising these tasks as different instantiations of the same generic underlying process -- compositional arrangements of different forms of pen strokes. Crucially, learning to do one task, say writing, implies reasonable competence at another, say drawing, on account of this shared process. We present Drawing out of Distribution (DooD), a neuro-symbolic generative model of stroke-based drawing that can learn such general-purpose representations. In contrast to prior work, DooD operates directly on images, requires no supervision or expensive test-time inference, and performs unsupervised amortised inference with a symbolic stroke model that better enables both interpretability and generalisation. We evaluate DooD on its ability to generalise across both data and tasks. We first perform zero-shot transfer from one dataset (e.g. MNIST) to another (e.g. Quickdraw), across five different datasets, and show that DooD clearly outperforms different baselines. An analysis of the learnt representations further highlights the benefits of adopting a symbolic stroke model. We then adopt a subset of the Omniglot challenge tasks, and evaluate DooD's ability to generate new exemplars (both unconditionally and conditionally) and to perform one-shot classification, showing that DooD matches the state of the art. Taken together, these results demonstrate that DooD does indeed capture general-purpose representations across both data and tasks, and takes a further step towards building general and robust concept-learning systems.