We propose a novel, scalable end-to-end pipeline that uses symbolic domain knowledge as constraints for learning a neural network to classify unlabeled data in a weakly supervised manner. Our approach is particularly well suited to settings where the data consists of distinct groups (classes) that lend themselves to clustering-friendly representation learning, and where the domain constraints can be reformulated for efficient mathematical optimization by considering multiple training examples at once. We evaluate our approach on a variant of the MNIST image classification problem in which each training example consists of a sequence of images together with the sum of the numbers they represent, and show that our approach scales significantly better than previous approaches, which rely on computing all constraint-satisfying combinations for each training example.
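To make the task and the scaling claim concrete, here is a minimal sketch of the weakly supervised MNIST-sum setting described above. All names are hypothetical illustrations, not the paper's actual code: each training example pairs a sequence of digit images with only the sum of the digits they depict, and the recursive count shows why enumerating all constraint-satisfying label combinations per example grows quickly with sequence length.

```python
import random

def make_example(digits_per_example=2):
    # Hypothetical data generator for the weak-supervision setup: only the
    # sum of the digits is observed, never the per-image labels.
    digits = [random.randint(0, 9) for _ in range(digits_per_example)]
    images = [f"img_of_{d}" for d in digits]  # placeholder for unlabeled image tensors
    weak_label = sum(digits)                  # the only supervision signal
    return images, weak_label

def satisfying_assignments(total, length):
    # Number of digit sequences of the given length whose digits sum to `total`.
    # Prior approaches mentioned in the abstract enumerate all such
    # combinations per example, which blows up as `length` grows.
    if length == 1:
        return 1 if 0 <= total <= 9 else 0
    return sum(satisfying_assignments(total - d, length - 1) for d in range(10))
```

For instance, a two-image example with sum 9 already has 10 consistent digit assignments, and the count grows rapidly with longer sequences, which is the bottleneck the proposed optimization-based reformulation avoids.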