Recent advances in deep learning have shown promising results in domains such as computer vision and natural language processing. The success of deep neural networks in supervised learning relies heavily on large amounts of labeled data. However, obtaining data with target labels is often difficult because of labeling cost and privacy concerns, which poses a challenge for existing deep models. By contrast, it is relatively easy to obtain data with \textit{inexact supervision}, i.e., labels or tags that are related to the target task but are not the target labels themselves. For example, social media platforms host billions of posts and images with user-defined tags, which are not the exact labels for a target classification task but are usually related to them. These tags, together with their relations to the target classes, could be leveraged to generate labeled data for downstream classification tasks. However, work in this direction is rather limited. We therefore study a novel problem: labeled data generation with inexact supervision. We propose a novel generative framework named ADDES, which synthesizes high-quality labeled data for target classification tasks by learning from data with inexact supervision and from the relations between inexact supervision and target classes. Experimental results on image and text datasets demonstrate the effectiveness of ADDES in generating realistic labeled data from inexact supervision to facilitate the target classification task.