Could anyone recommend recent deep learning papers on few-shot and zero-shot learning?


Zero-shot Learning

1. 【Zero-shot learning】Let's Transfer Transformations of Shared Semantic Representations


Authors: Nam Vo, Lu Jiang, James Hays

Link: arxiv.org/abs/1903.0079

Code: github.com/lugiavn/tran, github.com/gchb2012/VQA

Abstract:

With a good image understanding capability, can we manipulate an image's high-level semantic representation? Such a transformation operation can be used to generate or retrieve similar images but with a desired modification (for example, changing a beach background to a street background); similar abilities have been demonstrated in zero-shot learning, attribute composition, and attribute-manipulation image search. In this work we show how one can learn transformations with no training examples by learning them in another domain and then transferring them to the target domain. This is feasible if, first, transformation training data is more accessible in the other domain and, second, both domains share similar semantics, so that one can learn transformations in a shared embedding space. We demonstrate this on an image retrieval task where the search query is an image plus an additional transformation specification (for example: search for images similar to this one, but with a street background instead of a beach). In one experiment we transfer transformations from synthesized 2D blob images to 3D rendered images, and in the other we transfer from the text domain to the natural image domain.
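The recipe can be reduced to a short PyTorch sketch (a minimal sketch under my own assumptions; the module names, sizes, and the triplet formulation below are illustrative, not the authors' released code at the links above): encode both domains into one shared space, train the transformation module only where supervision is cheap, then reuse that same module on the other domain's embeddings at retrieval time.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedSpaceTransfer(nn.Module):
    """Two domain encoders map into one shared embedding space; a single
    transformation module is trained in the source domain and reused on
    the target domain. All names and sizes here are illustrative."""
    def __init__(self, src_dim, tgt_dim, dim=256):
        super().__init__()
        self.encode_source = nn.Linear(src_dim, dim)  # e.g. text / 2D-blob features
        self.encode_target = nn.Linear(tgt_dim, dim)  # e.g. natural-image features
        self.transform = nn.Sequential(               # shared transformation module
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def apply(self, emb, spec_emb):
        # The transformation spec (e.g. "beach -> street") is assumed to be
        # embedded into the same dim-sized shared space.
        return self.transform(torch.cat([emb, spec_emb], dim=-1))

def source_domain_step(model, query, spec_emb, positive):
    """Triplet-style step on the source domain, where (query, spec, target)
    triples are cheap; at test time `apply` runs on target-domain embeddings."""
    q = F.normalize(model.encode_source(query), dim=-1)
    p = F.normalize(model.encode_source(positive), dim=-1)
    t = F.normalize(model.apply(q, spec_emb), dim=-1)
    n = p.roll(1, dims=0)  # simple in-batch negatives
    return F.triplet_margin_loss(t, p, n, margin=0.2)
```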



2. 【Zero-shot learning】Creativity Inspired Zero-Shot Learning


Authors: Mohamed Elhoseiny, Mohamed Elfeki

Link: arxiv.org/abs/1904.0110

Code: github.com/Elhoseiny-Vi, github.com/mhelhoseiny/

Abstract:

Zero-shot learning (ZSL) aims at understanding unseen categories, with no training examples, from class-level descriptions. To improve the discriminative power of zero-shot learning, we model the visual learning process of unseen categories, drawing inspiration from the psychology of human creativity in producing novel art. We relate ZSL to human creativity by observing that zero-shot learning is about recognizing the unseen, while creativity is about creating a likable unseen. We introduce a learning signal inspired by the creativity literature that explores the unseen space with hallucinated class descriptions and encourages their generated visual features to deviate carefully from seen classes, while still allowing knowledge transfer from seen to unseen classes. Empirically, we show consistent improvements of several percentage points over the state of the art on the largest available benchmarks for the challenging task we focus on, generalized ZSL from noisy text, using the CUB and NABirds datasets. We also show the advantage of our approach on attribute-based ZSL on three additional datasets (AwA2, aPY, and SUN). Code is available.
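The key training signal can be sketched in a few lines of PyTorch (a hedged sketch, not the authors' code: `generator` and `seen_classifier` are assumed callables, and interpolating two seen-class descriptions stands in for the paper's hallucination scheme): hallucinate descriptions that belong to no seen class, then push the seen-class classifier toward maximum uncertainty on features generated from them.

```python
import torch
import torch.nn.functional as F

def creativity_loss(generator, seen_classifier, desc_a, desc_b, z_dim=128):
    """Hallucinate class descriptions by interpolating two seen-class
    descriptions, then maximize the entropy of the seen-class classifier
    on features generated from them: careful deviation from seen classes,
    while realism is handled by the usual GAN losses."""
    alpha = torch.rand(desc_a.size(0), 1, device=desc_a.device)
    hallucinated = alpha * desc_a + (1 - alpha) * desc_b
    z = torch.randn(desc_a.size(0), z_dim, device=desc_a.device)
    fake_features = generator(z, hallucinated)
    log_p = F.log_softmax(seen_classifier(fake_features), dim=-1)
    entropy = -(log_p.exp() * log_p).sum(dim=-1)
    return -entropy.mean()  # minimizing this maximizes classifier uncertainty
```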



3. 【Zero-shot learning】Leveraging the Invariant Side of Generative Zero-Shot Learning


Authors: Jingjing Li, Mengmeng Jin, Ke Lu, Zhengming Ding, Lei Zhu, Zi Huang

Link: arxiv.org/abs/1904.0409

Code: github.com/1995subhanka, github.com/lijin118/Lis

Abstract:

Conventional zero-shot learning (ZSL) methods generally learn an embedding, e.g., a visual-semantic mapping, to handle unseen visual samples in an indirect manner. In this paper, we take advantage of generative adversarial networks (GANs) and propose a novel method, named leveraging invariant side GAN (LisGAN), which can directly generate unseen features from random noise conditioned on semantic descriptions. Specifically, we train a conditional Wasserstein GAN in which the generator synthesizes fake unseen features from noise and the discriminator distinguishes fake from real via a minimax game. Considering that one semantic description can correspond to various synthesized visual samples, and that the semantic description is, figuratively, the soul of the generated features, we introduce soul samples as the invariant side of generative zero-shot learning. A soul sample is the meta-representation of one class: it captures the most semantically meaningful aspects of the samples in that category. We add a regularizer requiring that each generated sample (the varying side of generative ZSL) be close to at least one soul sample (the invariant side) with the same class label. At the zero-shot recognition stage, we propose to use two classifiers, deployed in a cascade, to achieve a coarse-to-fine result. Experiments on five popular benchmarks verify that our proposed approach outperforms state-of-the-art methods by significant margins.
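The soul-sample regularizer is the part that is easy to miss, so here is a hedged PyTorch sketch of it (my reading, not the released code: soul samples are taken as per-class meta-representations, e.g. centroids of real features, stored as a (num_classes, k, dim) tensor):

```python
import torch

def soul_regularizer(fake_features, labels, soul_samples):
    """Pull each generated (varying-side) feature toward at least one
    soul sample (invariant side) of its own class: only the distance to
    the nearest soul sample of the correct class is penalized."""
    souls = soul_samples[labels]                             # (B, k, dim)
    dists = torch.cdist(fake_features.unsqueeze(1), souls)   # (B, 1, k)
    return dists.squeeze(1).min(dim=-1).values.mean()
```

This term would be added to the conditional Wasserstein GAN objective, so the generator stays free to vary while never drifting too far from each class's invariant core.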



4. 【Zero-shot learning】Zero-Shot Semantic Segmentation


Authors: Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

Link: arxiv.org/abs/1906.0081

Code: github.com/yutliu/ZJ_Ze, github.com/valeoai/ZS3

Abstract:

Semantic segmentation models are limited in their ability to scale to large numbers of object classes. In this paper, we introduce the new task of zero-shot semantic segmentation: learning pixel-wise classifiers for never-seen object categories with zero training examples. To this end, we present a novel architecture, ZS3Net, combining a deep visual segmentation model with an approach that generates visual representations from semantic word embeddings. In this way, ZS3Net addresses pixel classification tasks where both seen and unseen categories appear at test time (so-called "generalized" zero-shot classification). Performance is further improved by a self-training step that relies on automatic pseudo-labeling of pixels from unseen classes. On two standard segmentation datasets, Pascal-VOC and Pascal-Context, we propose zero-shot benchmarks and set competitive baselines. For complex scenes such as those in the Pascal-Context dataset, we extend our approach with a graph-context encoding to fully leverage the spatial context priors coming from class-wise segmentation maps.
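A hedged PyTorch sketch of the two mechanisms the abstract names (layer sizes, the 0.9 threshold, and the ignore index are my assumptions, not the released ZS3 code): a generator that turns a class word embedding plus noise into a fake pixel-wise feature, and a pseudo-labeling pass for the self-training step.

```python
import torch
import torch.nn as nn

class PixelFeatureGenerator(nn.Module):
    """Word embedding + noise -> fake pixel-wise feature, so the final
    pixel classifier can be trained for classes with zero labeled pixels."""
    def __init__(self, emb_dim=300, z_dim=300, feat_dim=256):
        super().__init__()
        self.z_dim = z_dim
        self.net = nn.Sequential(
            nn.Linear(emb_dim + z_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, feat_dim))

    def forward(self, word_emb):
        z = torch.randn(word_emb.size(0), self.z_dim, device=word_emb.device)
        return self.net(torch.cat([word_emb, z], dim=-1))

def pseudo_label(logits, unseen_ids, threshold=0.9, ignore_index=255):
    """Self-training: keep only confident unseen-class predictions as
    pseudo ground truth; all other pixels are ignored by the loss."""
    conf, pred = logits.softmax(dim=1).max(dim=1)   # (B, H, W) each
    keep = conf.ge(threshold) & torch.isin(pred, unseen_ids)
    return torch.where(keep, pred, torch.full_like(pred, ignore_index))
```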



5. 【Zero-shot learning】Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation


Authors: Ivan Donadello, Luciano Serafini

Link: arxiv.org/abs/1910.0046

Code: github.com/logictensorn, github.com/ivanDonadell

Abstract:

Semantic image interpretation is the task of extracting a structured semantic description from images. This requires the detection of visual relationships: triples (subject, relation, object) describing a semantic relation between a subject and an object. A purely supervised approach to visual relationship detection requires a complete and balanced training set covering all possible combinations of (subject, relation, object). However, such training sets are not available and would require prohibitive human effort. This calls for the ability to predict triples that do not appear in the training set, a problem known as zero-shot learning. State-of-the-art approaches to zero-shot learning exploit similarities among relationships in the training set or external linguistic knowledge. In this paper, we perform zero-shot learning using Logic Tensor Networks, a novel statistical relational learning framework that exploits both similarities with other seen relationships and background knowledge, expressed as logical constraints between subjects, relations, and objects. Experiments on the Visual Relationship Dataset show that the use of logical constraints outperforms current methods, implying that background knowledge can be used to alleviate the incompleteness of training sets.
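The flavor of "background knowledge as logical constraints" is easy to show with a toy fuzzy-logic loss (a hand-rolled sketch, not the Logic Tensor Networks library's API: predicate scores are assumed to be probabilities in [0, 1], and the Reichenbach implication a -> b := 1 - a + a*b encodes the rule):

```python
import torch

def implication_loss(p_relation, p_subject_is_person):
    """Soft constraint for a rule such as  wears(x, y) -> Person(x):
    each grounding is penalized by how far the fuzzy implication falls
    from being fully satisfied (truth value 1)."""
    truth = 1.0 - p_relation + p_relation * p_subject_is_person
    return (1.0 - truth).mean()
```

Adding such terms to the detection loss lets unseen (zero-shot) triples inherit structure from the background ontology instead of relying on training examples alone.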



AI&R is a comprehensive information platform for the artificial intelligence and robotics verticals. Our vision is to be the highway to AGI (artificial general intelligence), connecting people with people, people with information, and information with information, so that AI and robotics have no barrier to entry.

AI and robotics enthusiasts are welcome to follow us; we post in-depth content every day.

Search WeChat for the official account 【AIandR艾尔】 to follow us and get more resources ❤ biubiubiu~