In classification tasks, accuracy degrades when the data are gathered in different domains. To address this problem, we investigate several adversarial models for domain adaptation (DA) and their effect on the acoustic scene classification task. The studied models include several types of generative adversarial networks (GANs) with different loss functions, and the so-called cycle GAN, which consists of two interconnected GAN models. The experiments are performed on the DCASE20 challenge task 1A dataset, in which we can leverage paired examples of data recorded using different devices, i.e., the source and target domain recordings. The results indicate that the best-performing domain adaptation is obtained with the cycle GAN, which achieves as much as a 66% relative improvement in accuracy on the target domain device with only a 6% relative decrease in accuracy on the source domain. In addition, by utilizing the paired data examples, we improve the overall accuracy over the model trained on a larger unpaired dataset while decreasing the computational cost of model training.
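The figures above are relative changes, not absolute percentage points. The sketch below shows how such a relative change can be computed. The function name `relative_change` and all accuracy values are hypothetical, chosen only so that the arithmetic reproduces a 66% relative gain and a 6% relative drop; they are not results from the paper.

```python
def relative_change(baseline: float, adapted: float) -> float:
    """Return the change from baseline to adapted as a fraction of baseline."""
    if baseline == 0:
        raise ValueError("baseline accuracy must be non-zero")
    return (adapted - baseline) / baseline


# Hypothetical accuracies, for illustration only (not taken from the paper).
target_before, target_after = 0.40, 0.664
source_before, source_after = 0.70, 0.658

target_gain = relative_change(target_before, target_after)
source_drop = relative_change(source_before, source_after)

print(f"target: {target_gain:+.0%} relative")
print(f"source: {source_drop:+.0%} relative")
```

With these assumed numbers, a 66% relative gain on the target device corresponds to 26.4 absolute percentage points, which is why the baseline accuracy is needed to interpret a relative figure.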

### Related Content

Weakly-supervised learning has become a popular technology in recent years. In this paper, we propose a novel medical image classification algorithm, called Weakly-Supervised Generative Adversarial Networks (WSGAN), which uses only a small number of real images without labels to generate fake images or mask images that enlarge the training set. First, we combine WSGAN with MixMatch to generate pseudo labels for the fake images and unlabeled images used in classification. Second, contrastive learning and a self-attention mechanism are introduced to enhance the classification accuracy. Third, the problem of mode collapse is addressed by a cycle-consistency loss. Finally, we design global and local classifiers that complement each other with the key information needed for classification. The experimental results on four medical image datasets show that WSGAN can obtain relatively high learning performance using only a few labeled and unlabeled samples. For example, the classification accuracy of WSGAN is 11% higher than that of the second-ranked MixMatch with 100 labeled images and 1000 unlabeled images on the OCT dataset. In addition, we conduct ablation experiments to verify the effectiveness of our algorithm.
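The pseudo-labeling step attributed to MixMatch above can be sketched as follows. This is a minimal illustration of MixMatch-style label guessing (averaging predictions over augmentations, then temperature sharpening), not WSGAN's implementation. The function name `guess_label`, the temperature value, and the example probabilities are assumptions made for illustration.

```python
from typing import List


def guess_label(predictions: List[List[float]], temperature: float = 0.5) -> List[float]:
    """Average class probabilities over augmentations, then sharpen them."""
    num_classes = len(predictions[0])
    # Mean prediction across all augmented views of the same image.
    mean = [sum(p[c] for p in predictions) / len(predictions) for c in range(num_classes)]
    # Sharpening: raise to the power 1/T and renormalize (T < 1 lowers entropy).
    powered = [m ** (1.0 / temperature) for m in mean]
    total = sum(powered)
    return [v / total for v in powered]


# Hypothetical softmax outputs for two augmentations of one unlabeled image.
preds = [[0.7, 0.3], [0.5, 0.5]]
pseudo_label = guess_label(preds)
print(pseudo_label)
```

A temperature below 1 pushes the averaged distribution toward its most likely class, so the pseudo label is more confident than the raw average.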

While existing work in robust deep learning has focused on small pixel-level $\ell_p$ norm-based perturbations, this may not account for perturbations encountered in several real-world settings. In many such cases, although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. but deviates from the training domain. While this deviation may not be exactly known, its broad characterization is specified a priori in terms of attributes. We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attribute space, without having access to data from the test domain. Our adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial perturbations and the outer minimization finding model parameters by optimizing the loss on those perturbations. We demonstrate the applicability of our approach on three types of naturally occurring perturbations: object-related shifts, geometric transformations, and common image corruptions. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations. We demonstrate its usefulness by showing the robustness gains of deep neural networks trained with our adversarial training on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.
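The min-max structure described above can be written compactly. The following is a generic sketch rather than the paper's exact formulation: the symbols $\theta$ (model parameters), $f_\theta$ (classifier), $\ell$ (classification loss), $\mathcal{D}$ (training distribution), $\mathcal{A}$ (set of admissible attribute perturbations), and $T(x, a)$ (a transformation applying perturbation $a$ to input $x$) are notation assumed here for illustration.

$$
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}} \left[ \max_{a \in \mathcal{A}} \; \ell\big(f_\theta(T(x, a)),\, y\big) \right]
$$

The familiar $\ell_p$ setting is the special case $T(x, a) = x + a$ with $\mathcal{A} = \{a : \|a\|_p \le \epsilon\}$. An attribute such as rotation would instead take $a$ to be an angle and $T$ the corresponding image rotation.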

Convolutional networks (ConvNets) have achieved great success in various challenging vision tasks. However, the performance of ConvNets degrades under domain shift. Domain adaptation is both more important and more challenging in biomedical image analysis, where cross-modality data have largely different distributions. Given that annotating medical data is especially expensive, supervised transfer learning approaches are not optimal. In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentation. Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction. Moreover, we build a plug-and-play domain adaptation module (DAM) to map the target input to features that are aligned with the source domain feature space. A domain critic module (DCM) is set up to discriminate between the feature spaces of the two domains. We optimize the DAM and DCM via an adversarial loss without using any target domain labels. Our proposed method is validated by adapting a ConvNet trained on MRI images to unpaired CT data for cardiac structure segmentation, and it achieves very promising results.

In this paper, we propose the Cross-Domain Adversarial Auto-Encoder (CDAAE) to address the problem of cross-domain image inference, generation and transformation. We assume that images from different domains share the same latent code space for content while having separate latent code spaces for style. The proposed framework can map cross-domain data to a latent code vector consisting of a content part and a style part. The latent code vector is matched with a prior distribution so that we can generate meaningful samples from any part of the prior space. Consequently, given a sample from one domain, our framework can generate various samples of the other domain with the same content as the input. This distinguishes the proposed framework from existing work on cross-domain transformation. In addition, the proposed framework can be trained with both labeled and unlabeled data, which also makes it suitable for domain adaptation. Experimental results on the SVHN, MNIST and CASIA datasets show that the proposed framework achieves visually appealing performance on the image generation task. We also demonstrate that the proposed method achieves superior results for domain adaptation. Code for our experiments is available at https://github.com/luckycallor/CDAAE.

Domain adaptation is an actively researched problem in computer vision. In this work, we propose an approach that leverages unsupervised data to bring the source and target distributions closer in a learned joint feature space. We accomplish this by inducing a symbiotic relationship between the learned embedding and a generative adversarial network. This is in contrast to methods which use the adversarial framework for realistic data generation and retrain deep models with such data. We demonstrate the strength and generality of our approach by performing experiments on three different tasks with varying levels of difficulty: (1) digit classification (MNIST, SVHN and USPS datasets), (2) object recognition using the OFFICE dataset, and (3) domain adaptation from synthetic to real data. Our method achieves state-of-the-art performance in most experimental settings and is so far the only GAN-based method that has been shown to work well across different datasets such as OFFICE and DIGITS.

In this paper, we propose an improved quantitative evaluation framework for Generative Adversarial Networks (GANs) on generating domain-specific images, where we improve conventional evaluation methods on two levels: the feature representation and the evaluation metric. Unlike most existing evaluation frameworks, which transfer the representation of the ImageNet Inception model to map images onto the feature space, our framework uses a specialized encoder to acquire a fine-grained domain-specific representation. Moreover, for datasets with multiple classes, we propose the Class-Aware Fréchet Distance (CAFD), which employs a Gaussian mixture model on the feature space to better fit the multi-manifold feature distribution. Experiments and analysis at both the feature level and the image level were conducted to demonstrate the improvements of our proposed framework over the recently proposed state-of-the-art FID method. To the best of our knowledge, we are the first to provide counterexamples where FID gives results inconsistent with human judgments. The experiments show that our framework overcomes the shortcomings of FID and improves robustness. Code will be made available.
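For reference, FID fits a single Gaussian to the features of real images and another to the features of generated images, then computes the Fréchet distance between the two. The symbols below are notation assumed here: $\mu_r, \Sigma_r$ are the mean and covariance of the real features and $\mu_g, \Sigma_g$ those of the generated features.

$$
d^2\big((\mu_r, \Sigma_r), (\mu_g, \Sigma_g)\big) = \|\mu_r - \mu_g\|_2^2 + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
$$

The single-Gaussian assumption is what the abstract's Gaussian mixture model is meant to relax. The exact form of CAFD is not given in this abstract, so no formula for it is attempted here.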

Jiawei Mao, Xuesong Yin, Yuanqi Chang, Qi Huang, Daoqiang Zhang, Jieyue Yu, Yigang Wang
0+ reads · November 29
Kazuma Fujii, Hiroshi Kera, Kazuhiko Kawamoto
0+ reads · November 25
Bowen Cai, Huan Fu, Rongfei Jia, Binqiang Zhao, Hua Li, Yinghui Xu
9+ reads · December 10, 2020
Tejas Gokhale, Rushil Anirudh, Bhavya Kailkhura, Jayaraman J. Thiagarajan, Chitta Baral, Yezhou Yang
15+ reads · December 3, 2020
Yawei Luo, Ping Liu, Tao Guan, Junqing Yu, Yi Yang
3+ reads · April 13, 2020
Chengxiang Yin, Jian Tang, Zhiyuan Xu, Yanzhi Wang
6+ reads · June 8, 2018
Qi Dou, Cheng Ouyang, Cheng Chen, Hao Chen, Pheng-Ann Heng
10+ reads · April 29, 2018
Haodi Hou, Jing Huo, Yang Gao
4+ reads · April 17, 2018
Swami Sankaranarayanan, Yogesh Balaji, Carlos D. Castillo, Rama Chellappa
3+ reads · April 1, 2018
Shaohui Liu, Yi Wei, Jiwen Lu, Jie Zhou
3+ reads · March 27, 2018
