Training robust supervised deep learning models for many geospatial applications of computer vision is difficult due to a dearth of class-balanced and diverse training data. Moreover, obtaining enough training data for many applications is financially prohibitive or may be infeasible, especially when the application involves modeling rare or extreme events. Synthetically generating data (and labels) using a generative model that can sample from a target distribution and exploit the multi-scale nature of images can be an inexpensive solution to the scarcity of labeled data. Towards this goal, we present a deep conditional generative model, called VAE-Info-cGAN, that combines a Variational Autoencoder (VAE) with a conditional Information Maximizing Generative Adversarial Network (InfoGAN) to synthesize semantically rich images simultaneously conditioned on a pixel-level condition (PLC) and a macroscopic feature-level condition (FLC). The PLC is a task-specific input that may differ in shape from the synthesized image only in the channel dimension. The FLC is modeled as an attribute vector in the latent space of the generated image, which controls the contributions of various characteristic attributes germane to the target distribution. Experiments on a GPS trajectories dataset show that the proposed model can accurately generate various forms of spatiotemporal aggregates across different geographic locations while conditioned only on a raster representation of the road network. The primary intended application of the VAE-Info-cGAN is synthetic data (and label) generation for targeted data augmentation in computer vision-based modeling of problems relevant to geospatial analysis and remote sensing.
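
The shape relationship between the two conditions can be made concrete with a minimal sketch. It is not the authors' implementation. The function names `plc_compatible` and `make_latent_code`, the channel-first `(C, H, W)` layout, the noise dimension, and the InfoGAN-style concatenation of noise with the attribute vector are all assumptions introduced for illustration; the abstract does not specify how the conditions are fed to the network.

```python
import numpy as np


def plc_compatible(plc_shape, image_shape):
    """Return True if the PLC and the image differ at most in channel count.

    Both shapes are assumed (hypothetically) to be channel-first: (C, H, W).
    """
    return (
        len(plc_shape) == 3
        and len(image_shape) == 3
        and tuple(plc_shape[1:]) == tuple(image_shape[1:])
    )


def make_latent_code(flc, noise_dim, rng):
    """Build a latent code by concatenating random noise with the FLC.

    flc: 1-D attribute vector (the feature-level condition).
    noise_dim: number of unstructured noise entries (an assumed hyperparameter).
    rng: a numpy.random.Generator, passed in for reproducibility.
    """
    flc = np.asarray(flc, dtype=np.float32)
    if flc.ndim != 1:
        raise ValueError("flc must be a 1-D attribute vector")
    noise = rng.standard_normal(noise_dim).astype(np.float32)
    # Noise first, attributes last: the final len(flc) entries are the FLC.
    return np.concatenate([noise, flc])


# Example: a 1-channel road raster conditioning a 3-channel output image.
rng = np.random.default_rng(0)
latent = make_latent_code([0.2, 0.8, 0.0], noise_dim=16, rng=rng)
print(latent.shape)                                # (19,)
print(plc_compatible((1, 64, 64), (3, 64, 64)))    # True
print(plc_compatible((1, 64, 64), (3, 32, 64)))    # False
```

In this sketch the spatial size of the PLC fixes the spatial size of the output, while the attribute vector occupies a dedicated slice of the latent code so that its entries can be varied independently of the noise.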