Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances

Generative adversarial nets (GANs) have been widely studied during the recent development of deep learning and unsupervised learning. Through an adversarial training mechanism, a GAN trains a generative model to fit the underlying, unknown real data distribution under the guidance of a discriminative model that estimates whether a data instance is real or generated. The framework was originally proposed for fitting continuous data distributions such as images, so applying it directly to information retrieval scenarios, where the data are mostly discrete (IDs, text and graphs), is not straightforward. In this tutorial, we focus on GAN techniques and their variants for fitting discrete data in various information retrieval scenarios. (i) We introduce the fundamentals of the GAN framework and its theoretical properties; (ii) we carefully study promising solutions for extending GANs to discrete data generation; (iii) we introduce IRGAN, the fundamental GAN framework for fitting single-ID data distributions, and its direct application to information retrieval; (iv) we further discuss sequential discrete data generation tasks, e.g., text generation, and the corresponding GAN solutions; (v) we present the most recent work on fitting graph/network data with GAN-based node embedding techniques. We also introduce relevant open-source platforms, such as IRGAN and Texygen, to help the audience conduct research experiments on GANs in information retrieval. Finally, we conclude the tutorial with a comprehensive summary and an outlook on further research directions for GANs in information retrieval.
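
To make the difficulty with discrete data concrete: sampling an ID from the generator is a discrete choice, so the discriminator's signal cannot be backpropagated through the sample as it can for images. The abstract does not name a specific remedy; the sketch below assumes a policy-gradient (REINFORCE-style) update, which is one common option. The function names, the fixed score list standing in for a trained discriminator, and the hyperparameters are all hypothetical, and the code is a toy illustration rather than the tutorial's or IRGAN's implementation.

```python
import math
import random


def softmax(logits):
    """Turn unnormalized scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_generator(disc_scores, steps=2000, lr=0.1, seed=0):
    """Fit a softmax generator over item IDs with a REINFORCE-style update.

    disc_scores[i] stands in for a discriminator output D(i) in (0, 1).
    Assumption: it is a fixed, hypothetical list, not a trained model.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(disc_scores)
    for _ in range(steps):
        probs = softmax(logits)
        # Sampling an ID is a discrete choice, so no gradient flows through it.
        item = rng.choices(range(len(probs)), weights=probs)[0]
        # Reward: the discriminator score of the sample, centered on the
        # expected score under the current generator (a variance-reducing baseline).
        baseline = sum(p * d for p, d in zip(probs, disc_scores))
        reward = disc_scores[item] - baseline
        # The gradient of log p(item) w.r.t. the logits is one_hot(item) - probs.
        for j in range(len(logits)):
            indicator = 1.0 if j == item else 0.0
            logits[j] += lr * reward * (indicator - probs[j])
    return softmax(logits)


if __name__ == "__main__":
    # Hypothetical discriminator scores for four candidate document IDs.
    scores = [0.1, 0.2, 0.9, 0.3]
    print(train_generator(scores))
```

With these toy scores the generator shifts most of its probability mass toward the ID the discriminator rates highest. In a full adversarial setup the discriminator would be retrained in alternation with the generator instead of staying fixed.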
