Generating fine-grained, realistic images from text has many applications in the visual and semantic domains. With this in mind, we propose a Bangla Attentional Generative Adversarial Network (AttnGAN) that allows intensified, multi-stage processing for high-resolution Bangla text-to-image generation. The model synthesizes the most specific details in different sub-regions of the image by concentrating on the relevant words in the natural language description. The framework achieves a better inception score on the CUB dataset. For the first time, fine-grained images are generated from Bangla text using an attentional GAN. Bangla ranks seventh among the 100 most spoken languages, which motivates us to focus explicitly on this language and to address the needs of its many speakers. Moreover, Bangla has a more complex syntactic structure and fewer natural language processing resources, which further justifies this work.
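The idea of concentrating on relevant words for each image sub-region can be illustrated with a minimal sketch. It assumes the dot-product-plus-softmax form of attention commonly used in attentional GANs; the abstract does not specify the exact formulation, so the function name `word_attention`, the feature sizes, and the random inputs are all hypothetical, and word features are assumed to be already projected into the same space as the region features.

```python
import numpy as np


def word_attention(regions, words):
    """Illustrative word-level attention (an assumed form, not the paper's code).

    regions: (N, D) array, one feature vector per image sub-region.
    words:   (T, D) array, one feature vector per word of the description.
    Returns (context, weights) with shapes (N, D) and (N, T).
    """
    # Similarity between every sub-region and every word.
    scores = regions @ words.T                            # (N, T)
    # Numerically stable softmax over the words, done per sub-region.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Each sub-region receives a weighted mix of the word features.
    context = weights @ words                             # (N, D)
    return context, weights


# Hypothetical sizes: 4 sub-regions, 5 words, 8-dimensional features.
rng = np.random.default_rng(0)
regions = rng.normal(size=(4, 8))
words = rng.normal(size=(5, 8))
context, weights = word_attention(regions, words)
```

Each row of `weights` sums to 1 and indicates how strongly a given sub-region attends to each word; in a multi-stage generator, `context` would typically be combined with the region features before the next refinement stage.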