Automatically generating a description of an image in a natural language such as English is a very challenging task. It requires expertise in both image processing and natural language processing. This paper discusses the different models available for the image captioning task. We also discuss how advances in object recognition and machine translation have greatly improved the performance of image captioning models in recent years. In addition, we discuss how such a model can be implemented. Finally, we evaluate the performance of the model using standard evaluation metrics.