Most of the internet today is composed of digital media, including videos and images. With pixels becoming the currency in which most transactions on the internet happen, it is increasingly important to have a way of browsing this ocean of information with relative ease. YouTube has 400 hours of video uploaded every minute, and many millions of images are browsed on Instagram, Facebook, and similar platforms. Inspired by recent advances in deep learning and its success on problems such as image captioning, machine translation, word2vec, and skip-thoughts, we present DeepSeek, a natural language processing based deep learning model that allows users to enter a description of the kind of images they want to search for; in response, the system retrieves all images that semantically and contextually relate to the query. Two approaches are described in the following sections.