An Exploration of Approaches to Integrating Neural Reranking Models in Multi-Stage Ranking Architectures

We explore different approaches to integrating a simple convolutional neural network (CNN) with the Lucene search engine in a multi-stage ranking architecture. Our models are trained using the PyTorch deep learning toolkit, which is implemented in C/C++ with a Python frontend. One obvious integration strategy is to expose the neural network directly as a service. For this, we use Apache Thrift, a software framework for building scalable cross-language services. In exploring alternative architectures, we observe that once trained, the feedforward evaluation of neural networks is quite straightforward. Therefore, we can extract the parameters of a trained CNN from PyTorch and import the model into Java, taking advantage of the Java Deeplearning4J library for feedforward evaluation. This has the advantage that the entire end-to-end system can be implemented in Java. As a third approach, we can extract the neural network from PyTorch and "compile" it into a C++ program that exposes a Thrift service. We evaluate these alternatives in terms of performance (latency and throughput) as well as ease of integration. Experiments show that feedforward evaluation of the convolutional neural network is significantly slower in Java, while the performance of the compiled C++ network does not consistently beat the PyTorch implementation.
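The observation that feedforward evaluation is straightforward once the parameters are extracted can be illustrated with a minimal sketch in plain Python. It assumes the trained weights have already been exported from PyTorch as nested lists. The architecture (one convolutional filter bank per input, tanh, max pooling, a single linear output) and every name here (`conv1d_max_pool`, `score`, `rerank`, the keys of `params`) are hypothetical and chosen for illustration; they are not the model or code evaluated in the paper.

```python
import math


def conv1d_max_pool(embeddings, filters, biases):
    """Slide each filter over the token embeddings, apply tanh, max-pool over positions.

    embeddings: list of token vectors (each a list of floats)
    filters:    list of filters, each a list of `width` weight vectors
    Assumes len(embeddings) >= filter width (padding is omitted in this sketch).
    """
    pooled = []
    for filt, bias in zip(filters, biases):
        width = len(filt)
        best = -math.inf
        for start in range(len(embeddings) - width + 1):
            total = bias
            for offset in range(width):
                for w, x in zip(filt[offset], embeddings[start + offset]):
                    total += w * x
            best = max(best, math.tanh(total))
        pooled.append(best)
    return pooled


def score(query_emb, doc_emb, params):
    """Feedforward pass: convolve query and document, concatenate, linear layer, sigmoid."""
    features = (
        conv1d_max_pool(query_emb, params["q_filters"], params["q_biases"])
        + conv1d_max_pool(doc_emb, params["d_filters"], params["d_biases"])
    )
    logit = params["out_bias"] + sum(
        w * f for w, f in zip(params["out_weights"], features)
    )
    return 1.0 / (1.0 + math.exp(-logit))


def rerank(query_emb, candidates, params):
    """Second ranking stage: reorder first-stage candidates by neural score.

    candidates: list of (doc_id, doc_emb) pairs, e.g. the top hits from a search engine.
    Returns (doc_id, score) pairs sorted by descending score.
    """
    scored = [(doc_id, score(query_emb, emb, params)) for doc_id, emb in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because the forward pass is only multiply-adds and elementwise functions, the same loop structure could in principle be ported to Java or C++, which is the property the alternative integration strategies rely on.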
