FPGAs are quickly becoming available in the cloud as one more heterogeneous processing element complementing CPUs and GPUs. Many reports in the literature show the potential of FPGAs to accelerate a wide variety of algorithms, which, combined with their growing availability, would seem to indicate widespread use in many applications. Unfortunately, there is not much published research exploring what it takes to integrate an FPGA into an existing application in a cost-effective way while keeping the algorithmic performance advantages. Building on recent results exploring how to employ FPGAs to improve the search engines used in the travel industry, this paper analyses the end-to-end performance of the search engine when using FPGAs, as well as the necessary changes to the software and the cost of such deployments. The results provide important insights into current FPGA deployments and what needs to be done to make FPGAs more widely used. For instance, the large potential performance gains provided by an FPGA are greatly diminished in practice if the application cannot submit requests in an optimal way, something that is not always possible and might require significant changes to the application. Similarly, some existing cloud deployments turn out to use a very imbalanced architecture: a powerful FPGA connected to a much less powerful CPU. As a result, the CPU cannot generate enough load for the FPGA, which potentially eliminates all performance gains and might even result in a more expensive system. In this paper, we report on an extensive study and development effort to incorporate FPGAs into a search engine, and we analyse the issues encountered and their practical impact. We expect these results to inform the future development and deployment of FPGAs by providing important insights into their end-to-end integration within existing systems.