While Deep Learning (DL) technologies are a promising tool to solve networking problems that map to classification tasks, their computational complexity is still too high with respect to real-time traffic measurement requirements. To reduce the DL inference cost, we propose a novel caching paradigm, which we name approximate-key caching, that returns approximate results for lookups of selected inputs based on cached DL inference results. While approximate cache hits alleviate the DL inference workload and increase the system throughput, they also introduce an approximation error. As such, we couple approximate-key caching with a principled error-correction algorithm, which we name auto-refresh. We analytically model our caching system performance for classic LRU and ideal caches, we perform a trace-driven evaluation of the expected performance, and we compare the benefits of our proposed approach with the state-of-the-art similarity caching -- testifying to the practical interest of our proposal.
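To make the approximate-key caching idea concrete, the following is a minimal sketch in Python, assuming a hypothetical quantization function that maps an exact input to a coarser approximate key and an LRU store of DL inference results; the names and the quantization step are illustrative assumptions, not the paper's exact design.

```python
from collections import OrderedDict

# Hypothetical sketch of approximate-key caching in front of a DL classifier.
class ApproxKeyCache:
    def __init__(self, capacity, dl_model, quantize):
        self.capacity = capacity      # number of cached inference results
        self.dl_model = dl_model      # expensive DL classifier: input -> label
        self.quantize = quantize      # assumed mapping: exact input -> approximate key
        self.cache = OrderedDict()    # LRU store: approximate key -> cached label

    def classify(self, x):
        k = self.quantize(x)          # approximate-key lookup instead of exact match
        if k in self.cache:           # approximate hit: skip DL inference,
            self.cache.move_to_end(k) # possibly returning an approximate label
            return self.cache[k]
        label = self.dl_model(x)      # miss: pay the full DL inference cost
        self.cache[k] = label         # store the result under the approximate key
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return label
```

Approximate hits on the quantized key are what trade accuracy for throughput; the auto-refresh mechanism described in the paper would additionally re-validate cached entries to bound the resulting approximation error.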