Access transparency means that local and remote resources are accessed using identical operations. With transparency, unmodified single-machine applications could run over disaggregated compute, storage, and memory resources. Hiding the complexity of distributed systems in this way would bring substantial benefits, such as scaling out local-parallel scientific applications over flexible disaggregated resources. This paper presents a performance evaluation that assesses the feasibility of access transparency over state-of-the-art Cloud disaggregated resources for Python multiprocessing applications. We have interfaced the multiprocessing module with an implementation that transparently runs processes on serverless functions and uses an in-memory data store for shared state. To evaluate transparency, we run four unmodified applications in the Cloud: Uber Research's Evolution Strategies, Baselines-AI's Proximal Policy Optimization, Pandaral.lel's dataframe, and ScikitLearn's Hyperparameter tuning. We compare the execution time and scalability of each application running over disaggregated resources using our library against the same application running with the single-machine Python libraries in a large VM. Despite the significant overheads of remote communication, we achieve comparable results, and we observe that the applications continue to scale beyond the VM's limited resources, leading to better speedup and parallelism without changing the underlying code or application architecture.
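As a rough illustration of what access transparency means for a Python multiprocessing application, the sketch below writes the application logic only against the `multiprocessing` interface and then swaps the module that backs it. It is not the library evaluated in this paper: the standard library's thread-backed `multiprocessing.dummy` is used here purely as a stand-in for an alternative backend with the same API, and the names `square` and `run_app` are hypothetical.

```python
import multiprocessing
import multiprocessing.dummy  # stdlib module exposing the same Pool API, backed by threads


def square(x):
    """Toy workload standing in for one unit of application work."""
    return x * x


def run_app(mp):
    """Application code that only relies on the multiprocessing interface.

    `mp` is any module that provides a compatible `Pool`; the function body
    stays the same regardless of where the workers actually run.
    """
    with mp.Pool(4) as pool:
        return pool.map(square, range(8))


if __name__ == "__main__":
    # Same application code, two different backends behind the same interface.
    local_result = run_app(multiprocessing)          # local worker processes
    swapped_result = run_app(multiprocessing.dummy)  # stand-in alternative backend
    print(local_result == swapped_result)
```

In this sketch the application never learns which backend executed its tasks, which is the property the evaluation relies on when it runs unmodified applications; a backend based on serverless functions and a remote in-memory store would additionally pay the remote communication overheads discussed above.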