Function-as-a-Service (FaaS) is emerging as an important cloud computing service model because it can improve the scalability and usability of a wide range of applications, especially machine learning (ML) inference tasks, which require scalable resources and complex software configurations. These inference tasks rely heavily on GPUs to achieve high performance; however, existing FaaS solutions currently lack support for GPUs. The event-triggered and short-lived nature of functions poses new challenges to enabling GPUs on FaaS, which must account for the overhead of transferring data (e.g., ML model parameters and inputs/outputs) between GPU and host memory. This paper proposes a novel GPU-enabled FaaS solution that allows ML inference functions to use GPUs efficiently to accelerate their computations. First, it extends existing FaaS frameworks such as OpenFaaS to support the scheduling and execution of functions across the GPUs in a FaaS cluster. Second, it caches ML models in GPU memory to improve the performance of model inference functions, and it manages GPU memory globally to improve cache utilization. Third, it co-designs GPU function scheduling and cache management to optimize the performance of ML inference functions. Specifically, the paper proposes locality-aware scheduling, which maximizes the utilization of both GPU memory (for cache hits) and GPU cores (for parallel processing). A thorough evaluation based on real-world traces and ML models shows that the proposed GPU-enabled FaaS works well for ML inference tasks, and that the proposed locality-aware scheduler achieves a speedup of 48x over the default, load-balancing-only schedulers.
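To make the idea of locality-aware scheduling concrete, the following is a minimal Python sketch, not the paper's actual algorithm. It assumes each GPU keeps a least-recently-used (LRU) cache of models in its memory, and that the scheduler prefers a GPU that already holds the requested model unless that GPU's queue is too long, in which case it falls back to plain load balancing. All names here (`GPU`, `schedule`, `max_queue`, the model names and sizes) are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class GPU:
    """Hypothetical view of one GPU: memory capacity, an LRU model cache, queued work."""
    name: str
    capacity_mb: int
    cache: OrderedDict = field(default_factory=OrderedDict)  # model name -> size in MB
    queued: int = 0  # number of requests currently assigned to this GPU

    def used_mb(self) -> int:
        return sum(self.cache.values())

    def load(self, model: str, size_mb: int) -> bool:
        """Make the model resident on this GPU; return True on a cache hit."""
        if size_mb > self.capacity_mb:
            raise ValueError(f"{model} does not fit on {self.name}")
        if model in self.cache:
            self.cache.move_to_end(model)  # mark as most recently used
            return True
        # Cache miss: evict least-recently-used models until the new one fits.
        while self.used_mb() + size_mb > self.capacity_mb:
            self.cache.popitem(last=False)
        self.cache[model] = size_mb  # stands in for the host-to-GPU transfer
        return False


def schedule(gpus, model, size_mb, max_queue=2):
    """Pick a GPU for one inference request.

    Assumed policy: prefer GPUs that already cache the model (locality),
    as long as their queue is shorter than max_queue; otherwise fall back
    to the least-loaded GPU overall (load balancing).
    """
    holders = [g for g in gpus if model in g.cache and g.queued < max_queue]
    pool = holders or gpus
    target = min(pool, key=lambda g: g.queued)
    hit = target.load(model, size_mb)
    target.queued += 1
    return target.name, hit
```

With two empty GPUs and three back-to-back requests for the same model, this sketch places the first two requests on the same GPU (a miss followed by a hit) and spills the third to the other GPU once the first reaches `max_queue`. A scheduler that only balances load would send the second request to the idle GPU and pay the model transfer cost again.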