Full metadata
Title
GPU-enabled Functional-as-a-Service
Description
Function-as-a-Service (FaaS) is emerging as an important cloud computing service model because it can improve scalability and usability for a wide range of applications, especially machine learning (ML) inference tasks that require scalable computation resources and complicated configurations. Many applications, including ML inference, rely on graphics processing units (GPUs) to achieve high performance; however, existing FaaS solutions lack GPU support. The event-triggered, short-lived nature of functions poses new challenges to enabling GPUs on FaaS, which must account for the overhead of transferring data (e.g., ML model parameters and inputs/outputs) between GPU and host memory. This thesis presents a new GPU-enabled FaaS solution that lets functions efficiently use GPUs to accelerate computations such as model inference. First, the work extends existing open-source FaaS frameworks such as OpenFaaS to support the scheduling and execution of functions across GPUs in a FaaS cluster. Second, it caches ML models in GPU memory to improve the performance of model inference functions and manages GPU memory globally to improve cache utilization. Third, it co-designs GPU function scheduling and cache management to optimize the performance of ML inference functions. Specifically, the thesis proposes locality-aware scheduling, which maximizes the utilization of both GPU memory for cache hits and GPU cores for parallel processing. A thorough evaluation based on real-world traces and ML models shows that the proposed GPU-enabled FaaS works well for ML inference tasks and that the proposed locality-aware scheduler achieves a 34x speedup over the default, load-balancing-only scheduler.
Date Created
2022
Contributors
- Hong, Sungho (Author)
- Zhao, Ming (Thesis advisor)
- Cao, Zhichao (Committee member)
- Sarwat, Mohamed (Committee member)
- Arizona State University (Publisher)
Extent
60 pages
Language
eng
Copyright Statement
In Copyright
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.2.N.171964
Level of coding
minimal
Note
Partial requirement for: M.S., Arizona State University, 2022
Field of study: Computer Science
System Created
- 2022-12-20 06:19:18
System Modified
- 2022-12-20 06:19:18