Gemini: Enabling multi-tenant GPU sharing based on kernel burst estimation
#GPU_sharing #GPU_time_sharing #API_remoting #kernel_burst
Meta Info
Published in IEEE Transactions on Cloud Computing (TCC), 2021.
Authors: Hung-Hsin Chen, En-Te Lin, Yu-Min Chou, Jerry Chou (National Tsing Hua University).
Code: https://github.com/NTHU-LSALAB/Gemini
Understanding the paper
This paper proposes Gemini, a user-space runtime scheduling framework that enables fine-grained GPU allocation control with support for multi-tenancy and elastic allocation.
It introduces the kernel burst: a group of consecutive kernels launched together without being interrupted by synchronous events.
Typical GPU programming model
copy data to GPU device memory
launch CUDA kernels without data dependency
wait for kernels to complete
copy results back to CPU host memory
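To make the kernel burst definition concrete, here is a minimal sketch that splits a recorded sequence of GPU API events into bursts. The event names and the `group_kernel_bursts` helper are hypothetical and are not taken from the Gemini code base. The sketch also assumes that every non-kernel event (memory copy, device synchronization) is synchronous and therefore ends the current burst.

```python
from typing import List, Tuple

# (kind, name); kind is "kernel" for a launch, "sync" for any synchronous event.
# Assumption: memory copies and explicit synchronization both count as "sync".
Event = Tuple[str, str]


def group_kernel_bursts(trace: List[Event]) -> List[List[str]]:
    """Return maximal runs of consecutive kernel launches in the trace."""
    bursts: List[List[str]] = []
    current: List[str] = []
    for kind, name in trace:
        if kind == "kernel":
            current.append(name)      # the burst keeps growing
        elif current:
            bursts.append(current)    # a synchronous event closes the burst
            current = []
    if current:                       # trace ended in the middle of a burst
        bursts.append(current)
    return bursts


# A trace following the four-step model, repeated twice.
trace: List[Event] = [
    ("sync", "memcpy_host_to_device"),
    ("kernel", "k1"),
    ("kernel", "k2"),
    ("sync", "device_synchronize"),
    ("sync", "memcpy_device_to_host"),
    ("kernel", "k3"),
    ("sync", "device_synchronize"),
]
bursts = group_kernel_bursts(trace)
print(bursts)
```

In this trace, k1 and k2 form one burst because nothing synchronous separates them, and k3 forms a second burst.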
It proposes a low-overhead event-driven monitor and a dynamic time-sharing scheduler.
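The interplay between a monitor and a time-sharing scheduler can be illustrated with a toy simulation. This is only a sketch under stated assumptions and is not Gemini's actual algorithm. The `Client` fields, the "lowest usage relative to share goes next" rule, and the moving-average burst estimate are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Client:
    name: str
    share: float                # requested fraction of GPU time (hypothetical field)
    used_ms: float = 0.0        # GPU time consumed so far
    est_burst_ms: float = 1.0   # running estimate of this client's burst length


def pick_next(clients: List[Client]) -> Client:
    """Choose the client that is furthest below its requested share."""
    return min(clients, key=lambda c: c.used_ms / c.share)


def on_burst_finished(client: Client, measured_ms: float, alpha: float = 0.5) -> None:
    """Monitor callback: record usage and refresh the burst estimate.

    The exponential moving average is an assumption of this sketch.
    """
    client.used_ms += measured_ms
    client.est_burst_ms = alpha * measured_ms + (1 - alpha) * client.est_burst_ms


a = Client("A", share=0.75)
b = Client("B", share=0.25)
clients = [a, b]

# Simulate eight scheduling rounds in which every burst takes 10 ms.
for _ in range(8):
    chosen = pick_next(clients)
    on_burst_finished(chosen, measured_ms=10.0)

print(a.used_ms, b.used_ms)
```

With shares of 0.75 and 0.25, the simulated GPU time converges to a 3:1 split. A real scheduler would also need to handle clients that arrive, leave, or sit idle.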