Gemini: Enabling multi-tenant GPU sharing based on kernel burst estimation
#GPU_sharing #GPU_time_sharing #API_remoting #kernel_burst
Presented in .
Authors: Hung-Hsin Chen, En-Te Lin, Yu-Min Chou, Jerry Chou (National Tsing Hua University).
Code:
This paper proposes Gemini, a user-space runtime scheduling framework to enable fine-grained GPU allocation control with support.
Introduce the kernel burst: a group of consecutive kernels launched together without being interrupted by synchronous events.
Typical GPU programming model
copy data to GPU device memory
launch CUDA kernels without data dependency
wait for kernels to complete
copy results back to CPU host memory
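The four steps above suggest how a kernel burst can be spotted in an API call trace. The following is a minimal sketch, not Gemini's implementation: the `Event` type, the event kind strings, and `split_into_bursts` are hypothetical names made up for illustration, and the sketch assumes every non-kernel event in the trace (memory copy or explicit synchronization) is synchronous and therefore closes the current burst.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One intercepted GPU API call (hypothetical trace format)."""
    kind: str        # "kernel", "memcpy_h2d", "memcpy_d2h", or "synchronize"
    name: str = ""   # kernel name, only meaningful when kind == "kernel"


def split_into_bursts(trace):
    """Group consecutive kernel launches into bursts.

    Assumption: any non-kernel event is synchronous, so it ends the
    burst that is currently being collected (if there is one).
    """
    bursts, current = [], []
    for ev in trace:
        if ev.kind == "kernel":
            current.append(ev.name)
        elif current:
            bursts.append(current)
            current = []
    if current:  # trace ended while a burst was still open
        bursts.append(current)
    return bursts


# One iteration of the typical model, followed by a second, shorter one.
trace = [
    Event("memcpy_h2d"),
    Event("kernel", "k1"),
    Event("kernel", "k2"),
    Event("synchronize"),
    Event("memcpy_d2h"),
    Event("memcpy_h2d"),
    Event("kernel", "k3"),
    Event("synchronize"),
]
print(split_into_bursts(trace))  # [['k1', 'k2'], ['k3']]
```

Here `k1` and `k2` form one burst because nothing synchronous separates them, while `k3` starts a new burst after the copy-back and the next upload.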
Propose a low-overhead event-driven monitor and a dynamic time-sharing scheduler.
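To make the interplay of the two components concrete, here is a toy sketch under stated assumptions; it is not the paper's algorithm. The monitor is driven only by launch and synchronization events, and the scheduler hands out a time quota sized by the monitor's estimate. The class names `BurstMonitor` and `TokenScheduler`, the exponential moving average, the "least used time first" policy, and the default quota are all illustrative choices, and timestamps are passed in explicitly (in milliseconds) so the example is deterministic.

```python
class BurstMonitor:
    """Event-driven estimate of kernel-burst duration for one client."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha     # smoothing factor (assumed value)
        self.estimate = None   # smoothed burst duration, None until first burst
        self._start = None     # start time of the burst in progress

    def on_kernel_launch(self, now):
        if self._start is None:  # first kernel of a new burst
            self._start = now

    def on_sync(self, now):
        if self._start is None:  # sync with no burst in progress
            return
        duration = now - self._start
        self._start = None
        if self.estimate is None:
            self.estimate = duration
        else:
            self.estimate = (self.alpha * duration
                             + (1 - self.alpha) * self.estimate)


class TokenScheduler:
    """Grants the GPU to one client at a time, with a per-grant time quota."""

    def __init__(self, default_quota=10.0):
        self.default_quota = default_quota  # used before any burst is observed
        self.monitors = {}
        self.used = {}

    def register(self, client):
        self.monitors[client] = BurstMonitor()
        self.used[client] = 0.0
        return self.monitors[client]

    def grant(self):
        # Policy assumed for illustration: least used GPU time goes first,
        # ties broken by client name.
        client = min(self.used, key=lambda c: (self.used[c], c))
        est = self.monitors[client].estimate
        quota = est if est is not None else self.default_quota
        return client, quota

    def report_usage(self, client, elapsed):
        self.used[client] += elapsed


sched = TokenScheduler()
mon_a = sched.register("a")
sched.register("b")

# Client "a" runs two bursts: 4 ms long, then 2 ms long.
mon_a.on_kernel_launch(0.0)
mon_a.on_kernel_launch(1.0)
mon_a.on_sync(4.0)
mon_a.on_kernel_launch(10.0)
mon_a.on_sync(12.0)

client, quota = sched.grant()
print(client, quota)        # a 3.0
sched.report_usage(client, quota)
print(sched.grant())        # ('b', 10.0)
```

Sizing the quota from observed bursts means a grant is likely to expire near a synchronization point rather than in the middle of a burst, which is the intuition behind tying time slices to kernel bursts.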