# GaiaGPU: Sharing GPUs in container clouds

## Metadata

Presented in [ISPA/IUCC/BDCloud/SocialCom/SustainCom 2018](https://ieeexplore.ieee.org/document/8672318/).

Authors: Jing Gu, Shengbo Song, Ying Li, Hanmei Luo

### Code

* vcuda-controller: <https://github.com/tkestack/vcuda-controller>
* GPU admission: <https://github.com/tkestack/gpu-admission>
* GPU Manager: <https://github.com/tkestack/gpu-manager>

## Understanding the paper

### TL;DR

This paper presents **GaiaGPU**, an approach for **sharing GPU memory and computing resources** among containers, providing a solution for GPU sharing in container clouds.

### Technical details

![The architecture of GaiaGPU](https://819228986-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MkzeiawY8SkBarQBDVm-659326392%2Fuploads%2Fgit-blob-1f9ce8dda0020791c00b6627c7cffd32e1807178%2Fgaiagpu-architecture.png?alt=media)

* **The vGPU Library**, which runs inside the container, manages the container's GPU resources.
  * It intercepts the memory-related and computing-related APIs of the CUDA Driver Library via the `LD_LIBRARY_PATH` mechanism; 12 CUDA Driver APIs are intercepted in total.

![The intercepted CUDA Driver APIs](https://819228986-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MkzeiawY8SkBarQBDVm-659326392%2Fuploads%2Fgit-blob-cdfc5b5963a9dd994ccba1fb346be1afa0743d7a%2Fgaiagpu-intercepted-cuda-driver-apis.png?alt=media)

* Two allocation methods are adopted to improve GPU utilization:
  1. **Elastic resource allocation** (soft limit): **temporarily** raises the container's computing-resource limit when the GPU has spare capacity. The maximum GPU utilization is a configurable parameter, defaulting to **90%**.
  2. **Dynamic resource allocation** (hard limit): **permanently** changes the container's resource allocation (memory and computing resources).

### Limitations

* The evaluation is rudimentary: only micro-benchmarks are reported.
* Elastic allocation of computing resources does not appear to be particularly stable in the reported results.
