# SoCC 2024

## Meta Info

Homepage: <https://acmsocc.org/2024/index.html>

Paper list: <https://acmsocc.org/2024/schedule.html>

### Acceptance Rate

30.1% (= 63 / 209)

## Papers

### Large Language Models (LLMs)

* LLM inference
  * Queue Management for SLO-Oriented Large Language Model Serving \[[Paper](https://dl.acm.org/doi/10.1145/3698038.3698523)]
    * UIUC & IBM Research
* LLM training
  * Distributed Training of Large Language Models on AWS Trainium \[[Paper](https://dl.acm.org/doi/10.1145/3698038.3698535)]
    * AWS

### Mixture of Experts (MoEs)

* MoE inference
  * MoEsaic: Shared Mixture of Experts \[[Paper](https://dl.acm.org/doi/10.1145/3698038.3698521)]
    * IBM Research

### GPU Sharing

* KACE: Kernel-Aware Colocation for Efficient GPU Spatial Sharing \[[Paper](https://dl.acm.org/doi/10.1145/3698038.3698555)]
  * Stony Brook University

### Serverless Computing

* On-demand and Parallel Checkpoint/Restore for GPU Applications \[[Paper](https://dl.acm.org/doi/10.1145/3698038.3698563)]
  * SJTU IPADS & Shanghai Artificial Intelligence Research Institute
  * **gCROP**: **G**PU **C**heckpoint/**R**estore made **O**n-demand and **P**arallel

### Resource Scheduling

* Scheduler for deep learning training workloads
  * Hops: Fine-grained heterogeneous sensing, efficient and fair Deep Learning cluster scheduling system \[[Paper](https://dl.acm.org/doi/10.1145/3698038.3698515)]
    * Anhui University & Institute of Artificial Intelligence, Hefei Comprehensive National Science Center

### Distributed Training

* Generative Adversarial Networks (GANs)
  * ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks \[[Paper](https://dl.acm.org/doi/10.1145/3698038.3698563)]
    * NUS
