# ASPLOS 2024

## Meta Info

Homepage: <https://asplos-conference.org/2024/>

## Papers

### LLM Inference

* SpotServe: Serving Generative Large Language Models on Preemptible Instances \[[Personal Notes](https://paper.lingyunyang.com/reading-notes/conference/asplos-2024/spotserve)] \[[Paper](https://arxiv.org/abs/2311.15566)] \[[Code](https://github.com/Hsword/SpotServe)]
  * CMU & PKU & CUHK
  * Distributed LLM serving system on preemptible/spot instances
  * Techniques
    * Dynamically adapt the LLM parallelization configuration
    * Minimize the cost of migrating instances for dynamic re-parallelization
    * Stateful inference recovery
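  The dynamic re-parallelization idea can be sketched as a small search: when preemptions change the number of live spot instances, enumerate (data, pipeline, tensor) parallel degrees that exactly use the surviving GPUs and pick the one with the best estimated throughput. This is a minimal illustrative sketch; the cost model and search below are placeholders, not SpotServe's actual algorithm.

  ```python
  from itertools import product

  def estimate_throughput(d, p, t, per_gpu_tput=100.0):
      # Toy cost model (placeholder): throughput scales with data
      # parallelism, with efficiency penalties for deeper pipeline
      # and wider tensor parallelism.
      pipeline_eff = 1.0 / (1.0 + 0.05 * (p - 1))
      tensor_eff = 1.0 / (1.0 + 0.10 * (t - 1))
      return d * per_gpu_tput * pipeline_eff * tensor_eff

  def best_config(num_gpus, max_degree=8):
      # Enumerate (data, pipeline, tensor) degrees whose product
      # matches the currently live instances, keep the best.
      best = None
      for d, p, t in product(range(1, max_degree + 1), repeat=3):
          if d * p * t != num_gpus:
              continue  # config must use exactly the live GPUs
          tput = estimate_throughput(d, p, t)
          if best is None or tput > best[1]:
              best = ((d, p, t), tput)
      return best

  # Re-plan when a preemption shrinks the cluster, e.g. 16 -> 12 GPUs.
  for n in (16, 12):
      cfg, tput = best_config(n)
      print(n, cfg, round(tput, 1))
  ```

  In the real system the migration cost between the old and new configuration also enters the objective, so the planner prefers configurations reachable with little parameter/state movement.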

### Model Serving

* Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling \[[Paper](https://doi.org/10.1145/3617232.3624849)]
  * UMass Amherst & Nokia Bell Labs
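  "Accuracy scaling" can be sketched as routing requests to cheaper, slightly less accurate model variants when offered load exceeds the capacity of the most accurate one. The variant table and policy below are illustrative assumptions, not Proteus's actual models or controller.

  ```python
  # Hypothetical variant table: (name, accuracy, throughput in req/s
  # per replica), sorted by accuracy, descending. Numbers are made up.
  VARIANTS = [
      ("resnet152", 0.78, 50),
      ("resnet50",  0.76, 120),
      ("resnet18",  0.70, 300),
  ]

  def choose_variant(offered_load, replicas=1):
      # Pick the most accurate variant whose capacity covers the load;
      # fall back to the fastest variant if none suffices.
      for name, acc, tput in VARIANTS:
          if tput * replicas >= offered_load:
              return name
      return VARIANTS[-1][0]

  print(choose_variant(40))    # low load: serve the most accurate model
  print(choose_variant(200))   # high load: trade accuracy for throughput
  ```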

### Elastic Training

* Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters
  * UMacau
