ASPLOS 2024
Meta Info
Homepage: https://asplos-conference.org/2024/
Papers
LLM Inference
SpotServe: Serving Generative Large Language Models on Preemptible Instances [Personal Notes] [Paper] [Code]
CMU & PKU & CUHK
Distributed LLM serving system on preemptible/spot instances
Techniques
Dynamically adapts the LLM parallelization configuration as the pool of available spot instances changes
Minimizes the cost of migrating instances during dynamic re-parallelization
Stateful inference recovery, resuming in-flight requests instead of recomputing them from scratch
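The technique bullets above can be sketched as a toy re-planner: when preemptions change the number of live GPUs, pick new (data, tensor, pipeline) parallel degrees. This is an illustrative assumption, not SpotServe's actual algorithm; `choose_config`, the memory figures, and the scoring heuristic are all hypothetical.

```python
from itertools import product

# Toy sketch of dynamic re-parallelization (illustrative only, not
# SpotServe's actual algorithm). The memory sizes and the scoring
# heuristic below are hypothetical.
def choose_config(num_gpus: int, num_layers: int,
                  model_gb: float = 40.0, gpu_gb: float = 16.0):
    """Return the best-scoring (dp, tp, pp) degrees that fit, or None."""
    best, best_score = None, -1.0
    for dp, tp, pp in product(range(1, num_gpus + 1), repeat=3):
        if dp * tp * pp > num_gpus:
            continue                      # not enough GPUs for this config
        if num_layers % pp != 0:
            continue                      # stages must divide layers evenly
        if model_gb / (tp * pp) > gpu_gb:
            continue                      # weight shard must fit on one GPU
        # Heuristic: reward GPUs used, penalize communication-heavy degrees.
        score = dp * tp * pp / (1 + 0.10 * (tp - 1) + 0.06 * (pp - 1))
        if score > best_score:
            best, best_score = (dp, tp, pp), score
    return best

# E.g. re-plan after a preemption shrinks the pool from 8 GPUs to 6:
before = choose_config(8, 24)
after = choose_config(6, 24)
```

A real system would also weigh the migration cost between the old and new configuration, which is the second bullet above; this sketch only re-plans from scratch.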
Model Serving
Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling [Paper]
UMass-Amherst & Nokia Bell Labs
Elastic Training
Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters
UMacau