ICML 2024
Meta Info
Homepage: https://icml.cc/Conferences/2024
Papers
Large Language Models (LLMs)
- Serving LLMs
  - HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment [Personal Notes] [arXiv] [Code]
    - HKUST & ETH & CMU
    - Support asymmetric tensor model parallelism and pipeline parallelism in heterogeneous settings (i.e., each pipeline-parallel stage can be assigned a different number of layers and a different tensor model parallel degree).
    - Propose a heuristic-based evolutionary algorithm to search for the optimal layout.
  - APIServe: Efficient API Support for Large-Language Model Inferencing [arXiv]
    - UCSD
- Speculative decoding
  - Online Speculative Decoding [arXiv]
    - UC Berkeley & UCSD & Sisu Data & SJTU
- Video generation 