ICML 2024
Meta Info
Homepage: https://icml.cc/Conferences/2024
Papers
Large Language Models (LLMs)
Serving LLMs
HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment [Personal Notes] [arXiv] [Code]
HKUST & ETH & CMU
Support asymmetric tensor model parallelism and pipeline parallelism under the heterogeneous setting (i.e., each pipeline stage can be assigned a different number of layers and a different tensor model parallel degree).
Propose a heuristic-based evolutionary algorithm to search for the optimal parallelism layout (see the sketch below).
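A minimal sketch of the kind of asymmetric layout and evolutionary search described above. The GPU speeds, the cost model, the mutation rules, and all names below are hypothetical illustrations for intuition, not HexGen's actual implementation.

```python
# Toy illustration: search over asymmetric (layers, TP degree) layouts
# with a simple evolutionary loop. Everything here is hypothetical.
import random
from dataclasses import dataclass

GPU_SPEED = [1.0, 0.6, 0.4]   # hypothetical relative throughput of the
                              # heterogeneous device group hosting each stage
TOTAL_LAYERS = 32             # layers of the model to place
NUM_STAGES = len(GPU_SPEED)

@dataclass
class Stage:
    num_layers: int   # layers assigned to this pipeline stage
    tp_degree: int    # tensor model parallel degree within the stage

def random_layout() -> list[Stage]:
    """Sample a layout: each stage gets its own layer count and TP degree."""
    cuts = sorted(random.sample(range(1, TOTAL_LAYERS), NUM_STAGES - 1))
    sizes = [b - a for a, b in zip([0] + cuts, cuts + [TOTAL_LAYERS])]
    return [Stage(s, random.choice([1, 2, 4])) for s in sizes]

def cost(layout: list[Stage]) -> float:
    """Toy cost: throughput is bounded by the slowest pipeline stage; TP
    speeds a stage up but pays a small communication penalty."""
    per_stage = [
        st.num_layers / (st.tp_degree * GPU_SPEED[i]) + 0.2 * (st.tp_degree - 1)
        for i, st in enumerate(layout)
    ]
    return max(per_stage)

def mutate(layout: list[Stage]) -> list[Stage]:
    """Perturb one stage: move a layer to its neighbor or change its TP degree."""
    new = [Stage(st.num_layers, st.tp_degree) for st in layout]
    i = random.randrange(NUM_STAGES)
    if random.random() < 0.5:
        j = (i + 1) % NUM_STAGES
        if new[i].num_layers > 1:
            new[i].num_layers -= 1
            new[j].num_layers += 1
    else:
        new[i].tp_degree = random.choice([1, 2, 4])
    return new

def evolve(generations: int = 200, population: int = 16) -> list[Stage]:
    """Keep the fittest half of the population and refill it with mutants."""
    pool = [random_layout() for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=cost)
        survivors = pool[: population // 2]
        pool = survivors + [mutate(random.choice(survivors))
                            for _ in range(population - len(survivors))]
    return min(pool, key=cost)

if __name__ == "__main__":
    best = evolve()
    print([(st.num_layers, st.tp_degree) for st in best], round(cost(best), 2))
```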
APIServe: Efficient API Support for Large-Language Model Inferencing [arXiv]
UCSD
Speculative decoding
Online Speculative Decoding [arXiv]
UC Berkeley & UCSD & Sisu Data & SJTU
Video generation