ICML 2024

Meta Info

Homepage: https://icml.cc/Conferences/2024

Papers

Large Language Models (LLMs)

  • Serving LLMs

    • HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment [Personal Notes] [arXiv] [Code]

      • HKUST & ETH & CMU

        • Support asymmetric tensor model parallelism and pipeline parallelism in heterogeneous settings (i.e., each pipeline-parallel stage can be assigned a different number of layers and a different tensor-model-parallel degree).

          • Propose a heuristic-based evolutionary algorithm to search for the optimal layout.

    • MuxServe: Flexible Spatial-Temporal Multiplexing for LLM Serving [arXiv] [Code]

      • CUHK & Shanghai AI Lab & HUST & SJTU & PKU & UC Berkeley & UCSD

      • Colocate LLMs considering their popularity to multiplex memory resources.

    • APIServe: Efficient API Support for Large-Language Model Inferencing [arXiv]

      • UCSD

  • Benchmark

    • Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference [arXiv] [Demo]

      • UC Berkeley

  • Speculative decoding

    • Online Speculative Decoding [arXiv]

      • UC Berkeley & UCSD & Sisu Data & SJTU

  • Video generation

    • VideoPoet: A Large Language Model for Zero-Shot Video Generation [Paper] [Homepage]

      • Google & CMU

      • Employ a decoder-only transformer architecture that processes multimodal inputs – including images, videos, text, and audio.

      • The pre-trained LLM is adapted to a range of video generation tasks.

  • Image retrieval

    • MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions [Paper] [Homepage] [Code]

      • OSU & Google DeepMind

      • Enable multimodality-to-image, image-to-image, and text-to-image retrieval.
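
Illustrative sketch for the HexGen entry above: the note says each pipeline stage may have its own layer count and tensor-model-parallel degree, and that the layout is found by an evolutionary search. The Python below is a minimal toy version of that idea, not the paper's algorithm. The `Stage` class, the `gpu_speed` field, the cost model in `bottleneck_cost`, and the mutation rule are all assumptions made for illustration. For brevity, the search only moves layers between stages and leaves the tensor-parallel degrees fixed.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Stage:
    num_layers: int   # transformer layers assigned to this pipeline stage
    tp_degree: int    # tensor-model-parallel degree (GPUs used by this stage)
    gpu_speed: float  # hypothetical relative throughput of each GPU in the stage


def bottleneck_cost(layout):
    """Toy cost model (an assumption, not from the paper): a stage's time
    grows with its layer count and shrinks with its aggregate compute; the
    pipeline is limited by its slowest stage."""
    return max(s.num_layers / (s.tp_degree * s.gpu_speed) for s in layout)


def mutate(layout, rng):
    """Move one layer from a random stage to another random stage.
    The total layer count is unchanged, and no stage drops below one layer."""
    src, dst = rng.sample(range(len(layout)), 2)
    if layout[src].num_layers <= 1:
        return list(layout)
    child = list(layout)
    child[src] = replace(layout[src], num_layers=layout[src].num_layers - 1)
    child[dst] = replace(layout[dst], num_layers=layout[dst].num_layers + 1)
    return child


def evolve(initial, generations=200, population=8, seed=0):
    """Elitist evolutionary loop: mutate random survivors, then keep the
    `population` cheapest layouts. The best layout seen is never discarded."""
    rng = random.Random(seed)
    pool = [list(initial)]
    for _ in range(generations):
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        pool = sorted(pool + children, key=bottleneck_cost)[:population]
    return pool[0]


if __name__ == "__main__":
    # Two asymmetric stages: four fast GPUs, then two slower GPUs.
    start = [Stage(16, 4, 1.0), Stage(16, 2, 0.5)]
    best = evolve(start)
    print(start, bottleneck_cost(start))
    print(best, bottleneck_cost(best))
```

Because the loop is elitist, the returned layout is never worse than the starting one under the toy cost model. A real search would also vary the tensor-parallel degrees and the GPU-to-stage assignment, and would use measured rather than assumed costs.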

