# ICML 2024

## Meta Info

Homepage: <https://icml.cc/Conferences/2024>

## Papers

### Large Language Models (LLMs)

* Serving LLMs
  * HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment \[[Personal Notes](/reading-notes/miscellaneous/arxiv/2023/hexgen.md)] \[[arXiv](https://arxiv.org/abs/2311.11514)] \[[Code](https://github.com/Relaxed-System-Lab/HexGen)]
    * HKUST & ETH & CMU
    * Support *asymmetric* tensor model parallelism and pipeline parallelism in the *heterogeneous* setting (i.e., each pipeline-parallel stage can be assigned a different number of layers and a different tensor-model-parallel degree).
      * Propose a *heuristic-based evolutionary algorithm* to search for the optimal layout.
  * MuxServe: Flexible Spatial-Temporal Multiplexing for LLM Serving \[[arXiv](https://arxiv.org/abs/2404.02015)] \[[Code](https://github.com/hao-ai-lab/MuxServe)]
    * CUHK & Shanghai AI Lab & HUST & SJTU & PKU & UC Berkeley & UCSD
    * Colocate LLMs considering their popularity to multiplex memory resources.
  * APIServe: Efficient API Support for Large-Language Model Inferencing \[[arXiv](https://arxiv.org/abs/2402.01869)]
    * UCSD
* Benchmark
  * Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference \[[arXiv](https://arxiv.org/abs/2403.04132)] \[[Demo](https://chat.lmsys.org)]
    * UC Berkeley
* Speculative decoding
  * Online Speculative Decoding \[[arXiv](https://arxiv.org/abs/2310.07177)]
    * UC Berkeley & UCSD & Sisu Data & SJTU
* Video generation
  * VideoPoet: A Large Language Model for Zero-Shot Video Generation \[[Paper](https://proceedings.mlr.press/v235/kondratyuk24a.html)] \[[Homepage](https://sites.research.google/videopoet/)]
    * Google & CMU
    * Employ a decoder-only transformer architecture that processes multimodal inputs, including images, videos, text, and audio.
    * The pre-trained LLM is adapted to a range of video generation tasks.
* Image retrieval
  * MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions \[[Paper](https://proceedings.mlr.press/v235/zhang24an.html)] \[[Homepage](https://open-vision-language.github.io/MagicLens/)] \[[Code](https://github.com/google-deepmind/magiclens)]
    * OSU & Google DeepMind
    * Enable multimodality-to-image, image-to-image, and text-to-image retrieval.
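
To make the HexGen note above more concrete, here is a minimal sketch of what an evolutionary search over asymmetric layouts could look like. It is not HexGen's actual algorithm: the `Stage` type, the toy cost model in `stage_time`, the per-group speeds, the communication penalty, and the mutation rules are all assumptions chosen for illustration. Each pipeline stage gets its own layer count and tensor-parallel degree, and the search keeps the layouts whose slowest stage is fastest.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    num_layers: int  # layers assigned to this pipeline stage
    tp_degree: int   # tensor-parallel degree used by this stage


def stage_time(stage, gpu_speed):
    # Hypothetical cost model: compute scales with layers and is split across
    # the tensor-parallel group, plus a made-up penalty per extra TP rank.
    compute = stage.num_layers / (gpu_speed * stage.tp_degree)
    return compute + 0.05 * (stage.tp_degree - 1)


def layout_cost(layout, gpu_speeds):
    # A pipeline is bounded by its slowest stage.
    return max(stage_time(s, v) for s, v in zip(layout, gpu_speeds))


def mutate(layout, max_tp, rng):
    stages = list(layout)
    if rng.random() < 0.5 and len(stages) > 1:
        # Move one layer between two stages (total layer count is preserved).
        src, dst = rng.sample(range(len(stages)), 2)
        if stages[src].num_layers > 1:
            stages[src] = Stage(stages[src].num_layers - 1, stages[src].tp_degree)
            stages[dst] = Stage(stages[dst].num_layers + 1, stages[dst].tp_degree)
    else:
        # Re-draw the tensor-parallel degree of one stage within its GPU budget.
        i = rng.randrange(len(stages))
        options = [d for d in (1, 2, 4, 8) if d <= max_tp[i]]
        stages[i] = Stage(stages[i].num_layers, rng.choice(options))
    return tuple(stages)


def search(total_layers, gpu_speeds, max_tp, generations=200, pop_size=16, seed=0):
    # Assumes total_layers >= number of stages so every stage holds a layer.
    rng = random.Random(seed)
    n = len(gpu_speeds)
    base, extra = divmod(total_layers, n)
    start = tuple(Stage(base + (1 if i < extra else 0), 1) for i in range(n))
    population = [start] * pop_size
    for _ in range(generations):
        children = [mutate(p, max_tp, rng) for p in population]
        # Elitist selection: parents compete with children, best layouts survive.
        ranked = sorted(set(population + children),
                        key=lambda layout: layout_cost(layout, gpu_speeds))
        population = (ranked * pop_size)[:pop_size]
    return population[0]


if __name__ == "__main__":
    # Three hypothetical GPU groups of increasing speed and size.
    best = search(32, gpu_speeds=[1.0, 2.0, 4.0], max_tp=[1, 2, 4])
    for i, stage in enumerate(best):
        print(f"stage {i}: layers={stage.num_layers}, tp={stage.tp_degree}")
```

Because parents stay in the candidate pool, the best cost never gets worse from one generation to the next. A real system would replace the toy cost model with profiled compute and communication times.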

## References

* [Google DeepMind at ICML 2024, 2024/07/19](https://deepmind.google/discover/blog/google-deepmind-at-icml-2024/)

