EuroSys 2025

Meta Info

Homepage: https://2025.eurosys.org

Paper list: https://2025.eurosys.org/accepted-papers.html

Papers

Large Language Models (LLMs)

  • LLM inference

    • Fast State Restoration in LLM Serving with HCache

      • THU

    • Stateful Large Language Model Serving with Pensieve

      • NYU

    • CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion

      • CUHK-Shenzhen & UChicago & Stanford

    • T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

      • USTC & MSRA

  • LLM fine-tuning

    • HybridFlow: A Flexible and Efficient RLHF Framework

      • HKU & ByteDance

Distributed Training

  • JABAS: Joint Adaptive Batching and Automatic Scaling for DNN Training on Heterogeneous GPUs

    • UNIST & Samsung

  • FlowCheck: Decoupling Checkpointing and Training of Large-Scale Models

    • SJTU & Alibaba Cloud

ML Inference

  • A House United Within Itself: SLO-Awareness for On-Premises Containerized ML Inference Clusters via Faro

    • UIUC & IBM Research

Serverless Computing

  • Serverless Cold Starts and Where to Find Them

    • Huawei

GPU Sharing

  • Improving GPU Sharing Performance through Adaptive Bubbleless Spatial-Temporal Sharing

    • SJTU & Microsoft & Alibaba

  • Multiplexing Dynamic Deep Learning Workloads with SLO-awareness in GPU Clusters

    • University of Macau & SIAT, CAS

Acronyms

  • RLHF: Reinforcement Learning from Human Feedback

  • RAG: Retrieval-Augmented Generation

  • SLO: Service Level Objective

  • DNN: Deep Neural Network

  • ML: Machine Learning
