EuroSys 2025

Meta Info

Homepage: https://2025.eurosys.org

Paper list: https://2025.eurosys.org/accepted-papers.html

Acceptance Rate

  • Fall: 8.2% (= 30 / 367)

  • Spring: 10.5% (= 42 / ?)

Papers

Large Language Models (LLMs)

  • LLM Training

    • Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization

      • UofT

    • MEPipe: Democratizing LLM Training with Memory-Efficient Slice-Level Pipeline Scheduling on Cost-Effective Accelerators

      • THU & Zhipu AI

  • LLM Inference

    • Fast State Restoration in LLM Serving with HCache

      • THU

    • Stateful Large Language Model Serving with Pensieve

      • NYU

    • CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion

      • CUHK-Shenzhen & UChicago & Stanford

    • T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

      • USTC & MSRA

    • DeltaZip: Efficient Serving of Multiple Full-Model-Tuned LLMs

      • ETH & MIT

    • SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs

      • HKUST-GZ

  • LLM Fine-Tuning

    • HybridFlow: A Flexible and Efficient RLHF Framework

      • HKU & ByteDance

  • Mixture-of-Experts (MoEs)

    • Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores

      • SJTU

Distributed Training

  • JABAS: Joint Adaptive Batching and Automatic Scaling for DNN Training on Heterogeneous GPUs

    • UNIST & Samsung

  • FlowCheck: Decoupling Checkpointing and Training of Large-Scale Models

    • SJTU & Alibaba Cloud

  • Comprehensive Deadlock Prevention for GPU Collective Communication

    • PKU & OneFlow

Model Serving

  • A House United Within Itself: SLO-Awareness for On-Premises Containerized ML Inference Clusters via Faro

    • UIUC & IBM Research

  • SpotHedge: Serving AI Models on Spot Instances

    • UC Berkeley

Deep Learning Compilation

  • SpaceFusion: Advanced Deep Learning Operator Fusion via Space-Mapping Graph

    • SJTU

Resource Management

  • Scheduling

    • Towards VM Rescheduling Optimization Through Deep Reinforcement Learning

      • UC Merced & UC Berkeley & ByteDance

    • Eva: Cost-Efficient Cloud-Based Cluster Scheduling

      • UW-Madison

  • Serverless Computing

    • Serverless Cold Starts and Where to Find Them

      • Huawei

    • SeBS-Flow: Benchmarking Serverless Cloud Function Workflows

      • Karlsruhe Institute of Technology & ETH

    • AlloyStack: A Library Operating System for Serverless Workflow Applications

      • TJU & THU

  • GPU Sharing

    • Improving GPU Sharing Performance through Adaptive Bubbleless Spatial-Temporal Sharing

      • SJTU & Microsoft & Alibaba

    • Multiplexing Dynamic Deep Learning Workloads with SLO-awareness in GPU Clusters

      • University of Macau & SIAT, CAS

Acronyms

  • RLHF: Reinforcement Learning from Human Feedback

  • ML: Machine Learning

  • RAG: Retrieval-Augmented Generation

  • SLO: Service Level Objective
