EuroSys 2025
Meta Info
Homepage: https://2025.eurosys.org
Paper list: https://2025.eurosys.org/accepted-papers.html
Acceptance Rate
Fall: 8.2% (= 30 / 367)
Spring: 10.5% (= 42 / ?)
Papers
Large Language Models (LLMs)
LLM Training
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
UofT
MEPipe: Democratizing LLM Training with Memory-Efficient Slice-Level Pipeline Scheduling on Cost-Effective Accelerators
THU & Zhipu AI
LLM Inference
Fast State Restoration in LLM Serving with HCache
THU
Stateful Large Language Model Serving with Pensieve
NYU
CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
CUHK-Shenzhen & UChicago & Stanford
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
USTC & MSRA
DeltaZip: Efficient Serving of Multiple Full-Model-Tuned LLMs
ETH & MIT
SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs
HKUST-GZ
LLM Fine-Tuning
HybridFlow: A Flexible and Efficient RLHF Framework
HKU & ByteDance
Mixture-of-Experts (MoEs)
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
SJTU
Distributed Training
JABAS: Joint Adaptive Batching and Automatic Scaling for DNN Training on Heterogeneous GPUs
UNIST & Samsung
FlowCheck: Decoupling Checkpointing and Training of Large-Scale Models
SJTU & Alibaba Cloud
Comprehensive Deadlock Prevention for GPU Collective Communication
PKU & OneFlow
Model Serving
A House United Within Itself: SLO-Awareness for On-Premises Containerized ML Inference Clusters via Faro
UIUC & IBM Research
SpotHedge: Serving AI Models on Spot Instances
UC Berkeley
Deep Learning Compilation
SpaceFusion: Advanced Deep Learning Operator Fusion via Space-Mapping Graph
SJTU
Resource Management
Scheduling
Towards VM Rescheduling Optimization Through Deep Reinforcement Learning
UC Merced & UC Berkeley & ByteDance
Eva: Cost-Efficient Cloud-Based Cluster Scheduling
UW-Madison
Serverless Computing
Serverless Cold Starts and Where to Find Them
Huawei
SeBS-Flow: Benchmarking Serverless Cloud Function Workflows
Karlsruhe Institute of Technology & ETH
AlloyStack: A Library Operating System for Serverless Workflow Applications
TJU & THU
GPU Sharing
Improving GPU Sharing Performance through Adaptive Bubbleless Spatial-Temporal Sharing
SJTU & Microsoft & Alibaba
Multiplexing Dynamic Deep Learning Workloads with SLO-awareness in GPU Clusters
University of Macau & SIAT, CAS
Acronyms
RLHF: Reinforcement Learning from Human Feedback
ML: Machine Learning