EuroSys 2025

Meta Info

Overall acceptance rate: 12.2% (= 85 / 696)
- Total: 85 (= 44 + 41)
- 11 revised papers from EuroSys'25 Fall
Fall: 8.2% (= 30 / 367)
- 14 revised papers from EuroSys'25 Spring
- Total: 44 (= 30 + 14)
Spring: 9.7% (= 32 / 329)
- 9 revised papers from EuroSys'24 Fall
- Total: 41 (= 32 + 9)
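
The figures above can be re-derived with a few lines of Python. This is only an arithmetic check on the numbers listed here; the dictionary layout and the `pct` helper are illustrative names, not anything published by the conference. Note that, as listed, the per-round rates count only newly accepted papers (30 and 32), while the overall rate uses the totals that include revised papers (44 + 41).

```python
# Figures copied from the list above; field names are illustrative only.
rounds = {
    "Fall": {"submitted": 367, "accepted": 30, "revised": 14},
    "Spring": {"submitted": 329, "accepted": 32, "revised": 9},
}


def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Papers per round = newly accepted + revised papers carried over.
totals = {name: r["accepted"] + r["revised"] for name, r in rounds.items()}
total_papers = sum(totals.values())
total_submitted = sum(r["submitted"] for r in rounds.values())

# Per-round rate uses newly accepted papers only, as in the list above.
round_rates = {name: pct(r["accepted"], r["submitted"]) for name, r in rounds.items()}
# Overall rate uses all papers in the program, revised ones included.
overall_rate = pct(total_papers, total_submitted)

print(totals, total_papers, total_submitted)
print(round_rates, overall_rate)
```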

LLM Training
- Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
  - UofT
- MEPipe: Democratizing LLM Training with Memory-Efficient Slice-Level Pipeline Scheduling on Cost-Effective Accelerators
  - THU & Zhipu AI
LLM Inference
- Fast State Restoration in LLM Serving with HCache
  - THU
- Stateful Large Language Model Serving with Pensieve
  - NYU
- CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
  - CUHK-Shenzhen & UChicago & Stanford
  - Best Paper Award (Spring)
- T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
  - USTC & MSRA
- DeltaZip: Efficient Serving of Multiple Full-Model-Tuned LLMs
  - ETH & MIT
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs
  - HKUST-GZ
  - Best Paper Award (Fall)
LLM Fine-Tuning
- HybridFlow: A Flexible and Efficient RLHF Framework
  - HKU & ByteDance
Mixture-of-Experts (MoE)
- Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
  - SJTU

JABAS: Joint Adaptive Batching and Automatic Scaling for DNN Training on Heterogeneous GPUs
- UNIST & Samsung
FlowCheck: Decoupling Checkpointing and Training of Large-Scale Models
- SJTU & Alibaba Cloud
Comprehensive Deadlock Prevention for GPU Collective Communication
- PKU & OneFlow

A House United Within Itself: SLO-Awareness for On-Premises Containerized ML Inference Clusters via Faro
- UIUC & IBM Research
SkyServe: Serving AI Models across Regions and Clouds with Spot Instances
- UC Berkeley
- Manage a mixture of spot and on-demand replicas across regions and clouds.
- Improve availability and reduce correlated preemptions; overprovision cheap spot replicas.
- Baselines: AWS Auto-scaling Group (ASG), MArk [ATC'19], AWS spot node pool (AWSSpot), SpotServe [ASPLOS'24]
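
To make the notes above concrete, here is a toy sketch of the general idea: spread spot replicas across zones, launch a few extra as a buffer, and fall back to on-demand for any shortfall. It is not SkyServe's actual policy. The function `plan_replicas`, its parameters, and the zone names are all hypothetical and chosen only for illustration.

```python
def plan_replicas(target, spot_available, extra_spot=1):
    """Toy placement: spread spot replicas across zones, fill gaps with on-demand.

    target:         replicas needed to serve the load
    spot_available: dict mapping zone name -> obtainable spot instances
    extra_spot:     spot replicas launched beyond target as a preemption buffer
    """
    remaining = dict(spot_available)
    placement = {zone: 0 for zone in remaining}
    wanted = target + extra_spot
    placed = 0

    # Round-robin over zones so that a preemption wave in a single zone
    # takes out as few replicas as possible.
    while placed < wanted and any(remaining.values()):
        for zone in remaining:
            if placed == wanted:
                break
            if remaining[zone] > 0:
                remaining[zone] -= 1
                placement[zone] += 1
                placed += 1

    # On-demand replicas cover only the shortfall against the real target,
    # not the overprovisioned buffer.
    on_demand = max(0, target - placed)
    return placement, on_demand


# Hypothetical zones: one has plenty of spot capacity, one has none.
print(plan_replicas(4, {"zone-a": 2, "zone-b": 5, "zone-c": 0}))
# Spot capacity is scarce everywhere, so on-demand fills the gap.
print(plan_replicas(4, {"zone-a": 1, "zone-b": 1}))
```

A real system would also have to weigh price, preemption history, and cold-start latency; this sketch ignores all of these.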

SpaceFusion: Advanced Deep Learning Operator Fusion via Space-Mapping Graph
- SJTU

Scheduling
- Towards VM Rescheduling Optimization Through Deep Reinforcement Learning
  - UC Merced & UC Berkeley & ByteDance
- Eva: Cost-Efficient Cloud-Based Cluster Scheduling
  - UW-Madison
Serverless Computing
- Serverless Cold Starts and Where to Find Them
  - Huawei
- SeBS-Flow: Benchmarking Serverless Cloud Function Workflows
  - Karlsruhe Institute of Technology & ETH
- AlloyStack: A Library Operating System for Serverless Workflow Applications
  - TJU & THU
GPU Sharing
- Improving GPU Sharing Performance through Adaptive Bubbleless Spatial-Temporal Sharing
  - SJTU & Microsoft & Alibaba
- Multiplexing Dynamic Deep Learning Workloads with SLO-awareness in GPU Clusters
  - University of Macau & SIAT, CAS
