OSDI 2025
Meta Info
Homepage: https://www.usenix.org/conference/osdi25
Acceptance Rate
14.7% (= 48 / 327)
Papers
Large Language Models (LLMs)
LLM Training
WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training
UCSD
LLM Inference
Fast and Live Model Auto Scaling with O(1) Host Caching
SJTU IPADS
Deep Learning Compilation
KPerfIR: Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance Tooling on Modern AI Workloads
UCSD
GPU Sharing
Preemptive Scheduling for Diverse XPUs using Multi-level Hardware Model [Paper] [Code] [Slides]
SJTU IPADS
XQueue: An XPU task is abstracted as a sequence of commands executed on a command queue.
Multi-level hardware model (a minimal sketch follows this list)
Level-1: Preempt pending commands (block host CPU from launching new commands, no hardware requirements)
Level-2: Preempt in-flight commands (e.g., instruct the μ-controllers to stall command dispatching, leverage command programmability)
Level-3: Preempt running commands
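The sketch below illustrates how the three preemption levels could layer on top of the XQueue abstraction described above. It is a toy model, not the paper's implementation; all names (XQueue, PreemptLevel, host_launch) are hypothetical, and Level-3 is left as a stub since it depends on vendor-specific hardware support.

```python
# Toy model of multi-level preemption over a command-queue (XQueue) abstraction.
# Hypothetical names; illustrative only.
from collections import deque
from enum import IntEnum


class PreemptLevel(IntEnum):
    PENDING = 1    # Level-1: block the host from launching queued commands
    IN_FLIGHT = 2  # Level-2: stall the device-side dispatcher
    RUNNING = 3    # Level-3: interrupt commands already executing


class XQueue:
    """An XPU task modeled as a sequence of commands on a command queue."""

    def __init__(self, name):
        self.name = name
        self.pending = deque()      # submitted, not yet launched by the host
        self.in_flight = deque()    # launched, not yet dispatched on the device
        self.launch_blocked = False
        self.dispatch_stalled = False

    def submit(self, cmd):
        self.pending.append(cmd)

    def host_launch(self, max_launch=None):
        """Host-side launch loop: move pending commands toward the device."""
        launched = 0
        while self.pending and not self.launch_blocked:
            if max_launch is not None and launched >= max_launch:
                break
            self.in_flight.append(self.pending.popleft())
            launched += 1
        return launched

    def preempt(self, level):
        # Level-1: cheap, no hardware requirements.
        self.launch_blocked = True
        if level >= PreemptLevel.IN_FLIGHT:
            # Level-2: e.g., ask the on-device microcontroller to stop
            # dispatching commands that were already launched.
            self.dispatch_stalled = True
        if level >= PreemptLevel.RUNNING:
            # Level-3: would interrupt running commands; omitted here because
            # it depends on vendor-specific hardware features.
            pass

    def resume(self):
        self.launch_blocked = False
        self.dispatch_stalled = False


if __name__ == "__main__":
    q = XQueue("low-priority-task")
    for i in range(4):
        q.submit(f"kernel_{i}")
    q.host_launch(max_launch=2)        # kernel_0, kernel_1 become in-flight
    q.preempt(PreemptLevel.IN_FLIGHT)  # stop launching and stall device dispatch
    q.host_launch()                    # blocked: kernel_2, kernel_3 stay pending
    print(list(q.in_flight), list(q.pending), q.dispatch_stalled)
```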
GPU Communication
Enabling Efficient GPU Communication over Multiple NICs with FuseLink [Paper]
HKUST iSING Lab
Integrate high-speed intra-server links as critical extensions of the inter-server network.
Implemented as an independent networking module that replaces the default InfiniBand networking in NCCL.
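The toy sketch below illustrates the core idea as summarized above: treat fast intra-server links (e.g., NVLink) as extensions of the inter-server network, so a GPU can send through a peer GPU's less-loaded NIC instead of only its own. The class names and the load-based scoring rule are illustrative assumptions, not FuseLink's actual path-selection policy.

```python
# Toy path selection across multiple NICs, relaying over NVLink when a peer
# GPU's NIC is less loaded. Illustrative assumptions only.
from dataclasses import dataclass, field


@dataclass
class Nic:
    name: str
    bandwidth_gbps: float
    queued_bytes: int = 0  # outstanding traffic already queued on this NIC

    def est_finish_time(self, msg_bytes):
        # Seconds to drain the queued traffic plus this message.
        return (self.queued_bytes + msg_bytes) * 8 / (self.bandwidth_gbps * 1e9)


@dataclass
class Gpu:
    name: str
    local_nic: Nic
    nvlink_peers: list = field(default_factory=list)  # other GPUs in the server


def pick_path(src: Gpu, msg_bytes: int):
    """Choose the NIC (local, or reached via NVLink) with the earliest finish time."""
    candidates = [(src.local_nic, f"direct via {src.local_nic.name}")]
    for peer in src.nvlink_peers:
        # Assumes the NVLink hop is cheap relative to the NIC transfer.
        candidates.append((peer.local_nic, f"relay via NVLink to {peer.name}"))
    nic, route = min(candidates, key=lambda c: c[0].est_finish_time(msg_bytes))
    nic.queued_bytes += msg_bytes
    return route


if __name__ == "__main__":
    nic0 = Nic("mlx5_0", bandwidth_gbps=400, queued_bytes=500_000_000)  # busy
    nic1 = Nic("mlx5_1", bandwidth_gbps=400)                            # idle
    gpu1 = Gpu("gpu1", nic1)
    gpu0 = Gpu("gpu0", nic0, nvlink_peers=[gpu1])
    print(pick_path(gpu0, 100_000_000))  # expected: relay via NVLink to gpu1
```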
Resource Allocation
Decouple and Decompose: Scaling Resource Allocation through a Different Lens
Harvard
Memory Translation
EMT: An OS Framework for New Memory Translation Architectures
UIUC
Vector Search
Quake: Adaptive Indexing for Vector Search
UW-Madison
File Systems
Fast and Synchronous Crash Consistency with Metadata Write-Once File System
HIT-SZ
Databases
Tigon: A Distributed Database for a CXL Pod
UT-Austin
Replicated State Machines (RSMs)
Picsou: Enabling Efficient Cross-Consensus Communication [arXiv]
UC Berkeley