ATC 2025

Meta Info

Homepage: https://www.usenix.org/conference/atc25

Acceptance Rate

15.8% (= 100 / 634)

Papers

Large Language Models (LLMs)

  • KV Cache Management

    • KVCache Cache in the Wild: Characterizing and Optimizing KVCache Cache at a Large Cloud Provider [Paper] [Trace]

      • SJTU IPADS & Alibaba Cloud

      • Key takeaways from the characterization study

        • KV$ reuse is common, but the reuse ratio is lower than the numbers previously reported on synthetic datasets.

        • Within each request category, the reuse time is predictable from historical information.

        • The lifespan of KV$ is ephemeral.
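
The takeaways above suggest that an eviction policy could use per-category reuse history. The following is a minimal sketch under that assumption, not the paper's actual policy: `CategoryReusePredictor`, `pick_victim`, the median estimator, and the default interval are all hypothetical names and choices made for illustration.

```python
from statistics import median


class CategoryReusePredictor:
    """Hypothetical helper: remembers observed reuse intervals per request category."""

    def __init__(self, default_interval=60.0):
        # Fallback estimate (seconds) for categories with no history yet (assumed value).
        self.default_interval = default_interval
        self.intervals = {}

    def record(self, category, interval_s):
        """Store one observed gap (seconds) between two uses of a cached prefix."""
        self.intervals.setdefault(category, []).append(interval_s)

    def predict(self, category):
        """Estimate the next reuse gap as the median of past gaps for this category."""
        history = self.intervals.get(category)
        return median(history) if history else self.default_interval


def pick_victim(entries, predictor):
    """Choose the cache entry whose predicted next reuse is farthest in the future.

    entries maps a cache key to (category, last_access_time_s).
    """
    if not entries:
        raise ValueError("no cache entries to evict")

    def expected_next_use(key):
        category, last_access = entries[key]
        return last_access + predictor.predict(category)

    return max(entries, key=expected_next_use)
```

With history showing that one category is reused within seconds and another only after minutes, `pick_victim` evicts the entry of the slowly reused category first, even if both were last touched at the same time.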

GPU Sharing

  • Efficient Performance-Aware GPU Sharing with Compatibility and Isolation through Kernel Space Interception [Paper]

    • SJTU & Lenovo

    • Krypton

    • The hardware units are divided using MIG, while time slices and device memory are allocated by a kernel-space scheduler.

  • GPreempt: GPU Preemptive Scheduling Made General and Efficient [Paper] [Code]

    • THU

    • Implement a timeslice-based yield mechanism to enable context-switch preemption on GPUs.

    • Employ a hint-based pre-preemption technique to overlap the preemption process with the essential data-preparation phase.

  • Colocating ML Inference and Training with Fast GPU Memory Handover [Paper] [Code] [Slides]

    • SJTU IPADS

    • Key insight: training tasks are elastic and reconfigurable, so GPU memory can be transferred between training and inference by reconfiguring the training task (i.e., changing its batch size).
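
The batch-size idea in the last item can be illustrated with simple arithmetic. This is a minimal sketch assuming a linear memory model (a fixed footprint plus a per-sample cost); the model, the function names, and all numbers are hypothetical and not taken from the paper.

```python
def max_batch_size(budget_mib, fixed_mib, per_sample_mib):
    """Largest batch that fits in budget_mib under the assumed linear memory model."""
    if budget_mib < fixed_mib + per_sample_mib:
        return 0  # not even a batch of one sample fits
    return (budget_mib - fixed_mib) // per_sample_mib


def training_batch_after_handover(total_mib, inference_demand_mib,
                                  fixed_mib, per_sample_mib, preferred_batch):
    """Shrink (or restore) the training batch so inference gets the memory it asks for.

    A result of 0 means training has to pause until inference releases memory.
    """
    training_budget = max(0, total_mib - inference_demand_mib)
    fit = max_batch_size(training_budget, fixed_mib, per_sample_mib)
    return min(preferred_batch, fit)


# Example: a 40 GiB GPU shared by one training task and an inference service.
for demand in (10240, 30720, 38912):
    batch = training_batch_after_handover(
        total_mib=40960, inference_demand_mib=demand,
        fixed_mib=4096, per_sample_mib=512, preferred_batch=64)
    print(f"inference needs {demand} MiB -> training batch size {batch}")
```

Under these assumed numbers, growing inference demand shrinks the training batch step by step, and the batch can grow back toward `preferred_batch` once inference releases memory.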

POSIX Shell

  • The Koala Benchmarks for the Shell: Characterization and Implications [Paper] [Homepage] [Benchmark Suite]

    • Brown University

    • Best Paper Award

    • 14 sets of real-world shell programs from diverse domains ranging from CI/CD and AI/ML to biology and the humanities.
