
NSDI 2023

Meta Info

Homepage: https://www.usenix.org/conference/nsdi23

Paper list: https://www.usenix.org/conference/nsdi23/technical-sessions

Accepted Papers

Papers

Large Language Model (LLM)

  • Bamboo: Making Preemptible Instances Resilient for Affordable Training of Large DNNs [Paper] [Code]

    • UCLA & CMU & MSR & Princeton

    • Resilient distributed training

Model Serving

  • Shepherd: Serving DNNs in the Wild [Paper] [Personal Notes]

    • UWaterloo & Yale & UC Berkeley

    • Handles short-term workload unpredictability.

    • Aggregates request streams into moderately sized groups; leverages preemption and model-specific batching.

RDMA
