
Conference

2025

| Conference | When | Where | Remarks |
| --- | --- | --- | --- |
| ICML 2025 | Jul 13-19, 2025 | Vancouver Convention Center, Canada | Incoming |
| ATC 2025 | Jul 7-9, 2025 | Boston, MA, USA | Incoming; co-located with OSDI 2025 |
| OSDI 2025 | Jul 7-9, 2025 | Boston, MA, USA | Incoming; co-located with ATC 2025 |
| HotOS 2025 | May 14-16, 2025 | Banff, Alberta, Canada | 🧐 Incoming |
| MLSys 2025 | May 12-15, 2025 | Santa Clara Convention Center, CA, USA | 🧐 Incoming |
| NSDI 2025 | Apr 28-30, 2025 | Philadelphia, PA, USA | 🧐😎 |
| ASPLOS 2025 | Mar 30-Apr 3, 2025 | Rotterdam, Netherlands | 🧐 Co-located with EuroSys 2025 |
| EuroSys 2025 | Mar 30-Apr 3, 2025 | Rotterdam, Netherlands | 🧐 Co-located with ASPLOS 2025 |
| HPCA 2025 | Mar 1-5, 2025 | Las Vegas, NV, USA | 🧐 |
| PPoPP 2025 | Mar 1-5, 2025 | Las Vegas, NV, USA | 🧐 |

2024

| Conference | When | Where | Remarks |
| --- | --- | --- | --- |
| NeurIPS 2024 | Dec 10-15, 2024 | Vancouver Convention Center, Canada | 🧐 |
| SoCC 2024 | Nov 22-24, 2024 | Seattle, Washington, USA | 🧐 |
| HotNets 2024 | Nov 18-19, 2024 | Irvine, California, USA | 🧐 |
| SC 2024 | Nov 17-22, 2024 | Atlanta, GA, USA | 🧐 |
| SOSP 2024 | Nov 4-6, 2024 | Hilton Austin, Texas, USA | 🧐 |
| VLDB 2024 | Aug 26-30, 2024 | Guangzhou, China | 🧐 |
| SIGCOMM 2024 | Aug 4-8, 2024 | Sydney, Australia | 🧐 |
| ICML 2024 | Jul 21-27, 2024 | Messe Wien Exhibition Congress Center, Vienna, Austria | |
| ATC 2024 | Jul 10-12, 2024 | Santa Clara, CA, USA | 🧐 Co-located with OSDI 2024 |
| OSDI 2024 | Jul 10-12, 2024 | Santa Clara, CA, USA | 🧐 Co-located with ATC 2024 |
| ISCA 2024 | Jun 29-Jul 3, 2024 | Buenos Aires, Argentina | 🧐 |
| CVPR 2024 | Jun 17-21, 2024 | Seattle Convention Center, Seattle, WA, USA | 🧐 |
| MLSys 2024 | May 13-16, 2024 | Santa Clara Convention Center, USA | 🧐 |
| ASPLOS 2024 | Apr 27-May 1, 2024 | Hilton La Jolla Torrey Pines, San Diego, USA | 🧐 |
| EuroSys 2024 | Apr 23-26, 2024 | Athens, Greece | |
| NSDI 2024 | Apr 16-18, 2024 | Santa Clara, CA, USA | 🧐 |
| HPCA 2024 | Mar 2-6, 2024 | Edinburgh, Scotland, UK | |

2023

| Conference | When | Where | Remarks |
| --- | --- | --- | --- |
| NeurIPS 2023 | Dec 10-16, 2023 | New Orleans, Louisiana, USA | |
| SC 2023 | Nov 12-17, 2023 | Denver, Colorado, USA | 🧐 |
| SoCC 2023 | Oct 30-Nov 1, 2023 | Santa Cruz, California, USA | 🧐 |
| SOSP 2023 | Oct 23-26, 2023 | Koblenz, Germany | 🧐 |
| SIGCOMM 2023 | Sep 10-14, 2023 | New York, USA | |
| HotChips 2023 | Aug 27-29, 2023 | Hybrid, Stanford University | |
| ICML 2023 | Jul 23-29, 2023 | Hawaii Convention Center | |
| ATC 2023 | Jul 10-12, 2023 | Boston, MA, USA | 😎 Co-located with OSDI 2023 |
| OSDI 2023 | Jul 10-12, 2023 | Boston, MA, USA | 😎 Co-located with ATC 2023 |
| HotOS 2023 | Jun 22-24, 2023 | Providence, Rhode Island, USA | |
| SIGMOD 2023 | Jun 18-23, 2023 | Seattle, WA, USA | |
| ISCA 2023 | Jun 17-21, 2023 | Orlando, FL, USA | |
| MLSys 2023 | Jun 4-8, 2023 | Southern Florida, USA | |
| EuroSys 2023 | May 9-12, 2023 | Rome, Italy | |
| NSDI 2023 | Apr 17-19, 2023 | Boston, MA, USA | |
| ASPLOS 2023 | Mar 25-29, 2023 | Vancouver, Canada | |

2022

| Conference | When | Where | Remarks |
| --- | --- | --- | --- |
| SC 2022 | Nov 13-18, 2022 | Kay Bailey Hutchison Convention Center Dallas | |
| SoCC 2022 | Nov 7-11, 2022 | San Francisco, CA, USA | 👨‍💻 |
| SIGCOMM 2022 | Aug 22-26, 2022 | Amsterdam, Netherlands | |
| ATC 2022 | Jul 11-13, 2022 | Carlsbad, CA, USA | |
| OSDI 2022 | Jul 11-13, 2022 | Carlsbad, CA, USA | |
| IPDPS 2022 | May 30-Jun 3, 2022 | Virtual | |
| EuroSys 2022 | Apr 5-8, 2022 | Rennes, France | |
| NSDI 2022 | Apr 4-6, 2022 | Renton, WA, USA | |

2021

| Conference | When | Where | Remarks |
| --- | --- | --- | --- |
| SoCC 2021 | Nov 1-4, 2021 | Virtual | 👨‍💻 |
| SOSP 2021 | Oct 25-29, 2021 | Virtual | 👨‍💻 |
| ATC 2021 | Jul 14-16, 2021 | Virtual | |
| EuroSys 2021 | Apr 26-28, 2021 | Virtual | |
| NSDI 2021 | Apr 12-14, 2021 | Virtual | |

2020

| Conference | When | Where |
| --- | --- | --- |
| OSDI 2020 | Nov 4-6, 2020 | Virtual |
| ATC 2020 | Jul 15-17, 2020 | Virtual |
| EuroSys 2020 | Apr 27-30, 2020 | Virtual |
| ASPLOS 2020 | Mar 16-20, 2020 | Virtual |
| MLSys 2020 | Mar 2-4, 2020 | Austin, TX, USA |

2019

| Conference | When | Where |
| --- | --- | --- |
| CLUSTER 2019 | Sep 23-26, 2019 | Albuquerque, New Mexico, USA |
| EuroSys 2019 | Mar 25-28, 2019 | Dresden, Germany |
| NSDI 2019 | Feb 26-28, 2019 | Boston, MA, USA |

2018

| Conference | When | Where |
| --- | --- | --- |
| SIGCOMM 2018 | Aug 20-25, 2018 | Budapest, Hungary |

2017

| Conference | When | Where |
| --- | --- | --- |
| SoCC 2017 | Sep 25-27, 2017 | Santa Clara, California, USA |
| ASPLOS 2017 | Apr 8-12, 2017 | Xi'an, China |
| NSDI 2017 | Mar 27-29, 2017 | Boston, MA, USA |

Notes

  • 😎: In-person attendance.

  • 👨‍💻: Virtual attendance.

  • 🧐: Papers are organized.

Last updated 9 days ago