Diffusion Models

Diffusion Model Serving

  • PatchedServe: A Patch Management Framework for SLO-Optimized Hybrid Resolution Diffusion Serving (arXiv:2501.09253) [arXiv]

    • UWaterloo & CMU & Rice

    • Serve requests with hybrid (mixed) resolutions under SLO constraints, managed at patch granularity.

  • FlexCache: Flexible Approximate Cache System for Video Diffusion (arXiv:2501.04012) [arXiv]

    • UWaterloo

    • Approximate caching system for text-to-video diffusion models.

  • xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism (arXiv:2411.01738) [arXiv] [Code]

    • Tencent

    • Combines several parallelism approaches for DiT inference.

  • SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules (arXiv:2407.02031) [arXiv]

    • HKUST & Alibaba

  • PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models (arXiv:2405.14430) [arXiv] [Code]

    • Tencent & HKU

  • Cache Me if You Can: Accelerating Diffusion Models through Block Caching (CVPR 2024) [Paper] [Homepage]

    • Meta & TUM & MCML & Oxford

  • CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model (CVPR 2024) [Paper] [Code]

    • TJU & Tencent

  • DeepCache: Accelerating Diffusion Models for Free (CVPR 2024) [Paper] [Code]

    • NUS

  • DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models (CVPR 2024) [Paper] [Code]

    • MIT & Princeton & Lepton AI & NVIDIA

    • Split the model input into multiple patches and assign each patch to a GPU.

  • Approximate Caching for Efficiently Serving Text-to-Image Diffusion Models (NSDI 2024) [Paper] [Slides]

    • Adobe Research & UIUC

    • Skip a certain number of initial denoising steps by reusing intermediate noise states cached from similar earlier prompts.
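
The step-skipping idea in the last entry can be illustrated with a toy sketch. This is not the paper's implementation: `denoise_step`, `ApproximateCache`, the fixed `skip` value, and the cosine-similarity threshold are all hypothetical names and choices made for illustration, and the "model" is a trivial stand-in.

```python
import math

TOTAL_STEPS = 50  # assumed length of the denoising schedule


def denoise_step(state, step):
    """Toy stand-in for one denoising step (hypothetical, not a real model)."""
    return [x * 0.9 for x in state]


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class ApproximateCache:
    """Stores intermediate states keyed by prompt embedding (illustrative only)."""

    def __init__(self):
        self.entries = []  # list of (embedding, {step_count: state})

    def put(self, embedding, checkpoints):
        self.entries.append((embedding, checkpoints))

    def lookup(self, embedding, threshold=0.9):
        """Return checkpoints of the most similar cached prompt, or None."""
        best, best_sim = None, threshold
        for emb, ckpts in self.entries:
            sim = cosine(embedding, emb)
            if sim >= best_sim:
                best, best_sim = ckpts, sim
        return best


def generate(embedding, noise, cache, skip=20):
    """Run denoising, resuming from a cached state after `skip` steps on a hit.

    Returns the final state and the number of steps actually executed.
    """
    ckpts = cache.lookup(embedding)
    if ckpts is not None and skip in ckpts:
        state, start = list(ckpts[skip]), skip   # cache hit: skip early steps
    else:
        state, start = list(noise), 0            # cache miss: start from noise
    saved = {}
    for step in range(start, TOTAL_STEPS):
        state = denoise_step(state, step)
        if step + 1 == skip:
            saved[skip] = list(state)            # checkpoint for later reuse
    if saved:
        cache.put(embedding, saved)
    return state, TOTAL_STEPS - start
```

In this sketch a miss runs all 50 steps and stores the state reached after 20 of them; a later request with a sufficiently similar embedding resumes from that state and runs only the remaining 30. A fixed `skip` is an assumption made to keep the example short.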

Diffusion Model Training

  • DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines (MLSys 2024) [Paper] [Slides]

    • HKU & AWS & OSU

    • Schedule the computation of non-trainable model parts into the idle periods (bubbles) of the backbone's pipeline-parallel training.
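
As a rough illustration of the bubble-filling idea, the sketch below greedily packs frozen-component work into idle slots of a pipeline schedule. The greedy first-fit policy, the function name `fill_bubbles`, and the task names in the usage note are assumptions for illustration, not the scheduling algorithm from the paper.

```python
def fill_bubbles(bubbles, tasks):
    """Greedily place non-trainable tasks into pipeline idle periods.

    bubbles: list of idle durations, one per bubble (arbitrary time units).
    tasks:   dict mapping a task name to its duration.
    Returns (placement, leftover): placement maps task name -> bubble index,
    leftover lists tasks that fit in no bubble and must run outside them.
    """
    remaining = list(bubbles)
    placement, leftover = {}, []
    # Place the longest tasks first so large bubbles are not fragmented early.
    for name, cost in sorted(tasks.items(), key=lambda kv: -kv[1]):
        for i, free in enumerate(remaining):
            if cost <= free:
                remaining[i] -= cost
                placement[name] = i
                break
        else:
            leftover.append(name)
    return placement, leftover
```

For example, with bubbles of length 4 and 2 and hypothetical tasks `{"text_encoder": 3, "vae_encoder": 2, "image_encoder": 2}`, the first two tasks are placed in bubbles 0 and 1, and `image_encoder` is left over to run outside the bubbles.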

Supporting Add-on Modules

  • X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model (CVPR 2024) [Paper] [Homepage] [Code]

    • NUS & Tencent & FDU

Domain-Specific Accelerator (DSA)

  • Cambricon-D: Full-Network Differential Acceleration for Diffusion Models (ISCA 2024) [Paper]

    • ICT, CAS

Acronyms

  • DiT: Diffusion Transformer
