Diffusion Models

Diffusion Model Serving

PatchedServe: A Patch Management Framework for SLO-Optimized Hybrid Resolution Diffusion Serving (arXiv:2501.09253) [arXiv]
- UWaterloo & CMU & Rice
- Serve requests of mixed (hybrid) resolutions through patch management, optimizing for SLO attainment.
FlexCache: Flexible Approximate Cache System for Video Diffusion (arXiv:2501.04012) [arXiv]
- UWaterloo
- Approximate cache for text-to-video diffusion models.
xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism (arXiv:2411.01738) [arXiv] [Code]
- Tencent
- Inference engine offering several parallelism strategies for DiTs.
SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules (arXiv:2407.02031) [arXiv]
- HKUST & Alibaba
PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models (arXiv:2405.14430) [arXiv] [Code]
- Tencent & HKU
Cache Me if You Can: Accelerating Diffusion Models through Block Caching (CVPR 2024) [Paper] [Homepage]
- Meta & TUM & MCML & Oxford
CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model (CVPR 2024) [Paper] [Code]
- TJU & Tencent
DeepCache: Accelerating Diffusion Models for Free (CVPR 2024) [Paper] [Code]
- NUS
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models (CVPR 2024) [Paper] [Code]
- MIT & Princeton & Lepton AI & NVIDIA
- Split the model input into multiple patches and assign each patch to a GPU.
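A minimal sketch of the patch-splitting idea noted above, assuming horizontal strips and one strip per GPU. The function name `split_into_patches` and the strip layout are illustrative assumptions, not DistriFusion's API, and the sketch leaves out how patches exchange information with each other.

```python
def split_into_patches(height, num_devices):
    """Assign contiguous row ranges of an image (or latent) to devices.

    Returns a list of (device_id, row_start, row_end) with row_end exclusive.
    Hypothetical helper for illustration only.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    base, extra = divmod(height, num_devices)
    assignments = []
    start = 0
    for device_id in range(num_devices):
        # The first `extra` devices take one additional row so all rows are covered.
        rows = base + (1 if device_id < extra else 0)
        assignments.append((device_id, start, start + rows))
        start += rows
    return assignments


if __name__ == "__main__":
    for device_id, row_start, row_end in split_into_patches(128, 4):
        print(f"GPU {device_id}: rows [{row_start}, {row_end})")
```

With 128 rows and 4 devices, each device receives a 32-row strip; uneven heights are spread so that strip sizes differ by at most one row.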
Approximate Caching for Efficiently Serving Text-to-Image Diffusion Models (NSDI 2024) [Paper] [Slides]
- Adobe Research & UIUC
- Skip the first several denoising steps by reusing cached intermediate states from earlier generations with similar prompts.
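The step-skipping idea in the note above can be sketched as a toy cache lookup. Everything here is an assumption for illustration: the `ApproxCache` class, the cosine-similarity threshold, and the scalar "state" stand in for the paper's actual retrieval and step-selection logic, which this sketch does not reproduce.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ApproxCache:
    """Hypothetical cache of intermediate denoising states keyed by prompt embedding."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, step, state)

    def put(self, embedding, step, state):
        self.entries.append((embedding, step, state))

    def lookup(self, embedding):
        """Return (step, state) of the most similar entry, or None on a miss."""
        best = max(self.entries, key=lambda e: cosine(embedding, e[0]), default=None)
        if best is None or cosine(embedding, best[0]) < self.threshold:
            return None
        return best[1], best[2]


def generate(embedding, cache, total_steps, denoise_step, init_state):
    """Run denoising, starting from a cached state when a similar prompt exists."""
    hit = cache.lookup(embedding)
    start_step, state = hit if hit is not None else (0, init_state)
    steps_run = 0
    for t in range(start_step, total_steps):
        state = denoise_step(state, t, embedding)
        steps_run += 1
    return state, steps_run


if __name__ == "__main__":
    cache = ApproxCache(threshold=0.9)
    cache.put([1.0, 0.0], step=20, state=5.0)  # state saved after 20 steps
    toy_step = lambda state, t, emb: state + 1  # stand-in for a real denoiser
    print(generate([0.99, 0.05], cache, 50, toy_step, 0.0))  # similar prompt
    print(generate([0.0, 1.0], cache, 50, toy_step, 0.0))    # unrelated prompt
```

In this toy run the similar prompt resumes from step 20 and executes 30 steps, while the unrelated prompt misses the cache and executes all 50.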

Diffusion Model Training

DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines (MLSys 2024) [Paper] [Slides]
- HKU & AWS & OSU
- Schedule the computation of the non-trainable model parts into the idle periods (bubbles) of the backbone's pipeline-parallel training.
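To make the bubble-filling idea concrete, here is a minimal sketch that packs non-trainable work into pipeline idle periods. It uses a greedy first-fit-decreasing heuristic, which is a stand-in chosen for brevity and not DiffusionPipe's actual scheduling algorithm; the function name, task names, and durations are all hypothetical.

```python
def fill_bubbles(bubbles, tasks):
    """Greedily place tasks into idle periods.

    bubbles: list of idle durations, one per pipeline bubble.
    tasks:   dict mapping task name -> duration of a non-trainable computation.
    Returns (placement, leftover, remaining) where placement maps a task name
    to a bubble index, leftover lists tasks that did not fit, and remaining
    holds the idle time still unused in each bubble.
    """
    remaining = list(bubbles)
    placement = {}
    leftover = []
    # Largest tasks first, each into the first bubble with enough room.
    for name, cost in sorted(tasks.items(), key=lambda kv: -kv[1]):
        for index, free in enumerate(remaining):
            if cost <= free:
                remaining[index] = free - cost
                placement[name] = index
                break
        else:
            leftover.append(name)  # must run outside the bubbles
    return placement, leftover, remaining


if __name__ == "__main__":
    # Illustrative durations in arbitrary time units.
    bubbles = [4.0, 2.0]
    tasks = {"text_encoder": 3.0, "vae_encoder": 2.0, "extra": 2.5}
    print(fill_bubbles(bubbles, tasks))
```

Tasks that fit no bubble are returned in `leftover`, which in a real system would correspond to work that still adds to the iteration time.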

Supporting Add-on Modules

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model (CVPR 2024) [Paper] [Homepage] [Code]
- NUS & Tencent & FDU

Domain-Specific Accelerator (DSA)

Cambricon-D: Full-Network Differential Acceleration for Diffusion Models (ISCA 2024) [Paper]
- ICT, CAS

Acronyms

DiT: Diffusion Transformer
