Diffusion Models
FLUX.1
Black Forest Labs
Text-to-image generation
Models
FLUX.1-dev:
FLUX.1-schnell:
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis (arXiv:2403.03206)
Stability AI
Stable Diffusion 3 (SD3)
Multimodal Diffusion Transformer (MMDiT)
Models
Stable Diffusion 3 Medium:
Scalable Diffusion Models with Transformers (ICCV 2023)
UC Berkeley & NYU
DiT
Kuaishou Kolors
Text-to-image generation
Stability AI
Models
LMU Munich & Runway ML
Latent Diffusion Models (LDMs)
Models
Initialized with the weights of the Stable-Diffusion-v1-2 checkpoint and subsequently fine-tuned for 595k steps at resolution 512x512.
Stable Video 4D (SV4D)
Stability AI
Generate 40 frames (5 video frames x 8 camera views) at 576x576 resolution, given 5 reference frames of the same size.
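The frame count above is the product of time steps and camera views. A minimal sketch of that arithmetic in plain Python follows; the helper name `sv4d_output_shape` and the (frames, views, height, width) axis order are illustrative assumptions, not taken from the SV4D code.

```python
def sv4d_output_shape(num_frames=5, num_views=8, height=576, width=576):
    """Return (total_images, grid_shape) for a frames-by-views output grid.

    Hypothetical helper: the axis order is an assumption for illustration.
    """
    total_images = num_frames * num_views
    grid_shape = (num_frames, num_views, height, width)
    return total_images, grid_shape


total, shape = sv4d_output_shape()
print(total)  # 40
print(shape)  # (5, 8, 576, 576)
```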
Stability AI
Stable Video Diffusion (SVD)
Text-to-video and image-to-video generation
Models
Generate 14 frames at resolution 576x1024 given a context frame of the same size.
Fine-tuned from SVD-img2vid.
Generate 25 frames at resolution 576x1024 given a context frame of the same size.
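To compare the two output sizes listed above, the sketch below counts pixels per generated clip. The dictionary keys are hypothetical labels rather than official checkpoint names, and the pixel count is only a rough proxy for output size, not a measure of memory or compute.

```python
# Frame counts and resolution as stated in the list above.
# The keys are hypothetical labels, not official checkpoint names.
SVD_VARIANTS = {
    "14-frame": {"num_frames": 14, "height": 576, "width": 1024},
    "25-frame": {"num_frames": 25, "height": 576, "width": 1024},
}


def pixels_per_clip(variant):
    """Total pixel positions in one generated clip (frames x height x width)."""
    cfg = SVD_VARIANTS[variant]
    return cfg["num_frames"] * cfg["height"] * cfg["width"]


for name in SVD_VARIANTS:
    print(name, pixels_per_clip(name))
```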
LLM: Large Language Model
Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis
Model:
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis (arXiv:2307.01952)
High-Resolution Image Synthesis with Latent Diffusion Models (CVPR 2022)
Stable-Diffusion-v1-5:
Model:
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets (arXiv:2311.15127)