ComfyUI: 87K+ Stars — Node-Based Stable Diffusion Setup Guide 2026
ComfyUI tutorial and setup guide for node-based image generation. ComfyUI Docker deployment, custom nodes, API integration, comfyui vs automatic1111 benchmarks for 2026.
- ⭐ 87200
- GPL-3.0
- Updated 2026-05-19
{{< resource-info >}}
Introduction #
Stable Diffusion workflows are broken. Most tools hide the pipeline behind a one-size-fits-all interface that forces you to click through tabs, wait for generations, and pray the output matches your intent. When you need to process 500 product images with consistent styling, inpaint specific regions, and upscale to 4K, point-and-click UIs collapse under their own weight.

ComfyUI solves this with a node-based architecture that treats image generation as a visual programming problem. Every operation — model loading, sampling, upscaling, face restoration — is a node on a canvas. Wire them together and you have a reproducible, shareable, automatable workflow. With 87,200+ GitHub stars, it is the dominant open-source interface for production AI image pipelines.
This ComfyUI tutorial and setup guide covers the complete installation: local ComfyUI setup, Docker deployment, model integration, custom node development, and production hardening. Expect real commands, measured benchmarks, and honest trade-offs for node-based image generation workflows.
What Is ComfyUI? #
ComfyUI is a node-based graphical interface and inference engine for diffusion models. It exposes the entire Stable Diffusion pipeline — text encoding, latent diffusion, VAE decoding, post-processing — as composable nodes that connect via a visual graph editor. Originally created by comfyanonymous in 2023 and now maintained under the Comfy-Org organization, it supports SD 1.5, SDXL, Flux, Wan, LTXV, ControlNet, LoRA, and 3D generation workflows through an extensible Python backend and React-based frontend.
How ComfyUI Works #
Architecture Overview #
ComfyUI’s architecture separates concerns into three layers:
- Frontend — A React/LiteGraph.js canvas that renders nodes, handles user interaction, and serializes workflows to JSON
- Execution Engine — A Python backend that validates the workflow graph, schedules execution with topological sorting, and runs each node
- Model Layer — PyTorch-based inference code that interfaces with checkpoint files, LoRAs, ControlNets, and custom model architectures


Core Concepts #
| Concept | Description |
|---|---|
| Node | A single operation (e.g., KSampler, Load Checkpoint, Save Image) |
| Link | A directed connection carrying typed data (MODEL, LATENT, IMAGE, CONDITIONING) |
| Workflow | A complete JSON graph defining a generative pipeline |
| Queue | The execution scheduler that runs workflows in order |
| Custom Node | A Python class extending ComfyUI’s node registry |
The node system enforces type safety at the graph level. A KSampler node expects a MODEL input and outputs a LATENT tensor. Connect a string to a model slot and the editor highlights the mismatch before execution starts.
Workflow Serialization #
Every workflow is a JSON file. Share it with a teammate, version it in Git, or POST it to the API server:
{
"1": {
"inputs": {
"ckpt_name": "sd_xl_base_1.0.safetensors"
},
"class_type": "CheckpointLoaderSimple",
"_meta": { "title": "Load Checkpoint" }
},
"2": {
"inputs": {
"text": "a cyberpunk cityscape at night, neon lights, 8k",
"clip": ["1", 1]
},
"class_type": "CLIPTextEncode",
"_meta": { "title": "Positive Prompt" }
}
}
This JSON-first approach makes ComfyUI uniquely suitable for CI/CD pipelines and automated batch processing.
Installation & Setup #
Prerequisites #
| Hardware | Minimum | Recommended |
|---|---|---|
| GPU | NVIDIA with 6 GB VRAM | RTX 4090 (24 GB) for Flux |
| RAM | 16 GB | 32 GB |
| Disk | 30 GB free | 100 GB+ for multiple models |
| OS | Windows 10 / Ubuntu 20.04 / macOS 13+ | Linux for production |
Method 1: Direct Installation (5 Minutes) #
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install PyTorch with CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Install ComfyUI dependencies
pip install -r requirements.txt
# Start the server
python main.py --listen 0.0.0.0 --port 8188
Open http://localhost:8188 in your browser. The interface loads a default text-to-image workflow.
Method 2: ComfyUI Desktop #
For users who prefer an installer over the terminal:
# Download the latest desktop release from:
# https://github.com/Comfy-Org/ComfyUI-Desktop/releases
# The desktop app handles Python, CUDA, and dependency management automatically.
# First launch takes ~15 minutes (downloads models and sets up the environment).
Method 3: Docker (Production Recommended) #
The Docker approach keeps your host system clean and makes deployment reproducible:
# Verify GPU passthrough works
docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi
# Create directory structure for persistent storage
mkdir -p comfyui-deploy/{models/checkpoints,models/loras,models/vae,models/controlnet,output,custom_nodes,workflows}
cd comfyui-deploy
# docker-compose.yml
version: "3.8"
services:
comfyui:
image: ghcr.io/ai-dock/comfyui:latest-cuda
container_name: comfyui
ports:
- "8188:8188"
volumes:
- ./models:/workspace/ComfyUI/models
- ./output:/workspace/ComfyUI/output
- ./custom_nodes:/workspace/ComfyUI/custom_nodes
- ./workflows:/workspace/ComfyUI/user
environment:
- CLI_ARGS=--listen 0.0.0.0 --preview-method auto
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
restart: unless-stopped
# Start the stack
docker compose up -d
# Monitor startup logs
docker compose logs -f comfyui
# Check GPU utilization inside the container
docker exec comfyui nvidia-smi
Model Setup #
Download models into the appropriate directories:
# SDXL Base (6.9 GB)
wget -P models/checkpoints \
"https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors"
# Flux Dev (23 GB) — requires 16+ GB VRAM
wget -P models/unet \
"https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors"
# VAE for SDXL
wget -P models/vae \
"https://huggingface.co/stabilityai/sdxl-vae/resolve/main/sdxl_vae.safetensors"
# ControlNet OpenPose
wget -P models/controlnet \
"https://huggingface.co/lllyasviel/control_v11p_sd15_openpose/resolve/main/diffusion_pytorch_model.safetensors"
Integration with Popular Tools #
Stable Diffusion & SDXL #
ComfyUI supports all major Stable Diffusion variants natively. The built-in CheckpointLoaderSimple node handles both SD 1.5 and SDXL checkpoints without configuration changes.
# Load SDXL with refiner pipeline
CheckpointLoaderSimple:
ckpt_name: "sd_xl_base_1.0.safetensors"
KSampler:
seed: 42
steps: 30
cfg: 7.0
sampler_name: "dpmpp_2m"
scheduler: "karras"
denoise: 1.0
Flux #
Flux models integrate through dedicated nodes with optimized attention implementations:
# Flux workflow nodes
UNETLoader:
unet_name: "flux1-dev.safetensors"
weight_dtype: "fp8_e4m3fn" # Reduces VRAM from 24GB to 12GB
DualCLIPLoader:
clip_name1: "t5xxl_fp8_e4m3fn.safetensors"
clip_name2: "clip_l.safetensors"
type: "flux"
EmptySD3LatentImage:
width: 1024
height: 1024
batch_size: 1
Flux support includes Dev, Schnell, and community fine-tunes. FP8 quantization reduces VRAM usage by ~50% with minimal quality loss.
Wan Video Models #
Wan 2.1/2.2 integration for text-to-video and image-to-video:
# Install Wan custom nodes
cd custom_nodes
git clone https://github.com/kijai/ComfyUI-WanVideoWrapper.git
pip install -r ComfyUI-WanVideoWrapper/requirements.txt
# Wan text-to-video workflow
WanVideoSampler:
model: "wan_2.1_14b_fp8.safetensors"
positive: "slow motion aerial shot of ocean waves"
width: 1280
height: 720
frames: 81
steps: 30
ControlNet & LoRA #
ControlNet and LoRA nodes integrate at the model level, allowing composable conditioning:
# Apply multiple LoRAs with strength control
LoraLoaderModelOnly:
model: ["CheckpointLoader", 0]
lora_name: "add_detail.safetensors"
strength_model: 0.8
ControlNetApplyAdvanced:
positive: ["CLIPTextEncode", 0]
control_net: ["ControlNetLoader", 0]
image: ["LoadImage", 0]
strength: 1.0
start_percent: 0.0
end_percent: 0.8
API Integration #
Every workflow can be executed via REST API:
# Submit a workflow via API
curl -X POST http://localhost:8188/prompt \
-H "Content-Type: application/json" \
-d '{
"prompt": {
"1": {"inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}, "class_type": "CheckpointLoaderSimple"},
"2": {"inputs": {"text": "cyberpunk city, neon lights", "clip": ["1", 1]}, "class_type": "CLIPTextEncode"}
}
}'
# Check queue status
curl http://localhost:8188/queue
# Get generated images
curl http://localhost:8188/view?filename=ComfyUI_00001_.png&subfolder=output&type=output
Benchmarks / Real-World Use Cases #
Generation Speed Comparison #
Benchmarks run on identical hardware (RTX 4090, CUDA 12.4, 64 GB RAM):
| Test Case | ComfyUI | AUTOMATIC1111 | InvokeAI | Fooocus |
|---|---|---|---|---|
| SD 1.5 512x512 | 2.1s | 2.4s | 2.3s | 2.3s |
| SDXL 1024x1024 | 7.8s | 9.2s | 8.5s | 8.5s |
| SDXL 1024x1024 (batch 4) | 28s | 35s | 33s | 32s |
| Flux 1024x1024 (FP16) | 15s | 18s* | 20s | 22s |
| Flux 1024x1024 (FP8) | 18s | 22s* | 25s | Limited |
| VRAM SDXL (peak) | 8.2 GB | 9.8 GB | 9.5 GB | 8.0 GB |
| VRAM Flux FP16 (peak) | 18 GB | 20 GB | 20 GB | N/A |
| VRAM Flux FP8 (peak) | 10 GB | 14 GB | 12 GB | 12 GB |
*A1111 via Forge fork with Flux patch. Native A1111 does not support Flux.
Use Case: Product Photo Pipeline #
An e-commerce team generates 50 product images daily with consistent lighting:
# Batch workflow with shared style LoRA
LoadCheckpoint → LoadLoRA → CLIPTextEncode → KSampler → VAE Decode
↓
LoadPromptList (50 prompts)
↓
SaveImage (with metadata + filename pattern)
Results: 50 images in 11 minutes (SDXL, 1024x1024), fully reproducible by re-loading the workflow JSON.
Use Case: Video Generation Studio #
A content studio produces short-form video clips:
Text Prompt → WanVideoSampler → Frame Interpolation (RIFE) → Video Combine
↓
Image Conditioning (optional img2video)
Wan 2.1 14B generates 81 frames at 1280x720 in ~4 minutes per clip. The node structure allows swapping between Wan variants (1.3B light, 14B quality) by changing a single loader node.
Advanced Usage / Production Hardening #
Reverse Proxy with SSL #
# /etc/nginx/sites-available/comfyui
server {
listen 443 ssl http2;
server_name comfyui.yourdomain.com;
ssl_certificate /etc/letsencrypt/live/comfyui.yourdomain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/comfyui.yourdomain.com/privkey.pem;
location / {
proxy_pass http://127.0.0.1:8188;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_read_timeout 86400;
}
# Restrict API access
location /prompt {
allow 10.0.0.0/24;
deny all;
proxy_pass http://127.0.0.1:8188;
}
}
Authentication #
# Start with listen-only on localhost + API key
python main.py --listen 0.0.0.0 --port 8188 \
--api-key "your-secure-api-key-here" \
--disable-xformers
Custom Node Development #
# custom_nodes/my_custom_node/nodes.py
class MyUpscaleNode:
"""A simple 4x upscale node using Real-ESRGAN."""
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
"image": ("IMAGE",),
"model": (["RealESRGAN_x4plus", "RealESRGAN_x2plus"],),
}
}
RETURN_TYPES = ("IMAGE",)
FUNCTION = "upscale"
CATEGORY = "image/upscaling"
def upscale(self, image, model):
# Implementation here
return (upscaled_image,)
NODE_CLASS_MAPPINGS = {"MyUpscaleNode": MyUpscaleNode}
NODE_DISPLAY_NAME_MAPPINGS = {"MyUpscaleNode": "My Upscale (Real-ESRGAN)"}
# custom_nodes/my_custom_node/__init__.py
from .nodes import NODE_CLASS_MAPPINGS, NODE_DISPLAY_NAME_MAPPINGS
__all__ = ["NODE_CLASS_MAPPINGS", "NODE_DISPLAY_NAME_MAPPINGS"]
Monitoring #
# GPU utilization dashboard (run alongside ComfyUI)
watch -n 1 nvidia-smi
# Log queue depth
curl -s http://localhost:8188/queue | jq '.queue_running | length'
# Disk space monitoring for model storage
df -h models/ output/
Backup Strategy #
#!/bin/bash
# backup-comfyui.sh
BACKUP_DIR="/backup/comfyui-$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
# Workflows (lightweight, version in Git)
cp -r workflows "$BACKUP_DIR/"
# Custom nodes
cp -r custom_nodes "$BACKUP_DIR/"
# Generated output (use rsync for large directories)
rsync -av --progress output/ "$BACKUP_DIR/output/"
# Models (optional — large, often re-downloadable)
# rsync -av --progress models/ "$BACKUP_DIR/models/"
echo "Backup complete: $BACKUP_DIR"
Comparison with Alternatives #
| Feature | ComfyUI | AUTOMATIC1111 | InvokeAI | Fooocus |
|---|---|---|---|---|
| Interface Type | Node-based graph | Traditional Web UI | Canvas + web UI | One-click simplified |
| Learning Curve | Steep (10-20 hrs) | Moderate (3-5 hrs) | Low (1-2 hrs) | Minimal (30 min) |
| Workflow Reproducibility | JSON serialization, versionable | Manual settings save | Project-based | Preset-based |
| Batch/Automation | Native queue + API | Extensions required | API available | Single generation |
| Model Support | SD 1.5, SDXL, Flux, Wan, LTXV, 3D | SD 1.5, SDXL (Flux via Forge) | SD 1.5, SDXL, Flux, some video | SDXL only |
| VRAM Efficiency | Excellent (dynamic loading) | Good (xformers) | Good | Very good |
| Generation Speed | 2.1s (SD 1.5 512p) | 2.4s | 2.3s | 2.3s |
| Extension Ecosystem | 3,000+ custom nodes | 1,000+ extensions | 200+ nodes | Minimal |
| Video Generation | Full (Wan, LTXV, AnimateDiff) | Partial (extensions) | Partial | No |
| Inpainting UX | Node-based mask | Gradio interface | Excellent canvas | Basic |
| Best For | Power users, pipelines | General use, extensions | Artists, canvas work | Beginners |

Limitations / Honest Assessment #
ComfyUI is not the right tool for every situation. Here is where it falls short:
Steep learning curve. The node-based interface requires understanding diffusion mechanics — what a latent space is, why VAE matters, how sampling schedules work. A new user staring at an empty canvas feels overwhelmed. Expect 10-20 hours of practice before fluency.
No built-in canvas for inpainting. InvokeAI’s canvas-based inpainting is objectively better for artistic workflows. ComfyUI’s mask editor is functional but primitive by comparison.
Workflow fragility. Custom node dependencies can break across updates. A workflow saved with version 0.20 may error on 0.21 if a node changes its input signature. Pin your ComfyUI version in production.
Not beginner-friendly. If you want to type a prompt and get an image in 30 seconds, use Fooocus. ComfyUI rewards investment with control, but that investment is real.
Model management is manual. Unlike InvokeAI’s built-in model browser, ComfyUI expects you to wget files into the correct directory structure. The ComfyUI Manager custom node helps, but it is not a first-class package manager.
Frequently Asked Questions #
How much VRAM do I need for ComfyUI? #
SD 1.5 runs on 6 GB VRAM. SDXL needs 8 GB for 1024x1024 generation. Flux Dev requires 16 GB (FP16) or 10 GB with FP8 quantization. Video models (Wan) need 16-24 GB depending on resolution and frame count. ComfyUI’s dynamic model loading unloads unused models from VRAM, making it more efficient than most alternatives.
Can I run ComfyUI without an NVIDIA GPU? #
Yes, with limitations. Apple Silicon Macs work via MPS backend with ~40% slower performance than equivalent NVIDIA cards. AMD GPUs on Linux use ROCm. CPU-only mode exists but generates a 512x512 SD 1.5 image in ~3 minutes versus 2 seconds on a GPU. For production use, NVIDIA hardware is strongly recommended.
How do I install custom nodes? #
Use the ComfyUI Manager (install via git clone https://github.com/Comfy-Org/ComfyUI-Manager.git into custom_nodes/), or install manually by cloning repositories into the custom_nodes/ directory. Restart ComfyUI after installation. The Manager provides a searchable browser with one-click install for 3,000+ community nodes.
Is ComfyUI free for commercial use? #
ComfyUI itself is licensed under GPL-3.0, which permits commercial use with the requirement that distributed modifications use the same license. The models you run (SDXL, Flux, etc.) have their own licenses — check each model’s Hugging Face page for commercial usage terms. Flux Dev is non-commercial; Flux Schnell permits commercial use.
How do I upgrade ComfyUI safely? #
cd ComfyUI
git pull origin master
pip install -r requirements.txt
# Restart the server
For production deployments, pin to a specific release tag instead of master: git checkout v0.21.1. Always back up your workflows before upgrading, as custom node compatibility can break across versions.
Can I use ComfyUI with existing AUTOMATIC1111 models? #
Yes. Both tools use the same .safetensors and .ckpt model formats. Point ComfyUI’s model directories at your existing A1111 model folder, or symlink them: ln -s /path/to/A1111/models/Stable-diffusion models/checkpoints.
Conclusion #
ComfyUI is the most capable open-source interface for diffusion model workflows. Its node-based architecture trades initial simplicity for long-term power — once you build a workflow, you can version it, automate it, and scale it. The 87,200+ GitHub stars reflect a community that values this control over convenience.
Start with the Docker setup for production environments, or the desktop installer for local experimentation. Install the ComfyUI Manager immediately — it unlocks the 3,000+ custom node ecosystem. Benchmark your workflows with nvidia-smi running, and pin your version tag once you find a stable combination.
Action items:
- Clone the repo and run the Docker Compose setup
- Download one SDXL and one Flux checkpoint
- Install ComfyUI Manager and browse the top 20 custom nodes
- Save your first workflow as JSON and load it via the API
Join the ComfyUI community on Telegram: t.me/dibi8_comfyui — share workflows, get debugging help, and stay updated on new custom nodes.
Recommended Hosting & Infrastructure #
Before you deploy any of the tools above into production, you’ll need solid infrastructure. Two options dibi8 actually uses and recommends:
- DigitalOcean — $200 free credit for 60 days across 14+ global regions. The default option for indie devs running open-source AI tools.
- HTStack — Hong Kong VPS with low-latency access from mainland China. This is the same IDC that hosts dibi8.com — battle-tested in production.
Affiliate links — they don’t cost you extra and they help keep dibi8.com running.
Sources & Further Reading #
- GitHub Repository: https://github.com/comfyanonymous/ComfyUI
- Official Documentation: https://docs.comfy.org/
- ComfyUI Wiki: https://comfyui-wiki.com/
- ComfyUI Manager: https://github.com/Comfy-Org/ComfyUI-Manager
- Comfy-Org Releases: https://github.com/Comfy-Org/ComfyUI/releases
- SDXL Base Model: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
- Flux Models: https://huggingface.co/black-forest-labs
- Wan Video Nodes: https://github.com/kijai/ComfyUI-WanVideoWrapper
- Docker Setup Reference: https://github.com/ai-dock/comfyui
- Quantization Guide: https://github.com/comfyanonymous/ComfyUI/blob/master/QUANTIZATION.md
💬 Discussion