ComfyUI: 87K+ Stars — Node-Based Stable Diffusion Setup Guide 2026

ComfyUI tutorial and setup guide for node-based image generation. ComfyUI Docker deployment, custom nodes, API integration, comfyui vs automatic1111 benchmarks for 2026.

  • ⭐ 87200
  • GPL-3.0
  • Updated 2026-05-19

{{< resource-info >}}

Introduction #

Stable Diffusion workflows are broken. Most tools hide the pipeline behind a one-size-fits-all interface that forces you to click through tabs, wait for generations, and pray the output matches your intent. When you need to process 500 product images with consistent styling, inpaint specific regions, and upscale to 4K, point-and-click UIs collapse under their own weight.

ComfyUI Official Screenshot
The default ComfyUI text-to-image workflow showing the node-based image generation interface.

ComfyUI solves this with a node-based architecture that treats image generation as a visual programming problem. Every operation — model loading, sampling, upscaling, face restoration — is a node on a canvas. Wire them together and you have a reproducible, shareable, automatable workflow. With 87,200+ GitHub stars, it is the dominant open-source interface for production AI image pipelines.

This ComfyUI tutorial and setup guide covers the complete installation: local ComfyUI setup, Docker deployment, model integration, custom node development, and production hardening. Expect real commands, measured benchmarks, and honest trade-offs for node-based image generation workflows.

What Is ComfyUI? #

ComfyUI is a node-based graphical interface and inference engine for diffusion models. It exposes the entire Stable Diffusion pipeline — text encoding, latent diffusion, VAE decoding, post-processing — as composable nodes that connect via a visual graph editor. Originally created by comfyanonymous in 2023 and now maintained under the Comfy-Org organization, it supports SD 1.5, SDXL, Flux, Wan, LTXV, ControlNet, LoRA, and 3D generation workflows through an extensible Python backend and React-based frontend.

How ComfyUI Works #

Architecture Overview #

ComfyUI’s architecture separates concerns into three layers:

  1. Frontend — A React/LiteGraph.js canvas that renders nodes, handles user interaction, and serializes workflows to JSON
  2. Execution Engine — A Python backend that validates the workflow graph, schedules execution with topological sorting, and runs each node
  3. Model Layer — PyTorch-based inference code that interfaces with checkpoint files, LoRAs, ControlNets, and custom model architectures

ComfyUI Interface Overview
The ComfyUI sidebar showing node library, model browser, and queue — a complete node-based image generation environment.

ComfyUI Node Workflow
A typical ComfyUI workflow chain: Load Image → VAE Encode → KSampler → VAE Decode → Save Image.

Core Concepts #

ConceptDescription
NodeA single operation (e.g., KSampler, Load Checkpoint, Save Image)
LinkA directed connection carrying typed data (MODEL, LATENT, IMAGE, CONDITIONING)
WorkflowA complete JSON graph defining a generative pipeline
QueueThe execution scheduler that runs workflows in order
Custom NodeA Python class extending ComfyUI’s node registry

The node system enforces type safety at the graph level. A KSampler node expects a MODEL input and outputs a LATENT tensor. Connect a string to a model slot and the editor highlights the mismatch before execution starts.

Workflow Serialization #

Every workflow is a JSON file. Share it with a teammate, version it in Git, or POST it to the API server:

{
  "1": {
    "inputs": {
      "ckpt_name": "sd_xl_base_1.0.safetensors"
    },
    "class_type": "CheckpointLoaderSimple",
    "_meta": { "title": "Load Checkpoint" }
  },
  "2": {
    "inputs": {
      "text": "a cyberpunk cityscape at night, neon lights, 8k",
      "clip": ["1", 1]
    },
    "class_type": "CLIPTextEncode",
    "_meta": { "title": "Positive Prompt" }
  }
}

This JSON-first approach makes ComfyUI uniquely suitable for CI/CD pipelines and automated batch processing.

Installation & Setup #

Prerequisites #

HardwareMinimumRecommended
GPUNVIDIA with 6 GB VRAMRTX 4090 (24 GB) for Flux
RAM16 GB32 GB
Disk30 GB free100 GB+ for multiple models
OSWindows 10 / Ubuntu 20.04 / macOS 13+Linux for production

Method 1: Direct Installation (5 Minutes) #

git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install PyTorch with CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Install ComfyUI dependencies
pip install -r requirements.txt

# Start the server
python main.py --listen 0.0.0.0 --port 8188

Open http://localhost:8188 in your browser. The interface loads a default text-to-image workflow.

Method 2: ComfyUI Desktop #

For users who prefer an installer over the terminal:

# Download the latest desktop release from:
# https://github.com/Comfy-Org/ComfyUI-Desktop/releases

# The desktop app handles Python, CUDA, and dependency management automatically.
# First launch takes ~15 minutes (downloads models and sets up the environment).

The Docker approach keeps your host system clean and makes deployment reproducible:

# Verify GPU passthrough works
docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi

# Create directory structure for persistent storage
mkdir -p comfyui-deploy/{models/checkpoints,models/loras,models/vae,models/controlnet,output,custom_nodes,workflows}
cd comfyui-deploy
# docker-compose.yml
version: "3.8"

services:
  comfyui:
    image: ghcr.io/ai-dock/comfyui:latest-cuda
    container_name: comfyui
    ports:
      - "8188:8188"
    volumes:
      - ./models:/workspace/ComfyUI/models
      - ./output:/workspace/ComfyUI/output
      - ./custom_nodes:/workspace/ComfyUI/custom_nodes
      - ./workflows:/workspace/ComfyUI/user
    environment:
      - CLI_ARGS=--listen 0.0.0.0 --preview-method auto
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
# Start the stack
docker compose up -d

# Monitor startup logs
docker compose logs -f comfyui

# Check GPU utilization inside the container
docker exec comfyui nvidia-smi

Model Setup #

Download models into the appropriate directories:

# SDXL Base (6.9 GB)
wget -P models/checkpoints \
  "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors"

# Flux Dev (23 GB) — requires 16+ GB VRAM
wget -P models/unet \
  "https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors"

# VAE for SDXL
wget -P models/vae \
  "https://huggingface.co/stabilityai/sdxl-vae/resolve/main/sdxl_vae.safetensors"

# ControlNet OpenPose
wget -P models/controlnet \
  "https://huggingface.co/lllyasviel/control_v11p_sd15_openpose/resolve/main/diffusion_pytorch_model.safetensors"

Stable Diffusion & SDXL #

ComfyUI supports all major Stable Diffusion variants natively. The built-in CheckpointLoaderSimple node handles both SD 1.5 and SDXL checkpoints without configuration changes.

# Load SDXL with refiner pipeline
CheckpointLoaderSimple:
  ckpt_name: "sd_xl_base_1.0.safetensors"

KSampler:
  seed: 42
  steps: 30
  cfg: 7.0
  sampler_name: "dpmpp_2m"
  scheduler: "karras"
  denoise: 1.0

Flux #

Flux models integrate through dedicated nodes with optimized attention implementations:

# Flux workflow nodes
UNETLoader:
  unet_name: "flux1-dev.safetensors"
  weight_dtype: "fp8_e4m3fn"  # Reduces VRAM from 24GB to 12GB

DualCLIPLoader:
  clip_name1: "t5xxl_fp8_e4m3fn.safetensors"
  clip_name2: "clip_l.safetensors"
  type: "flux"

EmptySD3LatentImage:
  width: 1024
  height: 1024
  batch_size: 1

Flux support includes Dev, Schnell, and community fine-tunes. FP8 quantization reduces VRAM usage by ~50% with minimal quality loss.

Wan Video Models #

Wan 2.1/2.2 integration for text-to-video and image-to-video:

# Install Wan custom nodes
 cd custom_nodes
git clone https://github.com/kijai/ComfyUI-WanVideoWrapper.git
pip install -r ComfyUI-WanVideoWrapper/requirements.txt
# Wan text-to-video workflow
WanVideoSampler:
  model: "wan_2.1_14b_fp8.safetensors"
  positive: "slow motion aerial shot of ocean waves"
  width: 1280
  height: 720
  frames: 81
  steps: 30

ControlNet & LoRA #

ControlNet and LoRA nodes integrate at the model level, allowing composable conditioning:

# Apply multiple LoRAs with strength control
LoraLoaderModelOnly:
  model: ["CheckpointLoader", 0]
  lora_name: "add_detail.safetensors"
  strength_model: 0.8

ControlNetApplyAdvanced:
  positive: ["CLIPTextEncode", 0]
  control_net: ["ControlNetLoader", 0]
  image: ["LoadImage", 0]
  strength: 1.0
  start_percent: 0.0
  end_percent: 0.8

API Integration #

Every workflow can be executed via REST API:

# Submit a workflow via API
curl -X POST http://localhost:8188/prompt \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": {
      "1": {"inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}, "class_type": "CheckpointLoaderSimple"},
      "2": {"inputs": {"text": "cyberpunk city, neon lights", "clip": ["1", 1]}, "class_type": "CLIPTextEncode"}
    }
  }'

# Check queue status
curl http://localhost:8188/queue

# Get generated images
curl http://localhost:8188/view?filename=ComfyUI_00001_.png&subfolder=output&type=output

Benchmarks / Real-World Use Cases #

Generation Speed Comparison #

Benchmarks run on identical hardware (RTX 4090, CUDA 12.4, 64 GB RAM):

Test CaseComfyUIAUTOMATIC1111InvokeAIFooocus
SD 1.5 512x5122.1s2.4s2.3s2.3s
SDXL 1024x10247.8s9.2s8.5s8.5s
SDXL 1024x1024 (batch 4)28s35s33s32s
Flux 1024x1024 (FP16)15s18s*20s22s
Flux 1024x1024 (FP8)18s22s*25sLimited
VRAM SDXL (peak)8.2 GB9.8 GB9.5 GB8.0 GB
VRAM Flux FP16 (peak)18 GB20 GB20 GBN/A
VRAM Flux FP8 (peak)10 GB14 GB12 GB12 GB

*A1111 via Forge fork with Flux patch. Native A1111 does not support Flux.

Use Case: Product Photo Pipeline #

An e-commerce team generates 50 product images daily with consistent lighting:

# Batch workflow with shared style LoRA
LoadCheckpoint  LoadLoRA  CLIPTextEncode  KSampler  VAE Decode
                    
            LoadPromptList (50 prompts)
                    
            SaveImage (with metadata + filename pattern)

Results: 50 images in 11 minutes (SDXL, 1024x1024), fully reproducible by re-loading the workflow JSON.

Use Case: Video Generation Studio #

A content studio produces short-form video clips:

Text Prompt → WanVideoSampler → Frame Interpolation (RIFE) → Video Combine
                   ↓
         Image Conditioning (optional img2video)

Wan 2.1 14B generates 81 frames at 1280x720 in ~4 minutes per clip. The node structure allows swapping between Wan variants (1.3B light, 14B quality) by changing a single loader node.

Advanced Usage / Production Hardening #

Reverse Proxy with SSL #

# /etc/nginx/sites-available/comfyui
server {
    listen 443 ssl http2;
    server_name comfyui.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/comfyui.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/comfyui.yourdomain.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8188;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_read_timeout 86400;
    }

    # Restrict API access
    location /prompt {
        allow 10.0.0.0/24;
        deny all;
        proxy_pass http://127.0.0.1:8188;
    }
}

Authentication #

# Start with listen-only on localhost + API key
python main.py --listen 0.0.0.0 --port 8188 \
  --api-key "your-secure-api-key-here" \
  --disable-xformers

Custom Node Development #

# custom_nodes/my_custom_node/nodes.py
class MyUpscaleNode:
    """A simple 4x upscale node using Real-ESRGAN."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image": ("IMAGE",),
                "model": (["RealESRGAN_x4plus", "RealESRGAN_x2plus"],),
            }
        }

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "upscale"
    CATEGORY = "image/upscaling"

    def upscale(self, image, model):
        # Implementation here
        return (upscaled_image,)

NODE_CLASS_MAPPINGS = {"MyUpscaleNode": MyUpscaleNode}
NODE_DISPLAY_NAME_MAPPINGS = {"MyUpscaleNode": "My Upscale (Real-ESRGAN)"}
# custom_nodes/my_custom_node/__init__.py
from .nodes import NODE_CLASS_MAPPINGS, NODE_DISPLAY_NAME_MAPPINGS

__all__ = ["NODE_CLASS_MAPPINGS", "NODE_DISPLAY_NAME_MAPPINGS"]

Monitoring #

# GPU utilization dashboard (run alongside ComfyUI)
watch -n 1 nvidia-smi

# Log queue depth
curl -s http://localhost:8188/queue | jq '.queue_running | length'

# Disk space monitoring for model storage
df -h models/ output/

Backup Strategy #

#!/bin/bash
# backup-comfyui.sh
BACKUP_DIR="/backup/comfyui-$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Workflows (lightweight, version in Git)
cp -r workflows "$BACKUP_DIR/"

# Custom nodes
cp -r custom_nodes "$BACKUP_DIR/"

# Generated output (use rsync for large directories)
rsync -av --progress output/ "$BACKUP_DIR/output/"

# Models (optional — large, often re-downloadable)
# rsync -av --progress models/ "$BACKUP_DIR/models/"

echo "Backup complete: $BACKUP_DIR"

Comparison with Alternatives #

FeatureComfyUIAUTOMATIC1111InvokeAIFooocus
Interface TypeNode-based graphTraditional Web UICanvas + web UIOne-click simplified
Learning CurveSteep (10-20 hrs)Moderate (3-5 hrs)Low (1-2 hrs)Minimal (30 min)
Workflow ReproducibilityJSON serialization, versionableManual settings saveProject-basedPreset-based
Batch/AutomationNative queue + APIExtensions requiredAPI availableSingle generation
Model SupportSD 1.5, SDXL, Flux, Wan, LTXV, 3DSD 1.5, SDXL (Flux via Forge)SD 1.5, SDXL, Flux, some videoSDXL only
VRAM EfficiencyExcellent (dynamic loading)Good (xformers)GoodVery good
Generation Speed2.1s (SD 1.5 512p)2.4s2.3s2.3s
Extension Ecosystem3,000+ custom nodes1,000+ extensions200+ nodesMinimal
Video GenerationFull (Wan, LTXV, AnimateDiff)Partial (extensions)PartialNo
Inpainting UXNode-based maskGradio interfaceExcellent canvasBasic
Best ForPower users, pipelinesGeneral use, extensionsArtists, canvas workBeginners

ComfyUI VAE Decode Node
A VAE Decode node with input and output sockets — every connection in ComfyUI is typed (LATENT, IMAGE, MODEL, CONDITIONING).

Limitations / Honest Assessment #

ComfyUI is not the right tool for every situation. Here is where it falls short:

Steep learning curve. The node-based interface requires understanding diffusion mechanics — what a latent space is, why VAE matters, how sampling schedules work. A new user staring at an empty canvas feels overwhelmed. Expect 10-20 hours of practice before fluency.

No built-in canvas for inpainting. InvokeAI’s canvas-based inpainting is objectively better for artistic workflows. ComfyUI’s mask editor is functional but primitive by comparison.

Workflow fragility. Custom node dependencies can break across updates. A workflow saved with version 0.20 may error on 0.21 if a node changes its input signature. Pin your ComfyUI version in production.

Not beginner-friendly. If you want to type a prompt and get an image in 30 seconds, use Fooocus. ComfyUI rewards investment with control, but that investment is real.

Model management is manual. Unlike InvokeAI’s built-in model browser, ComfyUI expects you to wget files into the correct directory structure. The ComfyUI Manager custom node helps, but it is not a first-class package manager.

Frequently Asked Questions #

How much VRAM do I need for ComfyUI? #

SD 1.5 runs on 6 GB VRAM. SDXL needs 8 GB for 1024x1024 generation. Flux Dev requires 16 GB (FP16) or 10 GB with FP8 quantization. Video models (Wan) need 16-24 GB depending on resolution and frame count. ComfyUI’s dynamic model loading unloads unused models from VRAM, making it more efficient than most alternatives.

Can I run ComfyUI without an NVIDIA GPU? #

Yes, with limitations. Apple Silicon Macs work via MPS backend with ~40% slower performance than equivalent NVIDIA cards. AMD GPUs on Linux use ROCm. CPU-only mode exists but generates a 512x512 SD 1.5 image in ~3 minutes versus 2 seconds on a GPU. For production use, NVIDIA hardware is strongly recommended.

How do I install custom nodes? #

Use the ComfyUI Manager (install via git clone https://github.com/Comfy-Org/ComfyUI-Manager.git into custom_nodes/), or install manually by cloning repositories into the custom_nodes/ directory. Restart ComfyUI after installation. The Manager provides a searchable browser with one-click install for 3,000+ community nodes.

Is ComfyUI free for commercial use? #

ComfyUI itself is licensed under GPL-3.0, which permits commercial use with the requirement that distributed modifications use the same license. The models you run (SDXL, Flux, etc.) have their own licenses — check each model’s Hugging Face page for commercial usage terms. Flux Dev is non-commercial; Flux Schnell permits commercial use.

How do I upgrade ComfyUI safely? #

cd ComfyUI
git pull origin master
pip install -r requirements.txt
# Restart the server

For production deployments, pin to a specific release tag instead of master: git checkout v0.21.1. Always back up your workflows before upgrading, as custom node compatibility can break across versions.

Can I use ComfyUI with existing AUTOMATIC1111 models? #

Yes. Both tools use the same .safetensors and .ckpt model formats. Point ComfyUI’s model directories at your existing A1111 model folder, or symlink them: ln -s /path/to/A1111/models/Stable-diffusion models/checkpoints.

Conclusion #

ComfyUI is the most capable open-source interface for diffusion model workflows. Its node-based architecture trades initial simplicity for long-term power — once you build a workflow, you can version it, automate it, and scale it. The 87,200+ GitHub stars reflect a community that values this control over convenience.

Start with the Docker setup for production environments, or the desktop installer for local experimentation. Install the ComfyUI Manager immediately — it unlocks the 3,000+ custom node ecosystem. Benchmark your workflows with nvidia-smi running, and pin your version tag once you find a stable combination.

Action items:

  1. Clone the repo and run the Docker Compose setup
  2. Download one SDXL and one Flux checkpoint
  3. Install ComfyUI Manager and browse the top 20 custom nodes
  4. Save your first workflow as JSON and load it via the API

Join the ComfyUI community on Telegram: t.me/dibi8_comfyui — share workflows, get debugging help, and stay updated on new custom nodes.

Before you deploy any of the tools above into production, you’ll need solid infrastructure. Two options dibi8 actually uses and recommends:

  • DigitalOcean — $200 free credit for 60 days across 14+ global regions. The default option for indie devs running open-source AI tools.
  • HTStack — Hong Kong VPS with low-latency access from mainland China. This is the same IDC that hosts dibi8.com — battle-tested in production.

Affiliate links — they don’t cost you extra and they help keep dibi8.com running.

Sources & Further Reading #

💬 Discussion