Pixelle-Video Review: AI Auto Short Video Generator

Pixelle-Video is an AI-powered automatic short video engine. Input a topic and get a complete video with script, AI images, voiceover, and BGM.

  • Go
  • Python
  • MIT
  • Updated 2026-05-15

{</* resource-info */>}

Pixelle-Video web UI showing script, TTS, scene template, generated output
Source: github.com/AIDC-AI/Pixelle-Video — official Web UI

What is Pixelle-Video? #

Pixelle-Video is an open-source AI-powered automatic short video generation engine. Simply input a topic, and it automatically completes the entire video production pipeline:

  • ✍️ AI Script Writing — Generates video narration based on your topic
  • 🎨 AI Image/Video Generation — Creates matching visuals for every scene
  • 🗣️ AI Voice Synthesis — Converts script to natural speech using TTS
  • 🎵 Background Music — Adds BGM to enhance atmosphere
  • 🎬 One-Click Video Assembly — Renders final video automatically

Zero门槛, zero editing experience — video creation becomes as simple as typing one sentence!

🔗 GitHub: https://github.com/AIDC-AI/Pixelle-Video


Key Features #

Feature Description
Fully Automatic Input topic → get complete video
AI Smart Script AI writes narration, no manual scripting needed
AI Image Generation Every sentence gets a matching AI illustration
AI Video Generation Supports WAN 2.1 and other video models for dynamic content
Multi TTS Support Edge-TTS, Index-TTS, and more voice synthesis options
Background Music Built-in BGM support for better atmosphere
Visual Templates Multiple templates for unique video styles
Flexible Sizes Portrait, landscape, and custom video dimensions
Multiple AI Models GPT, Tongyi Qianwen, DeepSeek, Ollama support
ComfyUI Architecture Modular design, customizable workflows

Video Generation Pipeline #

Pixelle-Video uses a modular design with a clear workflow:

Text Input → Script Generation → Image Planning → Frame Processing → Video Synthesis

Each stage supports flexible customization — choose different AI models, audio engines, visual styles to meet personalized creation needs.


Extended Modules #

Beyond basic video generation, Pixelle-Video offers powerful extension modules:

👤 Digital Human Avatar #

Upload a photo and generate a talking-head video with lip-sync. Supports multiple languages including Korean, Chinese, and English.

🖼️ Image-to-Video #

Transform static images into dynamic videos using AI video generation models.

💃 Motion Transfer #

Upload a reference video and image to transfer motions — like making a photo dance following video movements.


Supported AI Models #

LLM (Script Generation) #

  • OpenAI GPT-4o / GPT-4o-mini
  • Alibaba Tongyi Qianwen
  • DeepSeek V3 / R1
  • Ollama (local deployment)
  • Custom API endpoints

Image Generation #

  • FLUX (via ComfyUI)
  • Stable Diffusion
  • Qwen Image Generation
  • RunningHub cloud service
  • Nano Banana model

TTS (Voice Synthesis) #

  • Edge-TTS (free, multi-language)
  • Index-TTS (voice cloning)
  • ChatTTS
  • Custom ComfyUI TTS workflows

Quick Start #

1. Clone Repository #

git clone https://github.com/AIDC-AI/Pixelle-Video.git
cd Pixelle-Video

2. Install Dependencies #

pip install -r requirements.txt

3. Configure API Keys #

Edit config.json with your API keys:

{
  "llm": {
    "api_key": "your-api-key",
    "base_url": "https://api.openai.com/v1",
    "model": "gpt-4o"
  },
  "image": {
    "comfyui_url": "http://127.0.0.1:8188"
  }
}

4. Launch Web UI #

python webui.py

Open http://localhost:7860 in your browser.

5. Generate Your First Video #

  1. Enter a topic like “Why reading habits matter”
  2. Select your preferred TTS voice
  3. Choose a visual template
  4. Click “Generate Video”
  5. Wait 2-5 minutes for the complete video

Use Cases #

Scenario Example Topic
Knowledge Sharing “10 Python tricks beginners should know”
Product Review “iPhone 16 vs Samsung S24 comparison”
Storytelling “The journey of a startup founder”
Educational Content “How does blockchain work?”
News Commentary “AI trends in 2026”
Book/Movie Review “Lessons from ‘Atomic Habits’”

Video Style Examples #

Pixelle-Video supports multiple video styles:

  • 🌄 Documentary Style — Travel, nature, human stories
  • 🔍 Cultural Analysis — Deep dives into trends and phenomena
  • 🔭 Science & Philosophy — Complex concepts made simple
  • 🌱 Personal Growth — Self-improvement, productivity
  • 🧠 Deep Thinking — Psychology, philosophy, reflection
  • 🏯 History & Culture — Ancient wisdom, historical events
  • ☀️ Emotional — Heartwarming stories, inspiration
  • 📜 Fiction Commentary — Novel reviews, character analysis
  • 🧬 Health & Wellness — Medical tips, wellness advice

Technical Architecture #

Pixelle-Video is built on ComfyUI architecture:

  • Modular Workflows — Each component (LLM, TTS, image gen) is a separate node
  • Customizable Pipeline — Swap any model or service easily
  • API-First Design — All capabilities exposed via REST API
  • Web UI — Gradio-based interface for easy use
  • Batch Processing — Generate multiple videos simultaneously

Performance & Cost #

Option Cost Speed Quality
Local Deployment Free (GPU required) Fast High
RunningHub Cloud Pay-per-use Instant High
Mixed Mode Flexible Balanced High

Recommended setup for beginners:

  • LLM: DeepSeek API (cheap, good quality)
  • Image: RunningHub (no local GPU needed)
  • TTS: Edge-TTS (free, multi-language)

Comparison with Other Tools #

Feature Pixelle-Video HeyGen Synthesia Pictory
Open Source
Free Tier Limited Limited Limited
Local Deployment
Custom Models
ComfyUI Integration
Voice Cloning
Digital Human
Motion Transfer

Tips for Best Results #

  1. Topic Specificity — More specific topics yield better scripts
  2. Template Selection — Match template to content style
  3. Prompt Prefix — Use English prompt prefixes for consistent image style
  4. Voice Preview — Always preview TTS before generating full video
  5. Batch Generation — Generate 3-5 variants and pick the best


Conclusion #

Pixelle-Video democratizes video creation by combining LLM, image generation, TTS, and video editing into a single automated pipeline. Whether you’re a content creator, educator, marketer, or developer, this tool can save hours of video production time.

The ComfyUI-based architecture means it’s not just a black-box tool — you can customize every component, swap models, and build your own video generation workflows.

Best for: Content creators, educators, marketers, developers who need quick video production

GitHub: https://github.com/AIDC-AI/Pixelle-Video


For developers building or deploying open-source AI tools, we recommend:

  • DigitalOcean — $200 free credit for new users, 14+ global regions, one-click GPU/CPU droplets ideal for AI workloads.
  • Shiyunapi Claude API — Anthropic Claude / OpenAI / DeepSeek API proxy. Most AI tools above (chatbots, code gen, translation, search, etc) need an LLM API key — this proxy delivers stable access to top models at ~30% of official pricing.

Affiliate link — supports dibi8.com at no cost to you.

Last updated: 2026-05-06

References & Sources #

💬 Discussion