What is Pixelle-Video?
Pixelle-Video is an open-source AI-powered automatic short video generation engine. Simply input a topic, and it automatically completes the entire video production pipeline:
- ✍️ AI Script Writing — Generates video narration based on your topic
- 🎨 AI Image/Video Generation — Creates matching visuals for every scene
- 🗣️ AI Voice Synthesis — Converts script to natural speech using TTS
- 🎵 Background Music — Adds BGM to enhance atmosphere
- 🎬 One-Click Video Assembly — Renders final video automatically
Zero门槛, zero editing experience — video creation becomes as simple as typing one sentence!
🔗 GitHub: https://github.com/AIDC-AI/Pixelle-Video
Key Features
| Feature | Description |
|---|---|
| Fully Automatic | Input topic → get complete video |
| AI Smart Script | AI writes narration, no manual scripting needed |
| AI Image Generation | Every sentence gets a matching AI illustration |
| AI Video Generation | Supports WAN 2.1 and other video models for dynamic content |
| Multi TTS Support | Edge-TTS, Index-TTS, and more voice synthesis options |
| Background Music | Built-in BGM support for better atmosphere |
| Visual Templates | Multiple templates for unique video styles |
| Flexible Sizes | Portrait, landscape, and custom video dimensions |
| Multiple AI Models | GPT, Tongyi Qianwen, DeepSeek, Ollama support |
| ComfyUI Architecture | Modular design, customizable workflows |
Video Generation Pipeline
Pixelle-Video uses a modular design with a clear workflow:
Text Input → Script Generation → Image Planning → Frame Processing → Video Synthesis
Each stage supports flexible customization — choose different AI models, audio engines, visual styles to meet personalized creation needs.
Extended Modules
Beyond basic video generation, Pixelle-Video offers powerful extension modules:
👤 Digital Human Avatar
Upload a photo and generate a talking-head video with lip-sync. Supports multiple languages including Korean, Chinese, and English.
🖼️ Image-to-Video
Transform static images into dynamic videos using AI video generation models.
💃 Motion Transfer
Upload a reference video and image to transfer motions — like making a photo dance following video movements.
Supported AI Models
LLM (Script Generation)
- OpenAI GPT-4o / GPT-4o-mini
- Alibaba Tongyi Qianwen
- DeepSeek V3 / R1
- Ollama (local deployment)
- Custom API endpoints
Image Generation
- FLUX (via ComfyUI)
- Stable Diffusion
- Qwen Image Generation
- RunningHub cloud service
- Nano Banana model
TTS (Voice Synthesis)
- Edge-TTS (free, multi-language)
- Index-TTS (voice cloning)
- ChatTTS
- Custom ComfyUI TTS workflows
Quick Start
1. Clone Repository
git clone https://github.com/AIDC-AI/Pixelle-Video.git
cd Pixelle-Video
2. Install Dependencies
pip install -r requirements.txt
3. Configure API Keys
Edit config.json with your API keys:
{
"llm": {
"api_key": "your-api-key",
"base_url": "https://api.openai.com/v1",
"model": "gpt-4o"
},
"image": {
"comfyui_url": "http://127.0.0.1:8188"
}
}
4. Launch Web UI
python webui.py
Open http://localhost:7860 in your browser.
5. Generate Your First Video
- Enter a topic like “Why reading habits matter”
- Select your preferred TTS voice
- Choose a visual template
- Click “Generate Video”
- Wait 2-5 minutes for the complete video
Use Cases
| Scenario | Example Topic |
|---|---|
| Knowledge Sharing | “10 Python tricks beginners should know” |
| Product Review | “iPhone 16 vs Samsung S24 comparison” |
| Storytelling | “The journey of a startup founder” |
| Educational Content | “How does blockchain work?” |
| News Commentary | “AI trends in 2026” |
| Book/Movie Review | “Lessons from ‘Atomic Habits’” |
Video Style Examples
Pixelle-Video supports multiple video styles:
- 🌄 Documentary Style — Travel, nature, human stories
- 🔍 Cultural Analysis — Deep dives into trends and phenomena
- 🔭 Science & Philosophy — Complex concepts made simple
- 🌱 Personal Growth — Self-improvement, productivity
- 🧠 Deep Thinking — Psychology, philosophy, reflection
- 🏯 History & Culture — Ancient wisdom, historical events
- ☀️ Emotional — Heartwarming stories, inspiration
- 📜 Fiction Commentary — Novel reviews, character analysis
- 🧬 Health & Wellness — Medical tips, wellness advice
Technical Architecture
Pixelle-Video is built on ComfyUI architecture:
- Modular Workflows — Each component (LLM, TTS, image gen) is a separate node
- Customizable Pipeline — Swap any model or service easily
- API-First Design — All capabilities exposed via REST API
- Web UI — Gradio-based interface for easy use
- Batch Processing — Generate multiple videos simultaneously
Performance & Cost
| Option | Cost | Speed | Quality |
|---|---|---|---|
| Local Deployment | Free (GPU required) | Fast | High |
| RunningHub Cloud | Pay-per-use | Instant | High |
| Mixed Mode | Flexible | Balanced | High |
Recommended setup for beginners:
- LLM: DeepSeek API (cheap, good quality)
- Image: RunningHub (no local GPU needed)
- TTS: Edge-TTS (free, multi-language)
Comparison with Other Tools
| Feature | Pixelle-Video | HeyGen | Synthesia | Pictory |
|---|---|---|---|---|
| Open Source | ✅ | ❌ | ❌ | ❌ |
| Free Tier | ✅ | Limited | Limited | Limited |
| Local Deployment | ✅ | ❌ | ❌ | ❌ |
| Custom Models | ✅ | ❌ | ❌ | ❌ |
| ComfyUI Integration | ✅ | ❌ | ❌ | ❌ |
| Voice Cloning | ✅ | ✅ | ✅ | ❌ |
| Digital Human | ✅ | ✅ | ✅ | ❌ |
| Motion Transfer | ✅ | ❌ | ❌ | ❌ |
Tips for Best Results
- Topic Specificity — More specific topics yield better scripts
- Template Selection — Match template to content style
- Prompt Prefix — Use English prompt prefixes for consistent image style
- Voice Preview — Always preview TTS before generating full video
- Batch Generation — Generate 3-5 variants and pick the best
Related Articles
- Free Claude Code: Use Claude Code CLI for Free with Any AI Provider — Free AI coding assistant
- Agent Reach: Give Your AI Agent Internet Superpowers — AI agent with internet access
- Code Vault — 7 Open-Source Crypto Radar & Trading Tools — Python automation tools
Conclusion
Pixelle-Video democratizes video creation by combining LLM, image generation, TTS, and video editing into a single automated pipeline. Whether you’re a content creator, educator, marketer, or developer, this tool can save hours of video production time.
The ComfyUI-based architecture means it’s not just a black-box tool — you can customize every component, swap models, and build your own video generation workflows.
Best for: Content creators, educators, marketers, developers who need quick video production
GitHub: https://github.com/AIDC-AI/Pixelle-Video
Last updated: 2026-05-06