What are the minimum hardware requirements to run RAGFlow?

The minimum is 4 CPU cores (x86_64), 16 GB RAM, and 50 GB SSD, with Docker 24.0.0+ and Docker Compose v2.26.1+. For production serving 50+ users, use 8+ vCPU, 32+ GB RAM, and 200+ GB NVMe; add a GPU with 8+ GB VRAM if parsing large scanned PDFs with OCR.

Can RAGFlow run entirely with local LLMs for privacy?

Yes. RAGFlow integrates with Ollama, vLLM, Xinference, and LocalAI for fully local inference. Configure the provider's base URL in service_conf.yaml.template, and for embeddings pull a model like nomic-embed-text through Ollama and set it in the web UI under Model Providers. When self-hosted, documents stay on your infrastructure and are not sent to external services unless you configure a cloud LLM.

What backend services does RAGFlow need, and is it lightweight?

RAGFlow is not a single-binary tool. It requires multiple backend services: Elasticsearch or Infinity for vector and full-text indexing, MinIO for object storage, MySQL for metadata, and Redis for caching and task queuing. Because of this, it needs at least 16 GB RAM; for simple side projects, lighter alternatives like LightRAG may be more appropriate.

RAGFlow: Deploy a Production-Ready RAG Engine with 80K+ Stars

PageIndex：29K⭐Vectorless RAG System • JuiceFS (14K⭐): The Distributed POSIX File System That Turns

Introduction #

Most RAG pipelines fail in production because they treat PDFs and PowerPoints as plain text. Tables get mangled, image captions disappear, and scanned documents become unreadable. The result is a retrieval system that returns garbage context to your LLM — and hallucinated answers follow. RAGFlow, an open-source RAG engine with over 80,000 GitHub stars, was built to solve exactly this. Its DeepDoc engine understands document layouts, and its built-in agent framework lets you build autonomous knowledge workers, not just chatbots. In this guide, you will deploy RAGFlow on a production server, configure document ingestion, tune retrieval quality, and integrate it with your existing LLM stack.

What Is RAGFlow? #

RAGFlow is an open-source retrieval-augmented generation engine that combines deep document understanding with LLM-powered agents to deliver truthful, cited answers from complex enterprise documents. Unlike generic RAG libraries that rely on simple text splitting, RAGFlow analyzes document structure — tables, images, headers, and scanned pages — to preserve semantic meaning during ingestion. It ships as a Docker-based platform with a web UI, REST API, and agent builder, making it suitable for both developers building pipelines and non-technical users managing knowledge bases.

How RAGFlow Works #

RAGFlow’s architecture follows a modular pipeline design with six core stages:

1. Document Ingestion (DeepDoc) #

Documents enter RAGFlow through the DeepDoc parsing engine. DeepDoc performs layout analysis on PDFs, Word files, Excel sheets, PowerPoint slides, images, and scanned copies. It identifies tables, figures, headers, paragraphs, and text blocks using a vision-based document layout model. This stage also supports external parsers like MinerU and Docling for specialized formats.

2. Knowledge Extraction and Chunking #

After parsing, RAGFlow applies template-based chunking strategies. You can choose from multiple chunking modes — naive, manual, Q&A, table, paper, book, laws, presentation, picture, and one — depending on your document type. Each chunk preserves its structural context, and RAGFlow optionally extracts keywords and generates related questions to improve retrieval recall.

3. Indexing (Hybrid Search Backend) #

Chunks are indexed into either Elasticsearch (default) or Infinity for hybrid search. Both full-text search and dense vector search are supported. The system computes embeddings using configurable embedding models and stores vectors alongside inverted indices for keyword retrieval.

4. Retrieval and Re-ranking #

When a query arrives, RAGFlow performs multi-channel retrieval: keyword search, vector similarity search, and knowledge graph traversal (if GraphRAG is enabled). Results are fused and re-ranked using a cross-encoder re-ranker before being passed to the LLM context window.

5. Generation with Citations #

RAGFlow constructs a prompt that includes retrieved chunks with traceable citations. The LLM generates an answer grounded in the retrieved context, and RAGFlow displays the source chunks alongside the response so users can verify every claim.

6. Agent Execution (Optional) #

Beyond simple question answering, RAGFlow’s agent framework supports multi-step workflows with memory, tool calling, MCP (Model Context Protocol) integration, and code execution in sandboxed environments. Agents can browse the web, query databases, and chain multiple retrieval operations.

Infrastructure Stack #

Service	Purpose	Default Backend
Vector + Full-Text Store	Document indexing and search	Elasticsearch or Infinity
Object Storage	File storage for uploaded documents	MinIO
Metadata Database	User data, dataset configs, chat history	MySQL
Cache & Queue	Task queuing and session caching	Redis
Document Parser	Layout analysis and OCR	DeepDoc (built-in)
Frontend	Web UI for dataset management	React + TypeScript
Backend API	Core orchestration logic	Python + Go

Installation & Setup #

Hardware Requirements #

Resource	Minimum	Recommended for Production
CPU	4 cores (x86_64)	8+ cores
RAM	16 GB	32+ GB
Disk	50 GB SSD	200+ GB NVMe
Docker	24.0.0+	Latest stable
Docker Compose	v2.26.1+	Latest stable

Note: RAGFlow officially supports x86_64 platforms. ARM64 is community-tested but requires building your own Docker image.

Pre-Deployment: System Tuning #

Before starting RAGFlow, ensure your kernel parameters are tuned for Elasticsearch:

# Check current vm.max_map_count
sysctl vm.max_map_count

# Set to at least 262144 (required by Elasticsearch)
sudo sysctl -w vm.max_map_count=262144

# Persist across reboots
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf

Step 1: Clone the Repository #

git clone https://github.com/infiniflow/ragflow.git
cd ragflow/docker
git checkout -f v0.25.4

Step 2: Configure Environment Variables #

# Edit the environment file
cp .env .env.backup
nano .env

Key variables to set:

# docker/.env
RAGFLOW_IMAGE=infiniflow/ragflow:v0.25.4
SVR_HTTP_PORT=80
MYSQL_PASSWORD=your_secure_mysql_password
MINIO_PASSWORD=your_secure_minio_password
REDIS_PASSWORD=your_secure_redis_password

# Choose your document engine: elasticsearch or infinity
DOC_ENGINE=elasticsearch

Step 3: Launch with Docker Compose #

# CPU-only deployment
docker compose -f docker-compose.yml up -d

# GPU-accelerated document parsing (NVIDIA)
# sed -i '1i DEVICE=gpu' .env
# docker compose -f docker-compose.yml up -d

Verify the deployment:

# Watch the logs until you see the success message
docker logs -f ragflow-server

# Expected output:
#     ____   ___    ______ ______ __
#    / __ \ /   |  / ____// ____// /____  _      __
#   / /_/ // /| | / / __ / /_   / // __ \| | /| / /
#  / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ /
# /_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/
#  * Running on all addresses (0.0.0.0)

Step 4: Configure Your LLM Provider #

Edit service_conf.yaml.template to add your LLM API keys:

# docker/service_conf.yaml.template
user_default_llm:
  factory: OpenAI
  api_key: sk-your-openai-api-key
  base_url: https://api.openai.com/v1
  default_model: gpt-4.1-mini

Supported LLM providers include OpenAI, Anthropic, DeepSeek, Gemini, Azure OpenAI, Bedrock, and local models via Ollama or vLLM. Restart the containers after configuration changes:

docker compose -f docker-compose.yml down
docker compose -f docker-compose.yml up -d

Step 5: Access the Web UI #

Open your browser and navigate to http://YOUR_SERVER_IP. The default login is:

Email: admin@ragflow.io
Password: (set during first login)

Integration with Popular Tools #

Ollama (Local LLMs) #

For air-gapped or privacy-sensitive deployments, connect RAGFlow to Ollama:

# docker/service_conf.yaml.template
user_default_llm:
  factory: Ollama
  api_key: ""
  base_url: http://host.docker.internal:11434
  default_model: llama3.2

Pull models in Ollama before using them:

ollama pull llama3.2
ollama pull nomic-embed-text

Configure the embedding model in the RAGFlow web UI under Settings > Model Providers.

OpenAI (Cloud API) #

user_default_llm:
  factory: OpenAI
  api_key: ${OPENAI_API_KEY}
  base_url: https://api.openai.com/v1
  default_model: gpt-4.1-mini

Use environment variable substitution to avoid hardcoding secrets:

# In .env
OPENAI_API_KEY=sk-your-key

Elasticsearch to Infinity Migration #

Infinity is RAGFlow’s converged context engine optimized for large-scale deployments. To switch:

# 1. Stop all containers and clear volumes
docker compose -f docker-compose.yml down -v

# 2. Update .env
sed -i 's/DOC_ENGINE=elasticsearch/DOC_ENGINE=infinity/' .env

# 3. Restart
docker compose -f docker-compose.yml up -d

Warning: This wipes existing data. Back up your datasets before migrating.

Redis as External Cache #

For production deployments, use an external Redis cluster:

# docker-compose.yml (excerpt)
services:
  redis:
    image: redis:7-alpine
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis_data:/data
    deploy:
      resources:
        limits:
          memory: 2G

Qdrant as Alternative Vector Store #

While RAGFlow uses Elasticsearch or Infinity natively, you can integrate Qdrant via the Python SDK for custom retrieval pipelines:

from qdrant_client import QdrantClient
from ragflow_sdk import RAGFlow

# Connect to both systems
ragflow = RAGFlow(api_key="your-key", base_url="http://localhost:9380")
qdrant = QdrantClient(url="http://localhost:6333")

# Custom hybrid retrieval combining RAGFlow chunks with Qdrant vectors
chunks = ragflow.retrieve(dataset_id="ds_123", query="annual revenue 2025")
vectors = qdrant.search(collection="financial_reports", vector=query_embedding, limit=5)

Benchmarks / Real-World Use Cases #

Retrieval Quality Benchmarks #

A 2026 benchmark by AI Multiple compared RAGFlow against other frameworks using 100 standardized queries with GPT-4.1-mini as the generation model:

Metric	RAGFlow	LlamaIndex	Haystack	LangChain RAG
Answer Accuracy	97%	94%	95%	91%
Avg. Retrieval Latency	420ms	380ms	450ms	510ms
Token Efficiency (per query)	1,450	1,600	1,570	2,400
Framework Overhead	8ms	6ms	5.9ms	10ms
Citation Grounding Score	96%	88%	90%	82%

RAGFlow leads in accuracy and citation grounding due to DeepDoc’s layout-aware parsing, which preserves table structures and image captions that other frameworks strip during text splitting.

Document Parsing Performance #

Document Type	RAGFlow (DeepDoc)	LlamaIndex	Haystack
PDF with tables	Full structure preserved	Flat text	Flat text
Scanned PDF (OCR)	Native support	Requires extension	Requires extension
PowerPoint slides	Slide-aware chunking	Per-slide	Per-slide
Excel spreadsheets	Cell-level extraction	CSV conversion	CSV conversion
Multi-language docs	Cross-language query	Monolingual	Monolingual

Production Deployment Profiles #

Profile	Users	Documents	Hardware	Monthly Cloud Cost
Team (10 users)	10	10,000	4 vCPU, 16 GB RAM	~$80 (DigitalOcean)
Department (100 users)	100	100,000	8 vCPU, 32 GB RAM	~$200 (DigitalOcean)
Enterprise (1000+ users)	1000+	1M+	16 vCPU, 64 GB RAM + GPU	~$800+ (cloud)

Looking for a reliable cloud host for RAGFlow? Deploy on DigitalOcean with one-click Docker setup, or use HTStack for managed container hosting with built-in monitoring.

Advanced Usage / Production Hardening #

Enable HTTPS with Reverse Proxy #

# /etc/nginx/sites-available/ragflow
server {
    listen 443 ssl http2;
    server_name ragflow.yourcompany.com;

    ssl_certificate /etc/letsencrypt/live/ragflow.yourcompany.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ragflow.yourcompany.com/privkey.pem;

    location / {
        proxy_pass http://localhost:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
    }
}

Enable GraphRAG for Multi-Hop Reasoning #

GraphRAG extracts knowledge graphs from documents, enabling cross-document reasoning:

# Via the RAGFlow web UI or API
POST /api/datasets/{dataset_id}/chunks/graph
{
  "method": "ligh",
  "entity_types": ["PERSON", "ORGANIZATION", "PRODUCT", "EVENT"],
  "max_workers": 4
}

GraphRAG is especially effective for legal documents, research papers, and financial reports where relationships between entities span multiple pages.

Configure the Sandbox (Code Execution) #

RAGFlow’s agent can execute Python and JavaScript code in a sandboxed environment. This requires gVisor:

# Install gVisor (required for sandbox)
sudo apt-get install -y runsc

# Enable in docker-compose.yml
services:
  ragflow:
    environment:
      - ENABLE_SANDBOX=true
    devices:
      - /dev/kvm

Monitoring with Prometheus #

# Add to docker-compose.yml
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana

Key metrics to monitor:

# prometheus.yml
scrape_configs:
  - job_name: 'ragflow'
    static_configs:
      - targets: ['ragflow-server:9380']
    metrics_path: /metrics

Backup Strategy #

#!/bin/bash
# /opt/ragflow/backup.sh

BACKUP_DIR="/backups/ragflow/$(date +%Y%m%d)"
mkdir -p $BACKUP_DIR

# Backup MySQL
docker exec ragflow-mysql mysqldump -u root -p$MYSQL_PASSWORD ragflow > $BACKUP_DIR/mysql.sql

# Backup Elasticsearch indices
docker exec ragflow-es curl -sX POST "localhost:9200/_snapshot/backup" \
  -H 'Content-Type: application/json' \
  -d'{"indices": "ragflow_*"}'

# Backup MinIO objects
docker exec ragflow-minio mc mirror /data $BACKUP_DIR/minio

# Sync to remote storage
rclone sync $BACKUP_DIR s3:my-backup-bucket/ragflow/

Comparison with Alternatives #

Feature	RAGFlow	LlamaIndex	Haystack	LangChain RAG
—
GitHub Stars	82,565	49,500	25,300	105,000
License	Apache-2.0	MIT	Apache-2.0	MIT
Deep Document Parsing	DeepDoc (built-in)	LlamaParse (paid)	Basic	Basic
Visual Workflow Builder	Yes	No	No	No
Built-in Web UI	Yes	No	No	No
Agent Framework	Native (with memory)	External	External	LangGraph
Hybrid Search	Full-text + Vector	Vector-only	Full-text + Vector	Depends on store
GraphRAG Support	Yes	Yes	Via extension	Via extension
OCR for Scanned Docs	Native	Paid addon	Via extension	Via extension
Code Execution Sandbox	Yes (gVisor)	No	No	No
Self-Hosted Deployment	Docker Compose	Python package	Python package	Python package
MCP Protocol Support	Yes	No	No	No
REST API	Full API	Requires building	Requires building	Requires building
Enterprise SSO	Yes	Cloud only	Cloud only	Cloud only

When to Choose What #

RAGFlow: You need a complete self-hosted RAG platform with deep document understanding, a visual UI, and built-in agents. Best for enterprises handling complex documents (PDFs, scans, spreadsheets) that want full data control.
LlamaIndex: You are building a custom Python application with specific indexing needs and prefer a library over a platform. Best for developers who need maximum flexibility and don’t mind building their own UI.
Haystack: You need a production-grade pipeline framework with strong evaluation tools and enterprise support from deepset. Best for teams that prioritize pipeline observability and testing.
LangChain RAG: You want the largest ecosystem of integrations and don’t mind assembling components yourself. Best for rapid prototyping and startups that need to iterate quickly.

Limitations / Honest Assessment #

Not a lightweight tool. RAGFlow requires a minimum of 16 GB RAM and multiple backend services (Elasticsearch, MySQL, Redis, MinIO). This is not a single-binary deployment. If you need a simple RAG setup for a side project, consider lighter alternatives like LightRAG or a direct LlamaIndex implementation.

x86 only for official images. ARM64 platforms (including Apple Silicon and AWS Graviton) require building your own Docker image from source. This adds significant time to deployment.

Learning curve for advanced features. The visual UI covers 80% of use cases, but enabling GraphRAG, configuring custom embedding pipelines, or building agents requires reading the documentation carefully.

No native multi-region replication. While you can back up and restore datasets, RAGFlow does not offer built-in cross-region replication for the underlying storage. You will need to configure MySQL and Elasticsearch replication separately.

Embedding models are not included in slim images. Starting with v0.22.0, only slim images are published. You must either download embedding models at runtime or connect to external embedding services.

Frequently Asked Questions #

What hardware do I need for a production RAGFlow deployment? #

For a production deployment serving 50+ users, use a server with at least 8 vCPU cores, 32 GB RAM, and 200 GB NVMe SSD. If you are parsing large scanned PDFs with OCR, add a GPU with at least 8 GB VRAM to accelerate DeepDoc tasks. For small teams of under 10 users, the minimum specs (4 vCPU, 16 GB RAM) are sufficient.

Can I use RAGFlow with local LLMs only? #

Yes. RAGFlow integrates with Ollama, vLLM, Xinference, and LocalAI. Configure the LLM provider in service_conf.yaml.template with the base URL of your local inference server. For embeddings, pull an embedding model through Ollama (such as nomic-embed-text) and configure it in the web UI under Model Providers.

How does RAGFlow handle scanned PDFs and images? #

RAGFlow’s DeepDoc engine includes a built-in OCR pipeline that processes scanned PDFs, PNGs, and JPEGs. It uses a document layout analysis model to identify text regions, tables, and images before applying OCR. The extracted text maintains its structural context — tables are preserved as tables, not flattened into paragraphs.

Is my data secure when using RAGFlow? #

When self-hosted, all data remains on your infrastructure. Documents are stored in MinIO, vectors in Elasticsearch/Infinity, and metadata in MySQL — all within your network. RAGFlow does not send documents to external services unless you configure a cloud LLM provider. For maximum privacy, use local LLMs and embedding models.

How do I upgrade RAGFlow to a new version? #

First, back up your MySQL database and Elasticsearch indices. Then pull the new Docker image, update the RAGFLOW_IMAGE variable in .env, and restart the containers. Always check the release notes for breaking changes between versions.

cd ragflow/docker
git fetch --tags
git checkout -f v0.25.4
# Update .env with new image tag
docker compose -f docker-compose.yml down
docker compose -f docker-compose.yml pull
docker compose -f docker-compose.yml up -d

Can I integrate RAGFlow into my existing application? #

Yes. RAGFlow exposes a full REST API and provides Python and JavaScript SDKs. You can create datasets, upload documents, start chat sessions, and retrieve answers programmatically. The API documentation is available at /api/docs on your RAGFlow instance.

What document formats does RAGFlow support? #

RAGFlow supports Word (DOC, DOCX), PowerPoint (PPT, PPTX), Excel (XLS, XLSX), PDF, TXT, Markdown, images (PNG, JPG, BMP, TIFF), scanned copies, HTML, and CSV. It also supports importing data from Confluence, Notion, Google Drive, Discord, and S3.

Does RAGFlow support multi-tenancy? #

RAGFlow supports multiple users and datasets with role-based access control within a single instance. For true multi-tenancy (isolated tenants with separate data), you currently need to run separate RAGFlow instances or implement tenant filtering at the application layer.

Self-Hosting Note #

Running this on your own VPS? Try DigitalOcean with $200 free credit — enough for 2 months of moderate self-hosting to test the setup risk-free. Best for low-medium traffic; scale to dedicated when you outgrow it.

Conclusion #

RAGFlow stands out as the only open-source RAG platform that combines deep document understanding, a production-ready web UI, and built-in agent capabilities in a single deployable system. With 82,565 GitHub stars and an active development cycle, it has proven its value for teams that need more than a basic text-splitting RAG pipeline. The Docker-based deployment takes under 30 minutes, and the hybrid search architecture delivers measurably better retrieval quality than framework-only alternatives.

Your next steps:

Clone the repository and deploy RAGFlow on your server using Docker Compose
Upload a complex PDF with tables and verify the parsing quality in the web UI
Configure your preferred LLM provider (OpenAI, DeepSeek, or Ollama)
Set up HTTPS and automated backups for production use
Join the community for support and feature updates

Join our Telegram developer community for deployment tips and production RAG discussions. Share your RAGFlow setup — we feature the best configurations in our weekly newsletter.

Some links in this article are affiliate links. We may earn a commission if you purchase hosting services through these links. This does not affect our editorial recommendations.

Recommended Hosting & Infrastructure #

Before you deploy any of the tools above into production, you’ll need solid infrastructure. Two options dibi8 actually uses and recommends:

DigitalOcean — $200 free credit for 60 days across 14+ global regions. The default option for indie devs running open-source AI tools.
HTStack — Hong Kong VPS with low-latency access from mainland China. This is the same IDC that hosts dibi8.com — battle-tested in production.

Affiliate links — they don’t cost you extra and they help keep dibi8.com running.