RAGFlow: Deploy a Production-Ready RAG Engine with 80K+ Stars — Docker Setup and Benchmarks for 2026

RAGFlow is an open-source retrieval-augmented generation (RAG) engine with deep document understanding and built-in agent capabilities. Compatible with Ollama, OpenAI, Qdrant, Elasticsearch, and Redis. Covers Docker deployment, document ingestion, retrieval tuning, and production hardening.

  • ⭐ 80853
  • Apache-2.0
  • Updated 2026-05-19

{{< resource-info >}}

RAGFlow Logo

Introduction #

Most RAG pipelines fail in production because they treat PDFs and PowerPoints as plain text. Tables get mangled, image captions disappear, and scanned documents become unreadable. The result is a retrieval system that returns garbage context to your LLM — and hallucinated answers follow. RAGFlow, an open-source RAG engine with over 80,000 GitHub stars, was built to solve exactly this. Its DeepDoc engine understands document layouts, and its built-in agent framework lets you build autonomous knowledge workers, not just chatbots. In this guide, you will deploy RAGFlow on a production server, configure document ingestion, tune retrieval quality, and integrate it with your existing LLM stack.

What Is RAGFlow? #

RAGFlow is an open-source retrieval-augmented generation engine that combines deep document understanding with LLM-powered agents to deliver truthful, cited answers from complex enterprise documents. Unlike generic RAG libraries that rely on simple text splitting, RAGFlow analyzes document structure — tables, images, headers, and scanned pages — to preserve semantic meaning during ingestion. It ships as a Docker-based platform with a web UI, REST API, and agent builder, making it suitable for both developers building pipelines and non-technical users managing knowledge bases.

How RAGFlow Works #

RAGFlow System Architecture

RAGFlow’s architecture follows a modular pipeline design with six core stages:

1. Document Ingestion (DeepDoc) #

Documents enter RAGFlow through the DeepDoc parsing engine. DeepDoc performs layout analysis on PDFs, Word files, Excel sheets, PowerPoint slides, images, and scanned copies. It identifies tables, figures, headers, paragraphs, and text blocks using a vision-based document layout model. This stage also supports external parsers like MinerU and Docling for specialized formats.

2. Knowledge Extraction and Chunking #

After parsing, RAGFlow applies template-based chunking strategies. You can choose from multiple chunking modes — naive, manual, Q&A, table, paper, book, laws, presentation, picture, and one — depending on your document type. Each chunk preserves its structural context, and RAGFlow optionally extracts keywords and generates related questions to improve retrieval recall.

3. Indexing (Hybrid Search Backend) #

Chunks are indexed into either Elasticsearch (default) or Infinity for hybrid search. Both full-text search and dense vector search are supported. The system computes embeddings using configurable embedding models and stores vectors alongside inverted indices for keyword retrieval.

4. Retrieval and Re-ranking #

When a query arrives, RAGFlow performs multi-channel retrieval: keyword search, vector similarity search, and knowledge graph traversal (if GraphRAG is enabled). Results are fused and re-ranked using a cross-encoder re-ranker before being passed to the LLM context window.

5. Generation with Citations #

RAGFlow constructs a prompt that includes retrieved chunks with traceable citations. The LLM generates an answer grounded in the retrieved context, and RAGFlow displays the source chunks alongside the response so users can verify every claim.

6. Agent Execution (Optional) #

Beyond simple question answering, RAGFlow’s agent framework supports multi-step workflows with memory, tool calling, MCP (Model Context Protocol) integration, and code execution in sandboxed environments. Agents can browse the web, query databases, and chain multiple retrieval operations.

Infrastructure Stack #

ServicePurposeDefault Backend
Vector + Full-Text StoreDocument indexing and searchElasticsearch or Infinity
Object StorageFile storage for uploaded documentsMinIO
Metadata DatabaseUser data, dataset configs, chat historyMySQL
Cache & QueueTask queuing and session cachingRedis
Document ParserLayout analysis and OCRDeepDoc (built-in)
FrontendWeb UI for dataset managementReact + TypeScript
Backend APICore orchestration logicPython + Go

Installation & Setup #

Hardware Requirements #

ResourceMinimumRecommended for Production
CPU4 cores (x86_64)8+ cores
RAM16 GB32+ GB
Disk50 GB SSD200+ GB NVMe
Docker24.0.0+Latest stable
Docker Composev2.26.1+Latest stable

Note: RAGFlow officially supports x86_64 platforms. ARM64 is community-tested but requires building your own Docker image.

Pre-Deployment: System Tuning #

Before starting RAGFlow, ensure your kernel parameters are tuned for Elasticsearch:

# Check current vm.max_map_count
sysctl vm.max_map_count

# Set to at least 262144 (required by Elasticsearch)
sudo sysctl -w vm.max_map_count=262144

# Persist across reboots
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf

Step 1: Clone the Repository #

git clone https://github.com/infiniflow/ragflow.git
cd ragflow/docker
git checkout -f v0.25.4

Step 2: Configure Environment Variables #

# Edit the environment file
cp .env .env.backup
nano .env

Key variables to set:

# docker/.env
RAGFLOW_IMAGE=infiniflow/ragflow:v0.25.4
SVR_HTTP_PORT=80
MYSQL_PASSWORD=your_secure_mysql_password
MINIO_PASSWORD=your_secure_minio_password
REDIS_PASSWORD=your_secure_redis_password

# Choose your document engine: elasticsearch or infinity
DOC_ENGINE=elasticsearch

Step 3: Launch with Docker Compose #

# CPU-only deployment
docker compose -f docker-compose.yml up -d

# GPU-accelerated document parsing (NVIDIA)
# sed -i '1i DEVICE=gpu' .env
# docker compose -f docker-compose.yml up -d

Verify the deployment:

# Watch the logs until you see the success message
docker logs -f ragflow-server

# Expected output:
#     ____   ___    ______ ______ __
#    / __ \ /   |  / ____// ____// /____  _      __
#   / /_/ // /| | / / __ / /_   / // __ \| | /| / /
#  / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ /
# /_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/
#  * Running on all addresses (0.0.0.0)

Step 4: Configure Your LLM Provider #

Edit service_conf.yaml.template to add your LLM API keys:

# docker/service_conf.yaml.template
user_default_llm:
  factory: OpenAI
  api_key: sk-your-openai-api-key
  base_url: https://api.openai.com/v1
  default_model: gpt-4.1-mini

Supported LLM providers include OpenAI, Anthropic, DeepSeek, Gemini, Azure OpenAI, Bedrock, and local models via Ollama or vLLM. Restart the containers after configuration changes:

docker compose -f docker-compose.yml down
docker compose -f docker-compose.yml up -d

Step 5: Access the Web UI #

Open your browser and navigate to http://YOUR_SERVER_IP. The default login is:

Email: admin@ragflow.io
Password: (set during first login)

RAGFlow Web Interface

Ollama (Local LLMs) #

For air-gapped or privacy-sensitive deployments, connect RAGFlow to Ollama:

# docker/service_conf.yaml.template
user_default_llm:
  factory: Ollama
  api_key: ""
  base_url: http://host.docker.internal:11434
  default_model: llama3.2

Pull models in Ollama before using them:

ollama pull llama3.2
ollama pull nomic-embed-text

Configure the embedding model in the RAGFlow web UI under Settings > Model Providers.

OpenAI (Cloud API) #

user_default_llm:
  factory: OpenAI
  api_key: ${OPENAI_API_KEY}
  base_url: https://api.openai.com/v1
  default_model: gpt-4.1-mini

Use environment variable substitution to avoid hardcoding secrets:

# In .env
OPENAI_API_KEY=sk-your-key

Elasticsearch to Infinity Migration #

Infinity is RAGFlow’s converged context engine optimized for large-scale deployments. To switch:

# 1. Stop all containers and clear volumes
docker compose -f docker-compose.yml down -v

# 2. Update .env
sed -i 's/DOC_ENGINE=elasticsearch/DOC_ENGINE=infinity/' .env

# 3. Restart
docker compose -f docker-compose.yml up -d

Warning: This wipes existing data. Back up your datasets before migrating.

Redis as External Cache #

For production deployments, use an external Redis cluster:

# docker-compose.yml (excerpt)
services:
  redis:
    image: redis:7-alpine
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis_data:/data
    deploy:
      resources:
        limits:
          memory: 2G

Qdrant as Alternative Vector Store #

While RAGFlow uses Elasticsearch or Infinity natively, you can integrate Qdrant via the Python SDK for custom retrieval pipelines:

from qdrant_client import QdrantClient
from ragflow_sdk import RAGFlow

# Connect to both systems
ragflow = RAGFlow(api_key="your-key", base_url="http://localhost:9380")
qdrant = QdrantClient(url="http://localhost:6333")

# Custom hybrid retrieval combining RAGFlow chunks with Qdrant vectors
chunks = ragflow.retrieve(dataset_id="ds_123", query="annual revenue 2025")
vectors = qdrant.search(collection="financial_reports", vector=query_embedding, limit=5)

Benchmarks / Real-World Use Cases #

Retrieval Quality Benchmarks #

A 2026 benchmark by AI Multiple compared RAGFlow against other frameworks using 100 standardized queries with GPT-4.1-mini as the generation model:

MetricRAGFlowLlamaIndexHaystackLangChain RAG
Answer Accuracy97%94%95%91%
Avg. Retrieval Latency420ms380ms450ms510ms
Token Efficiency (per query)1,4501,6001,5702,400
Framework Overhead8ms6ms5.9ms10ms
Citation Grounding Score96%88%90%82%

RAGFlow leads in accuracy and citation grounding due to DeepDoc’s layout-aware parsing, which preserves table structures and image captions that other frameworks strip during text splitting.

Document Parsing Performance #

Document TypeRAGFlow (DeepDoc)LlamaIndexHaystack
PDF with tablesFull structure preservedFlat textFlat text
Scanned PDF (OCR)Native supportRequires extensionRequires extension
PowerPoint slidesSlide-aware chunkingPer-slidePer-slide
Excel spreadsheetsCell-level extractionCSV conversionCSV conversion
Multi-language docsCross-language queryMonolingualMonolingual

Production Deployment Profiles #

ProfileUsersDocumentsHardwareMonthly Cloud Cost
Team (10 users)1010,0004 vCPU, 16 GB RAM~$80 (DigitalOcean)
Department (100 users)100100,0008 vCPU, 32 GB RAM~$200 (DigitalOcean)
Enterprise (1000+ users)1000+1M+16 vCPU, 64 GB RAM + GPU~$800+ (cloud)

Looking for a reliable cloud host for RAGFlow? Deploy on DigitalOcean with one-click Docker setup, or use HTStack for managed container hosting with built-in monitoring.

Advanced Usage / Production Hardening #

Enable HTTPS with Reverse Proxy #

# /etc/nginx/sites-available/ragflow
server {
    listen 443 ssl http2;
    server_name ragflow.yourcompany.com;

    ssl_certificate /etc/letsencrypt/live/ragflow.yourcompany.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ragflow.yourcompany.com/privkey.pem;

    location / {
        proxy_pass http://localhost:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
    }
}

Enable GraphRAG for Multi-Hop Reasoning #

GraphRAG extracts knowledge graphs from documents, enabling cross-document reasoning:

# Via the RAGFlow web UI or API
POST /api/datasets/{dataset_id}/chunks/graph
{
  "method": "ligh",
  "entity_types": ["PERSON", "ORGANIZATION", "PRODUCT", "EVENT"],
  "max_workers": 4
}

GraphRAG is especially effective for legal documents, research papers, and financial reports where relationships between entities span multiple pages.

Configure the Sandbox (Code Execution) #

RAGFlow’s agent can execute Python and JavaScript code in a sandboxed environment. This requires gVisor:

# Install gVisor (required for sandbox)
sudo apt-get install -y runsc

# Enable in docker-compose.yml
services:
  ragflow:
    environment:
      - ENABLE_SANDBOX=true
    devices:
      - /dev/kvm

Monitoring with Prometheus #

# Add to docker-compose.yml
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana

Key metrics to monitor:

# prometheus.yml
scrape_configs:
  - job_name: 'ragflow'
    static_configs:
      - targets: ['ragflow-server:9380']
    metrics_path: /metrics

Backup Strategy #

#!/bin/bash
# /opt/ragflow/backup.sh

BACKUP_DIR="/backups/ragflow/$(date +%Y%m%d)"
mkdir -p $BACKUP_DIR

# Backup MySQL
docker exec ragflow-mysql mysqldump -u root -p$MYSQL_PASSWORD ragflow > $BACKUP_DIR/mysql.sql

# Backup Elasticsearch indices
docker exec ragflow-es curl -sX POST "localhost:9200/_snapshot/backup" \
  -H 'Content-Type: application/json' \
  -d'{"indices": "ragflow_*"}'

# Backup MinIO objects
docker exec ragflow-minio mc mirror /data $BACKUP_DIR/minio

# Sync to remote storage
rclone sync $BACKUP_DIR s3:my-backup-bucket/ragflow/

Comparison with Alternatives #

FeatureRAGFlowLlamaIndexHaystackLangChain RAG
GitHub Stars80,85349,50025,300105,000
LicenseApache-2.0MITApache-2.0MIT
Deep Document ParsingDeepDoc (built-in)LlamaParse (paid)BasicBasic
Visual Workflow BuilderYesNoNoNo
Built-in Web UIYesNoNoNo
Agent FrameworkNative (with memory)ExternalExternalLangGraph
Hybrid SearchFull-text + VectorVector-onlyFull-text + VectorDepends on store
GraphRAG SupportYesYesVia extensionVia extension
OCR for Scanned DocsNativePaid addonVia extensionVia extension
Code Execution SandboxYes (gVisor)NoNoNo
Self-Hosted DeploymentDocker ComposePython packagePython packagePython package
MCP Protocol SupportYesNoNoNo
REST APIFull APIRequires buildingRequires buildingRequires building
Enterprise SSOYesCloud onlyCloud onlyCloud only

When to Choose What #

  • RAGFlow: You need a complete self-hosted RAG platform with deep document understanding, a visual UI, and built-in agents. Best for enterprises handling complex documents (PDFs, scans, spreadsheets) that want full data control.
  • LlamaIndex: You are building a custom Python application with specific indexing needs and prefer a library over a platform. Best for developers who need maximum flexibility and don’t mind building their own UI.
  • Haystack: You need a production-grade pipeline framework with strong evaluation tools and enterprise support from deepset. Best for teams that prioritize pipeline observability and testing.
  • LangChain RAG: You want the largest ecosystem of integrations and don’t mind assembling components yourself. Best for rapid prototyping and startups that need to iterate quickly.

Limitations / Honest Assessment #

Not a lightweight tool. RAGFlow requires a minimum of 16 GB RAM and multiple backend services (Elasticsearch, MySQL, Redis, MinIO). This is not a single-binary deployment. If you need a simple RAG setup for a side project, consider lighter alternatives like LightRAG or a direct LlamaIndex implementation.

x86 only for official images. ARM64 platforms (including Apple Silicon and AWS Graviton) require building your own Docker image from source. This adds significant time to deployment.

Learning curve for advanced features. The visual UI covers 80% of use cases, but enabling GraphRAG, configuring custom embedding pipelines, or building agents requires reading the documentation carefully.

No native multi-region replication. While you can back up and restore datasets, RAGFlow does not offer built-in cross-region replication for the underlying storage. You will need to configure MySQL and Elasticsearch replication separately.

Embedding models are not included in slim images. Starting with v0.22.0, only slim images are published. You must either download embedding models at runtime or connect to external embedding services.

Frequently Asked Questions #

What hardware do I need for a production RAGFlow deployment? #

For a production deployment serving 50+ users, use a server with at least 8 vCPU cores, 32 GB RAM, and 200 GB NVMe SSD. If you are parsing large scanned PDFs with OCR, add a GPU with at least 8 GB VRAM to accelerate DeepDoc tasks. For small teams of under 10 users, the minimum specs (4 vCPU, 16 GB RAM) are sufficient.

Can I use RAGFlow with local LLMs only? #

Yes. RAGFlow integrates with Ollama, vLLM, Xinference, and LocalAI. Configure the LLM provider in service_conf.yaml.template with the base URL of your local inference server. For embeddings, pull an embedding model through Ollama (such as nomic-embed-text) and configure it in the web UI under Model Providers.

How does RAGFlow handle scanned PDFs and images? #

RAGFlow’s DeepDoc engine includes a built-in OCR pipeline that processes scanned PDFs, PNGs, and JPEGs. It uses a document layout analysis model to identify text regions, tables, and images before applying OCR. The extracted text maintains its structural context — tables are preserved as tables, not flattened into paragraphs.

Is my data secure when using RAGFlow? #

When self-hosted, all data remains on your infrastructure. Documents are stored in MinIO, vectors in Elasticsearch/Infinity, and metadata in MySQL — all within your network. RAGFlow does not send documents to external services unless you configure a cloud LLM provider. For maximum privacy, use local LLMs and embedding models.

How do I upgrade RAGFlow to a new version? #

First, back up your MySQL database and Elasticsearch indices. Then pull the new Docker image, update the RAGFLOW_IMAGE variable in .env, and restart the containers. Always check the release notes for breaking changes between versions.

cd ragflow/docker
git fetch --tags
git checkout -f v0.25.4
# Update .env with new image tag
docker compose -f docker-compose.yml down
docker compose -f docker-compose.yml pull
docker compose -f docker-compose.yml up -d

Can I integrate RAGFlow into my existing application? #

Yes. RAGFlow exposes a full REST API and provides Python and JavaScript SDKs. You can create datasets, upload documents, start chat sessions, and retrieve answers programmatically. The API documentation is available at /api/docs on your RAGFlow instance.

What document formats does RAGFlow support? #

RAGFlow supports Word (DOC, DOCX), PowerPoint (PPT, PPTX), Excel (XLS, XLSX), PDF, TXT, Markdown, images (PNG, JPG, BMP, TIFF), scanned copies, HTML, and CSV. It also supports importing data from Confluence, Notion, Google Drive, Discord, and S3.

Does RAGFlow support multi-tenancy? #

RAGFlow supports multiple users and datasets with role-based access control within a single instance. For true multi-tenancy (isolated tenants with separate data), you currently need to run separate RAGFlow instances or implement tenant filtering at the application layer.

Conclusion #

RAGFlow stands out as the only open-source RAG platform that combines deep document understanding, a production-ready web UI, and built-in agent capabilities in a single deployable system. With 80,853 GitHub stars and an active development cycle, it has proven its value for teams that need more than a basic text-splitting RAG pipeline. The Docker-based deployment takes under 30 minutes, and the hybrid search architecture delivers measurably better retrieval quality than framework-only alternatives.

Your next steps:

  1. Clone the repository and deploy RAGFlow on your server using Docker Compose
  2. Upload a complex PDF with tables and verify the parsing quality in the web UI
  3. Configure your preferred LLM provider (OpenAI, DeepSeek, or Ollama)
  4. Set up HTTPS and automated backups for production use
  5. Join the community for support and feature updates

Join our Telegram developer community for deployment tips and production RAG discussions. Share your RAGFlow setup — we feature the best configurations in our weekly newsletter.

Some links in this article are affiliate links. We may earn a commission if you purchase hosting services through these links. This does not affect our editorial recommendations.

Before you deploy any of the tools above into production, you’ll need solid infrastructure. Two options dibi8 actually uses and recommends:

  • DigitalOcean — $200 free credit for 60 days across 14+ global regions. The default option for indie devs running open-source AI tools.
  • HTStack — Hong Kong VPS with low-latency access from mainland China. This is the same IDC that hosts dibi8.com — battle-tested in production.

Affiliate links — they don’t cost you extra and they help keep dibi8.com running.

Sources & Further Reading #

💬 Discussion