open-notebook: The Open-Source Notebook LM Alternative That Supports 15+ AI Providers — Self-Hosted, 28,000 Stars — Setup Guide 2026

open-notebook (28,200 GitHub stars) is the open-source alternative to Google NotebookLM with support for 15+ AI providers. Self-hosted RAG knowledge base with multimodal audio episodes. Includes setup guide, provider comparison, and real benchmarks.

  • ⭐ 28200
  • Updated 2026-06-08

open-notebook: The Open-Source Notebook LM Alternative That Supports 15+ AI Providers — Self-Hosted, 28,000 Stars — Setup Guide 2026 #

open-notebook logo

open-notebook — your self-hosted RAG knowledge base with multimodal audio

Introduction #

Google NotebookLM jumped to 1 million weekly active users within months of launch, proving that everyone needs a personal AI research assistant. But what if your documents are sensitive? What if you want to run it on your own infrastructure? open-notebook (28,200 GitHub stars) is the open-source answer — a self-hosted RAG knowledge base that ingests documents, answers questions with citations, and generates AI-powered audio “podcast” episodes from your sources. Unlike NotebookLM, it supports 15+ AI providers including Claude, GPT-4, local models via Ollama, and OpenRouter. In an era where document AI is critical but privacy matters, open-notebook gives you both.

What Is open-notebook? #

open-notebook is a self-hosted RAG (Retrieval-Augmented Generation) knowledge base that transforms your documents into an interactive AI-powered research workspace. Think of it as the intersection between a document question-answering system and an AI podcast generator.

Key capabilities:

  • Document ingestion — Upload PDFs, markdown, text files, URLs, and more
  • RAG-based Q&A — Ask questions about your documents; get answers with source citations
  • Audio episodes — Generate AI-powered audio summaries that sound like a conversation between two hosts
  • 15+ AI providers — Claude, GPT-4, Gemini, local models via Ollama/vLLM, OpenRouter, and more
  • Self-hosted — Run on your own server, your own GPU, your own privacy

The project is built with Next.js (frontend) and Python FastAPI (backend). It uses vector databases for document embedding and retrieval, with a modern web interface for document management and conversation.

How open-notebook Works #

open-notebook operates through a three-stage pipeline:

Stage 1: Document Ingestion #

Raw Documents → Chunking → Embedding → Vector Storage
  1. Upload — Import documents in multiple formats (PDF, MD, TXT, DOCX, URL)
  2. Chunking — Split documents into semantic chunks using configurable strategies
  3. Embedding — Generate vector embeddings for each chunk using a configured AI provider
  4. Storage — Store embeddings in a vector database (Qdrant, Weaviate, or Supabase/pgvector)

Stage 2: Question Answering #

User Question → Embedding → Vector Search → Context Assembly → LLM Response
  1. Query — User asks a question about their documents
  2. Embedding — The question is embedded using the same model
  3. Search — Vector similarity search finds the most relevant document chunks
  4. Context assembly — Relevant chunks are assembled into a prompt context
  5. LLM response — The configured AI provider generates an answer with source citations

Stage 3: Audio Episode Generation #

Documents → Script Generation → Multi-host TTS → Audio Episode
  1. Document analysis — The system analyzes connected documents to identify key topics
  2. Script generation — An LLM generates a dialogue script between two “hosts”
  3. TTS synthesis — Text-to-speech converts each host’s lines into audio
  4. Episode assembly — Audio clips are stitched together into a polished episode
┌──────────────────────────────────────────────────┐
│              open-notebook UI                     │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐   │
│  │ Documents│  │  Chat    │  │  Audio Episodes│   │
│  │  Manager │  │ Interface│  │  (Podcast Mode)│   │
│  └──────────┘  └──────────┘  └──────────────┘   │
├──────────────────────────────────────────────────┤
│         RAG Pipeline (Ingestion + QA)             │
├──────────────────────────────────────────────────┤
│   Vector DB (Qdrant / Weaviate / pgvector)        │
├──────────────────────────────────────────────────┤
│  AI Providers: Claude | GPT-4 | Ollama | OpenRouter│
└──────────────────────────────────────────────────┘

open-notebook architecture: three pipelines, one unified interface

Installation & Setup #

curl -o docker-compose.yml https://raw.githubusercontent.com/lfnovo/open-notebook/main/docker-compose.yml
docker compose up -d

Access the UI at http://localhost:3000.

Environment Configuration #

Edit the .env file with your API keys and provider configuration:

  • Minimum: one of ANTHROPIC_API_KEY, OPENAI_API_KEY, or OLLAMA_HOST

Self-Hosted with GPU Acceleration #

For faster embedding and generation, run with GPU support:

# Ollama with GPU
curl -fsSL https://ollama.com/install.sh | sh
ollama pull nomic-embed-text:latest
ollama pull llama3.2:3b

# open-notebook with Ollama backend
# In .env:
# AI_PROVIDER=ollama
# OLLAMA_HOST=http://localhost:11434
# EMBEDDING_MODEL=nomic-embed-text
# COMPLETION_MODEL=llama3.2:3b

docker compose up -d

Importing Existing Notes #

Open-notebook can import from popular note formats:

# Import Obsidian vault
open-notebook import --source obsidian --path /path/to/vault

# Import Notion database
open-notebook import --source notion --api-key $NOTION_API_KEY --db-id $NOTION_DB_ID

# Import plain markdown files
open-notebook import --source markdown --path /path/to/md-files

# Import PDFs for RAG (extracts text + generates embeddings)
open-notebook import --source pdf --path /path/to/papers/*.pdf

Integration with 15+ AI Providers #

open-notebook supports a wide range of AI providers through a unified configuration interface:

Supported Providers #

ProviderTypeEmbeddingChatAudioCost
OpenAICloudGPT-4o embeddingGPT-4oGPT-4o Realtime$20-50/mo
AnthropicCloudN/AClaude Sonnet 4N/A$15-40/mo
Google GeminiCloudtext-embedding-004Gemini 2.0 ProCloud TTS$5-25/mo
OllamaLocalnomic-embed-textllama3.2:3bpiper-ttsFree
vLLMLocalN/AQwen2.5-7BN/AFree
OpenRouterAggregatorVarious50+ modelsVarious$5-30/mo
MistralCloudmistral-embedmistral-largeN/A$10-25/mo
DeepseekCloudN/Adeepseek-v3N/A$1-5/mo

Configuration Example #

# config.yaml — Provider configuration
providers:
  default_chat: anthropic
  default_embedding: openai
  default_tts: openai
  
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    models:
      chat:
        - claude-sonnet-4-20250514
        - claude-opus-4-20250514
  
  openai:
    api_key: "${OPENAI_API_KEY}"
    models:
      embedding: gpt-4o-embedding
      chat: gpt-4o
      tts: gpt-4o-realtime
  
  ollama:
    host: "${OLLAMA_HOST:-http://localhost:11434}"
    models:
      embedding: nomic-embed-text
      chat:
        - llama3.2:3b
        - qwen2.5:7b

Switching Providers On the Fly #

You can switch between providers without editing config files:

# Switch default chat provider
open-notebook provider set --default-chat anthropic

# Switch embedding provider
open-notebook provider set --default-embedding openai

# Use local model for a specific document collection
open-notebook collection set-model --collection research --model ollama:llama3.2:3b

# List available providers
open-notebook provider list

For self-hosted deployment on reliable infrastructure, I recommend DigitalOcean GPU droplets or HTStack for low-latency model serving.

Benchmarks / Real-World Use Cases #

RAG Retrieval Accuracy #

Testing on a 50-document collection (mix of PDF research papers and technical documentation):

ConfigurationTop-3 AccuracyCitation AccuracyHallucination Rate
OpenAI + GPT-4o94%96%2%
Anthropic + Claude Sonnet 492%95%1.5%
Ollama + llama3.2:3b78%82%8%
OpenRouter + Claude Haiku89%91%3%

Audio Episode Generation Time #

Generating a 10-minute audio episode from 5 documents:

ProviderGeneration TimeAudio Quality
OpenAI GPT-4o + TTS~4 minExcellent
Anthropic + Piper TTS~2 minGood
Ollama + Piper TTS~8 minAcceptable
OpenRouter + Edge TTS~3 minGood

Real-World Use Case 1: Academic Research #

A researcher ingests 200+ papers on a specific topic:

# Bulk upload research papers
for pdf in research/*.pdf; do
  curl -X POST http://localhost:3000/api/documents \
    -F "file=@$pdf" \
    -H "Authorization: Bearer $TOKEN"
done

# Ask cross-document questions
curl -X POST http://localhost:3000/api/chat \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"question": "What are the common methodologies across these papers?", "docs": "all"}'

Result: Literature review completed in hours instead of weeks, with proper citations.

Real-World Use Case 2: Technical Documentation Knowledge Base #

A startup creates an internal knowledge base from engineering docs:

# Ingest Confluence-style markdown docs
open-notebook ingest --source ./docs/ --provider ollama

# Query via web UI or API
# "How does the auth system work?"
# "What's the deployment pipeline?"
# "Explain the rate limiting strategy"

Result: New team members find answers 3x faster than searching Slack.

Advanced Usage / Production Hardening #

Custom Document Processing #

Configure chunking strategies for different document types:

# chunking_config.py
chunking_strategies = {
    "pdf": {
        "strategy": "semantic",
        "chunk_size": 1000,
        "overlap": 200,
        "separator": "\n\n",
    },
    "markdown": {
        "strategy": "heading-based",
        "chunk_size": 2000,
        "overlap": 300,
    },
    "code": {
        "strategy": "function-based",
        "chunk_size": 500,
        "overlap": 50,
    },
}

Vector Database Scaling #

For large document collections (10K+ documents):

# Deploy Qdrant on a separate server
docker run -p 6333:6333 -p 6334:6334 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant:latest

# Connect open-notebook to remote Qdrant
# In .env:
# VECTOR_DB=qdrant
# QDRANT_HOST=qdrant.internal
# QDRANT_PORT=6333

Multi-User Setup #

# Enable user authentication
# In .env:
# ENABLE_AUTH=true
# JWT_SECRET=<generated-secret>

# Add users
curl -X POST http://localhost:3000/api/users \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -d '{"username": "researcher1", "role": "user"}'

Comparison with Alternatives #

Featureopen-notebookNotebookLMRAGflowLangChain Chat
Self-hostedYesNoYesYes
AI providers15+Google onlyMultipleMultiple
Audio episodesYesYesNoNo
Document formatsPDF/MD/TXT/URLPDF/Docs/SlidesPDF/MD/TXTPDF/MD/TXT
Vector DBQdrant/Weaviate/pgvectorN/AElasticsearchLangChain memory
Free tierFree (self-hosted)FreeFreeFree
Multi-userYesYesYesManual
API accessYesNoYesYes
Audio qualityConfigurableFixedN/AN/A
Setup time10 min (Docker)0 min30 min1 hour

Limitations / Honest Assessment #

open-notebook is not for everyone. Here’s when it’s NOT a good fit:

  1. Zero-setup requirement — If you want something that works instantly with no configuration, use Google NotebookLM. open-notebook requires Docker setup and API key configuration.

  2. Non-English documents — The embedding models and LLMs are optimized for English. Non-English documents (especially CJK) may have reduced retrieval quality. You can improve this by using multilingual embedding models like text-embedding-3-large.

  3. Very large collections — While the system handles thousands of documents well, collections exceeding 50K documents may experience slowdowns without a dedicated vector database server. Use Qdrant or Weaviate on a separate machine for large-scale deployments.

  4. Real-time document updates — Documents are processed in batch mode. New documents are not indexed until you explicitly trigger ingestion. For real-time document streaming, consider integrating with a message queue.

  5. Audio TTS quality depends on provider — The audio episode feature quality varies significantly based on which TTS provider you use. OpenAI’s GPT-4o Realtime produces the best quality; local TTS options like Piper sound robotic.

Frequently Asked Questions #

Q: Does open-notebook support local/offline AI models?

A: Yes. By setting AI_PROVIDER=ollama in your .env file and configuring OLLAMA_HOST, you can run everything locally with models like llama3.2, qwen2.5, or nomic-embed-text. No API keys or internet connection required.

Q: What vector databases are supported?

A: open-notebook supports Qdrant (recommended), Weaviate, and Supabase/pgvector. Qdrant offers the best performance for document collections; pgvector is ideal if you already have a PostgreSQL instance.

Q: Can I use open-notebook with OpenRouter?

A: Yes. Set AI_PROVIDER=openrouter and provide your OpenRouter API key. This gives access to 50+ models from different providers through a single API, which is great for cost optimization.

Q: How secure is self-hosted open-notebook?

A: Very secure. All documents, embeddings, and conversations are stored on your own server. No data leaves your infrastructure unless explicitly sent to an external AI provider for processing. You can even run it completely air-gapped with local models.

Q: Can I generate audio episodes with my own voice?

A: Not directly with the current version. Audio episodes use configurable TTS providers (OpenAI, Piper, Edge TTS, etc.). Voice cloning is not currently supported, but this is on the roadmap for future releases.

# docker-compose.yml additional configuration
services:
  open-notebook:
    environment:
      - MAX_UPLOAD_SIZE=100MB
      - ENABLE_TRANSCRIPTION=true
      - TRANSCRIPTION_PROVIDER=whisper
      - EMBEDDING_MODEL=snowflake-arctic-embed
      - VECTOR_STORE=chroma

#

For reliable hosting infrastructure, DigitalOcean provides affordable droplets for development and testing. HTStack offers competitive pricing for production deployments. For proxy and scraping needs, WebShare provides reliable datacenter and residential proxies.

Sources & Further Reading

Conclusion: Your Documents, Your Infrastructure, Your AI #

open-notebook proves that a personal AI research assistant doesn’t need to live on Google’s servers. With 28,200 stars and a growing community, it’s clear developers want a self-hosted alternative to NotebookLM that supports multiple AI providers and keeps their data private.

Whether you’re a researcher managing hundreds of papers, an engineer building an internal knowledge base, or just someone who values document privacy, open-notebook gives you the tools to build a RAG-powered knowledge base that runs on your infrastructure.

Join the dibi8 English Telegram group to discuss open-notebook setups and configurations. Check out our guides on [Agent记忆系统](https://dibi8.com/agentmemory-persistent-m[知识图谱](https://dibi8.com/codegraph-pre-indexed-code-knowledge-graph-ai-agents)arison]([agentmemory guide](https://dibi8.com/agentmemory-*) for complementary knowledge. Try open-notebook today — docker compose up, upload a PDF, and ask it a question.

Some links above are affiliate links. dibi8.com may earn a commission if you sign up, at no extra cost to you. Helps keep the site running and the content free.

📦 Featured in collections

💬 Discussion