How do I install Chroma and run it in Python?

Run `pip install chromadb`, then create a client: `chromadb.Client()` for in-memory testing or `chromadb.PersistentClient(path="./chroma_db")` to save data on disk. For production, run the Docker image `chromadb/chroma:latest` and connect with `chromadb.HttpClient(host="localhost", port=8000)`.

How many vectors can Chroma handle on a single machine?

On a single machine with 32GB RAM, Chroma comfortably handles 5–10 million vectors at 384 dimensions. Beyond that you hit memory limits during index construction, so you should shard across instances or use the Chroma Cloud tier. Chroma has no built-in distributed clustering.
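
To see where the 5–10 million figure comes from, here is a rough back-of-the-envelope sketch. It assumes each dimension is stored as a 4-byte float32 and counts only the raw vectors; the helper name `raw_vector_bytes` is hypothetical, and HNSW graph links, documents, and metadata add further overhead on top.

```python
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for the raw embedding values alone (assumes float32 by default)."""
    return num_vectors * dims * bytes_per_value


if __name__ == "__main__":
    for n in (5_000_000, 10_000_000):
        gib = raw_vector_bytes(n, 384) / 2**30  # convert bytes to GiB
        print(f"{n:,} vectors x 384 dims: about {gib:.1f} GiB of raw vector data")
```

Under these assumptions, 10 million 384-dimensional vectors take about 14.3 GiB before any index overhead, which is why a 32GB machine starts to run out of headroom in that range.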

Can Chroma run offline without an internet connection or API key?

Yes. Using the `SentenceTransformerEmbeddingFunction` with a pre-downloaded model (such as `all-MiniLM-L6-v2`), Chroma operates entirely offline with no API keys or cloud calls. You can also disable telemetry by setting `ANONYMIZED_TELEMETRY=FALSE`, making it suitable for air-gapped environments.

How much faster is Chroma than NumPy brute-force vector search?

Chroma's HNSW index is roughly 150x faster than naive NumPy brute-force at 10,000 vectors and about 10,000x faster at 1 million vectors. NumPy is only viable below ~1,000 vectors and also lacks persistence, metadata filtering, and concurrent query support.
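
For reference, the "naive NumPy brute-force" baseline looks roughly like the sketch below: every query is compared against every stored vector, so cost grows linearly with collection size. The function name `brute_force_top_k` and the random demo data are illustrative assumptions, not part of Chroma or of any benchmark quoted above.

```python
import numpy as np


def brute_force_top_k(query, vectors, k=3):
    """Return (indices, scores) of the k rows of `vectors` most cosine-similar to `query`."""
    vectors = np.asarray(vectors, dtype=np.float32)
    query = np.asarray(query, dtype=np.float32)
    # Normalize so that a dot product equals cosine similarity.
    unit_vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    unit_query = query / np.linalg.norm(query)
    scores = unit_vectors @ unit_query   # one score per stored vector
    top = np.argsort(-scores)[:k]        # highest similarity first
    return top, scores[top]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stored = rng.normal(size=(10_000, 384))  # toy stand-in for real embeddings
    indices, scores = brute_force_top_k(stored[42], stored, k=3)
    print(indices, scores)
```

An ANN index such as HNSW avoids this full scan by visiting only a small part of a neighbor graph, trading a little recall for speed.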

How do I filter Chroma query results by metadata?

Pass a `where` clause to `collection.query()`, for example `where={"source": "blog"}` for exact matches, `where={"year": {"$gte": 2025}}` for numeric comparisons, or `where={"$and": [{"category": "tutorial"}, {"difficulty": "advanced"}]}` for logical operators. Metadata is supplied as key-value pairs when you add documents.

Chroma DB 2026: A Developer-Friendly Vector Database for RAG with 50x Faster Embedding Search (Python Guide, dibi8.com)

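## Introduction: Why Your RAG Pipeline Needs a Better Vector Store

You built a RAG application. It handles 500 documents well. Then you reach 50,000 and search starts to crawl: latency jumps from 200 ms to 4 seconds, and your users notice. You tried PostgreSQL with pgvector, but the setup felt like configuring a spaceship. You tried Pinecone, but the pricing grew faster than your traffic.

This is exactly the problem Chroma solves. Chroma is a developer-first vector database designed for the 90% of AI applications that do not need distributed cluster orchestration. What they need is fast embedding search, simple setup, and a Python API that actually makes sense.

As of May 2026, Chroma has passed 18,000 GitHub stars and ships v0.6.x with persistent storage, metadata filtering, and a query engine that retrieves 50x faster than naive flat-index brute force on datasets of more than 1M vectors. The project is maintained by the Chroma team under Apache-2.0 and is the default vector store in the LangChain and LlamaIndex quickstart guides.

This guide takes you from `pip install` to a production-ready RAG in 30 minutes. No vector database experience is required.

## What Is Chroma? (One-Sentence Definition)

Chroma is an open-source, embedding-native vector database with a Python-first API that stores documents alongside their vector embeddings, then retrieves the most semantically similar results using approximate nearest neighbor (ANN) search.

Unlike traditional databases with a vector extension bolted on, Chroma is built from the ground up for the embedding workflow: add documents → generate embeddings → query by meaning. It supports in-memory (development) and persistent on-disk (production) storage modes, and runs locally in Docker or on a VPS with zero external dependencies.

## How Chroma Works: Architecture and Core Concepts

Chroma's architecture is deliberately simple. Understanding three core concepts gets you 80% of the way there:

### Collections

A collection is a container for related documents and their embeddings. Think of it as a table in SQL, but schemaless and vector-native. You create one collection per document type (for example `legal_docs`, `product_manuals`, `support_tickets`).

### Embeddings

Every document you add is converted by an embedding model into a vector (an array of floats, typically 384 to 1536 dimensions). Chroma can generate embeddings automatically with a default model (such as `all-MiniLM-L6-v2`) or accept precomputed vectors from OpenAI, Cohere, or any custom model.

### Querying by Vector Similarity

When you query, Chroma converts your text into the same vector space, then uses an HNSW (Hierarchical Navigable Small World) index to find the nearest neighbors in sub-millisecond time. The HNSW index provides a 50x speedup over brute-force cosine similarity.

### Storage Modes

| Mode | Persistence | Use case | Performance |
|---|---|---|---|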
