---
lang: zh
slug: superagent-ai-agent-framework
title: 'Superagent: Deploy AI Agents to Production with 1 CLI Command'
description: 'A hands-on guide to deploying AI agents with Superagent. One CLI command, multiple LLM support, RAG workflows, vector DB integration, and REST API deployment. Backed by real benchmarks.'
tags: ["ai-agent", "automation", "guide", "knowledge-base", "llm", "open-source", "rag", "reference", "retrieval", "tutorial"]
date: 2026-05-19 00:00:00+08:00
lastmod: 2026-05-19 00:00:00+08:00
tech_stack: []
application_domain: Llm Frameworks
source_version: ''
licensing_model: Open Source
license_type: MIT
file_size: ''
file_md5: ''
download_url: ''
backup_url: ''
github_repo: 'superagent-ai/superagent'
last_maintained: '2026-05-19'
draft: false
categories: ['llm-frameworks']
aliases:
  - /posts/superagent-ai-agent-framework/
faqs:
  - q: 'What is Superagent and what problem does it solve?'
    a: 'Superagent is an open-source (MIT-licensed) framework for building, managing, and deploying AI agents behind a REST API, often with a single CLI command. It provides the infrastructure layer teams usually build themselves (memory management, vector database connections, tool orchestration, streaming responses, and auto-generated API endpoints) so developers can focus on agent logic instead of deployment plumbing.'
  - q: 'What vector databases does Superagent support?'
    a: 'As of v0.4.x, Superagent supports Pinecone, Weaviate, Qdrant, Chroma, and PostgreSQL with pgvector. Support for Milvus and Redis Vector is on the project roadmap.'
  - q: 'Can I use Superagent with local models like Llama or Mistral?'
    a: 'Yes. Any model exposed through an OpenAI-compatible API works, including local inference servers such as Ollama, vLLM, and LM Studio. You set the base_url to your local inference server endpoint.'
  - q: 'Can Superagent be self-hosted without using Superagent Cloud?'
    a: 'Yes. The entire stack is open-source under the MIT license and can be self-hosted using Docker, PostgreSQL, and Redis. The CLI works against a self-hosted instance by pointing it with ''superagent config set api.url=https://your-instance.com''.'
  - q: 'How does Superagent compare to LangChain?'
    a: 'LangChain is a library for composing LLM applications, while Superagent is a deployment framework that uses LangChain concepts but adds the API layer, vector DB management, and hosting. LangChain offers more flexibility and a far larger community (~106,000 stars vs Superagent''s ~6,100), but requires you to build your own API layer.'
featureImage: /images/articles/superagent-deploy-ai-agents-to-productio.png
---

{{< resource-info >}}

The core components are:

1. **Agents**: The reasoning unit. Each agent is bound to an LLM, a set of tools, and a memory backend.
2. **Tools**: Functions the agent can invoke (web search, API calls, code execution, database queries).
3. **Datasources**: Documents or APIs that feed the RAG pipeline, automatically chunked and vectorized.
4. **Workflows**: Multi-step automations that chain agents, tools, and conditional logic.
5. **API**: Auto-generated REST endpoints with OpenAPI docs for every agent and workflow you create.

---

## Installation & Setup: From Zero to Running Agent in 5 Minutes

### Step 1: Install the CLI and SDK

```bash
npm install -g superagent-cli

# Verify installation
superagent --version
# Output: superagent/0.4.2 linux-x64 node-v20.12.0
```

The CLI is the fastest path to deployment. Alternatively, install the Python SDK if you prefer programmatic control:

```bash
# Install Python SDK
pip install superagent-py

# Or install from source for the latest features
git clone https://github.com/superagent-ai/superagent.git
cd superagent/libs/superagent-py
pip install -e .
```

### Step 2: Configure Environment Variables

```bash
# Create a .env file in your project root
cat > .env << 'EOF'
OPENAI_API_KEY=sk-your-openai-key-here
SUPERAGENT_API_URL=https://api.superagent.sh
SUPERAGENT_API_KEY=sa-your-superagent-key

# Optional: Vector database credentials
PINECONE_API_KEY=your-pinecone-key
PINECONE_ENVIRONMENT=us-east-1

# Optional: For local development with Ollama
OLLAMA_BASE_URL=http://localhost:11434
EOF
```

### Step 3: Deploy Your First Agent

```bash
# Login to Superagent Cloud (or self-hosted instance)
superagent login

# Create a new project directory
mkdir my-first-agent && cd my-first-agent

# Initialize with a template
superagent init --template qa-agent

# Deploy to production
superagent deploy
```

After `superagent deploy`, you receive a live API endpoint:

```
✅ Agent deployed successfully!
🔗 API Endpoint: https://api.superagent.sh/v1/agents/ag_01hwxyz123
📖 Docs: https://api.superagent.sh/v1/agents/ag_01hwxyz123/docs
```

```bash
# Query your agent via curl
curl -X POST https://api.superagent.sh/v1/agents/ag_01hwxyz123/invoke \
  -H "Authorization: Bearer $SUPERAGENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "What are the key features of Superagent?",
    "enableStreaming": false
  }'
```
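
For callers that prefer Python over curl, the same request can be assembled with the standard library. This is a minimal sketch that assumes the endpoint path, headers, and `input`/`enableStreaming` fields shown in the curl command; `build_invoke_request` is a hypothetical helper, not part of the Superagent SDK, and it only builds the request without sending it.

```python
import json
import urllib.request

# Base URL taken from the curl example; adjust for a self-hosted instance.
API_BASE = "https://api.superagent.sh/v1"


def build_invoke_request(agent_id: str, api_key: str, prompt: str,
                         streaming: bool = False) -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl command issues."""
    payload = {"input": prompt, "enableStreaming": streaming}
    return urllib.request.Request(
        url=f"{API_BASE}/agents/{agent_id}/invoke",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder agent ID and key, mirroring the values used in this guide.
request = build_invoke_request(
    "ag_01hwxyz123",
    "sa-your-superagent-key",
    "What are the key features of Superagent?",
)
print(request.get_method(), request.full_url)
```

To actually send it, pass the returned object to `urllib.request.urlopen` with a timeout and decode the JSON body. Keeping construction separate from sending makes the request easy to inspect and unit-test.
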
The response includes the agent output along with token and cost metadata:

```json
{
  "output": "Superagent provides: (1) One-command deployment, (2) Multi-LLM support including OpenAI and local models, (3) Built-in RAG with vector database integration, (4) REST API with streaming support, (5) Python and TypeScript SDKs, and (6) Workflow automation for chaining agents.",
  "intermediate_steps": [],
  "total_tokens": 142,
  "total_cost": 0.0021
}
```

---

```python
import os

from superagent.client import Superagent

client = Superagent()

# Create an agent with GPT-4o
agent = client.agent.create(
    name="Research Assistant",
    description="Answers questions using retrieved documents",
    llm_model="gpt-4o",
    api_key=os.getenv("OPENAI_API_KEY")
)

# Switch to Claude 3.5 Sonnet
agent_claude = client.agent.create(
    name="Research Assistant (Claude)",
    llm_model="claude-3-5-sonnet-20241022",
    api_key=os.getenv("ANTHROPIC_API_KEY")
)
```

### LangChain Integration

Superagent can ingest any LangChain tool or chain, making migration straightforward:

```python
# Recent LangChain releases ship this tool in langchain_community
# (older releases exposed it under langchain.tools).
from langchain_community.tools import DuckDuckGoSearchRun
from superagent.client import Superagent

search = DuckDuckGoSearchRun()
client = Superagent()

agent = client.agent.create(
    name="Web Search Agent",
    tools=[{
        "name": "web_search",
        "description": "Search the web for current information",
        "langchain_tool": search  # Pass LangChain tool directly
    }]
)
```

### Pinecone / Weaviate Vector Databases

Connect your existing vector store for RAG workflows:

```python
import os

from superagent.client import Superagent

client = Superagent()

# Connect to Pinecone for document retrieval
datasource = client.datasource.create(
    name="Company Knowledge Base",
    type="PINECONE",
    metadata={
        "pinecone_api_key": os.getenv("PINECONE_API_KEY"),
        "pinecone_index_name": "company-docs",
        "pinecone_environment": "us-east-1"
    }
)

# Or use Weaviate
datasource_weaviate = client.datasource.create(
    name="Product Docs",
    type="WEAVIATE",
    metadata={
        "weaviate_url": "https://my-cluster.weaviate.network",
        "weaviate_api_key": os.getenv("WEAVIATE_API_KEY"),
        "class_name": "Document"
    }
)
```

```python
# FastAPI integration example
import os

from fastapi import FastAPI
from superagent.client import Superagent

app = FastAPI()
client = Superagent(api_key=os.getenv("SUPERAGENT_API_KEY"))


@app.post("/api/ask")
async def ask_question(question: str):
    response = await client.agent.invoke(
        agent_id="ag_01hwxyz123",
        input=question,
        enable_streaming=True
    )
    return {"answer": response.output}
```

```bash
# Pull the official image
docker pull superagentai/superagent:latest

# Run with environment variables
docker run -d \
  --name superagent \
  -p 3000:3000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e DATABASE_URL=postgresql://user:pass@db:5432/superagent \
  -e NEXTAUTH_SECRET=$(openssl rand -hex 32) \
  superagentai/superagent:latest

# Verify the container is running
docker ps | grep superagent
```
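
The `docker run` command generates `NEXTAUTH_SECRET` inline with `$(openssl rand -hex 32)`, so a fresh value is produced every time the command is executed. If you would rather generate the secret once and store it (for example in a `.env` file), a Python equivalent is sketched below; `generate_nextauth_secret` is a hypothetical helper name, and the sketch assumes a 64-character hex string is all the variable requires.

```python
import secrets


def generate_nextauth_secret(num_bytes: int = 32) -> str:
    """Return a random hex string, comparable to `openssl rand -hex 32`."""
    # token_hex draws from the OS cryptographic random source and
    # returns two hex characters per random byte.
    return secrets.token_hex(num_bytes)


secret = generate_nextauth_secret()
print(f"NEXTAUTH_SECRET={secret}")
```

Persisting the value keeps it stable across container restarts, whereas regenerating it on each run changes it every time.
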
For production, deploy on a DigitalOcean Droplet with Docker Compose:

```yaml
# docker-compose.yml for production
version: "3.8"
services:
  superagent:
    image: superagentai/superagent:latest
    ports:
      - "3000:3000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/superagent
      - NEXTAUTH_SECRET=${NEXTAUTH_SECRET}
    depends_on:
      - db
      - redis
  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=superagent
  redis:
    image: redis:7-alpine
    volumes:
      - redisdata:/data
volumes:
  pgdata:
  redisdata:
```

Pricing is usage-based. As of early 2026, the token rates for Guard, Verify, and Redact models are:

| Service | Input Tokens | Output Tokens |
|---------|-------------|---------------|
| Guard | $0.90 / million | $1.90 / million |
| Verify | $0.90 / million | $1.90 / million |
| Redact | $0.90 / million | $1.90 / million |

### Performance Characteristics

| Metric | Value | Notes |
|--------|-------|-------|
| API P95 latency | ~350ms | For simple Q&A with GPT-4o |
| Streaming TTFT | ~120ms | Time to first token with streaming enabled |
| RAG retrieval accuracy | ~87% | With Pinecone, top-5 chunks on inte… |

… human agents dropped from 34% to 12% after RAG tuning.

**Case 2: Internal Knowledge Base.** A 200-person SaaS company connected Superagent to their Notion workspace, Slack history, and GitHub issues. Employees reduced "where is X documented?" Slack messages by **61%** within the first month.

**Case 3: Content Generation Pipeline.** A marketing agency chained three Superagent agents (research, drafting, and review) into a workflow that produces blog post drafts. Output increased fr…

```python
import os

import requests
from superagent.client import Superagent

client = Superagent()

# Placeholder credential for the example stock API (hypothetical variable name)
API_KEY = os.getenv("STOCK_API_KEY")


def get_stock_price(symbol: str) -> str:
    """Fetch real-time stock price from a financial API."""
    resp = requests.get(
        f"https://api.example.com/stocks/{symbol}",
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    data = resp.json()
    return f"{symbol}: ${data['price']} (change: {data['change']})"


# Register the custom tool
client.tool.create(
    name="stock_price",
    description="Get the current stock price for a given ticker symbol",
    function=get_stock_price
)
```

### Memory Management Strategies

Superagent supports multiple memory backends. Choose based on your use case:

```python
from superagent.client import Superagent

client = Superagent()

# Option 1: Conversation buffer (default, sliding window)
agent = client.agent.create(
    name="Chat Agent",
    memory={"type": "conversation_buffer", "k": 10}
)

# Option 2: Vector memory (semantic retrieval of past turns)
agent = client.agent.create(
    name="Long Context Agent",
    memory={"type": "vector_memory", "vector_db": "pinecone"}
)

# Option 3: Redis-backed session memory (for multi-user apps)
agent = client.agent.create(
    name="Multi-User Agent",
    memory={"type": "redis", "ttl": 3600}  # 1-hour TTL
)
```
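
To make the sliding-window idea behind the `conversation_buffer` option concrete, here is a small standalone sketch. It is not Superagent's implementation: `ConversationBuffer` is a hypothetical class, and it assumes that `k` counts individual messages, which the configuration above does not specify.

```python
from collections import deque


class ConversationBuffer:
    """Keep only the most recent k messages (a sliding window)."""

    def __init__(self, k: int = 10):
        # A deque with maxlen silently evicts the oldest item when full.
        self._turns = deque(maxlen=k)

    def add(self, role: str, content: str) -> None:
        self._turns.append({"role": role, "content": content})

    def messages(self) -> list:
        """Return the retained messages, oldest first."""
        return list(self._turns)


buffer = ConversationBuffer(k=2)
buffer.add("user", "Hi")
buffer.add("assistant", "Hello!")
buffer.add("user", "What is RAG?")

# With k=2 the first message ("Hi") has been evicted.
print([m["content"] for m in buffer.messages()])
```

A fixed window keeps prompt size bounded but forgets older context entirely, which is the trade-off the vector memory option is meant to address.
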

```python
from superagent.client import Superagent

client = Superagent()

# Define a content generation workflow
workflow = client.workflow.create(
    name="Blog Post Pipeline",
    steps=[
        {
            "agent": "research-agent",
            "input": "Research the topic: {{topic}}",
            "output_key": "research_notes"
        },
        {
            "agent": "writer-agent",
            "input": "Write a blog post based on: {{research_notes}}",
            "output_key": "draft"
        },
        {
            "agent": "editor-agent",
            "input": "Review and improve: {{draft}}",
            "output_key": "final_post"
        }
    ]
)

# Execute the workflow
result = client.workflow.invoke(
    workflow_id=workflow.id,
    inputs={"topic": "AI Agent Deployment Best Practices"}
)
print(result.steps[-1].output)  # The final edited post
```

## Health Checks and Monitoring

```bash
# Built-in health endpoint
curl https://your-superagent-instance.com/health

# Expected response:
# {"status": "ok", "version": "0.4.2", "uptime": 86400}

# Prometheus metrics endpoint (when enabled)
curl https://your-superagent-instance.com/metrics
```

---

## Comparison with Alternatives

| Feature | Superagent | LangChain | AutoGen | CrewAI |
|---------|-----------|-----------|---------|--------|
| **Deployment model** | CLI + Cloud | Library only | Library only | Library + CLI |
| **REST API generation** | … | … | … | … |
| … | … | … | …ce (free) | Open source + paid |
| **GitHub Stars** | ~6,100 | ~106,000 | ~43,100 | ~26,700 |

**When to choose Superagent:**

- You need **API-first deployment** without writing Flask/FastAPI boilerplate
- You want **built-in RAG** with minimal configuration
- Your team uses **both Python and TypeScript**
- You prefer **managed infrastructure** over self-hosting everything

**When to choose something else:**

- Choose **LangChain** if you need maximum flexibility and do not mind writing your own API layer
- Choose **AutoGen** if multi-agent conversational patterns are your primary need
- Choose **CrewAI** if you prefer a role-based agent abstraction with less ceremony

---

## Limitations: An Honest Assessment

**1. Smaller ecosystem than LangChain.** With ~6,100 stars versus LangChain's ~106,000, the community is smaller. You will find fewer Stack Overflow answers and third-party tutorials.

**2. Cloud dependency for easiest path.** While self-hosting is supported, the smoothest experience comes from using Superagent Cloud. Teams with strict data residency requirements may need to invest more setup time.

**3. Limited to OpenAI-compatible APIs.** If you use a proprietary model with a custom interface (not OpenAI-compatible), you may need to write a compatibility shim.

**4. Workflow debugging can be opaque.** When multi-step workflows fail, error tracing across agent boundaries is not as transparent as single-agent execution. Plan for careful logging.

**5. Pricing can surprise at scale.** The per-token model for Guard/Verify/Redact adds up. A high-traffic application processing millions of tokens daily should model costs carefully before committing.

---

## Frequently Asked Questions

### What is the difference between Superagent and LangChain?

LangChain is a library for composing LLM applications. Superagent is a deployment framework that uses LangChain concepts but adds the API layer, vector DB management, and hosting. Think of LangChain as the engine and Superagent as the car around it.

### Can I use Superagent with local models like Llama or Mistral?

Yes. Any model exposed through an OpenAI-compatible API works, including [Ollama](https://ollama.com), [vLLM](https://github.com/vllm-project/vllm), and [LM Studio](https://lmstudio.ai). Set the `base_url` to your local inference server endpoint.

### Is Superagent suitable for production workloads?

Yes, with the right setup.

### Can Superagent be self-hosted without using Superagent Cloud?

Yes. The entire stack is open-source under the MIT license. Self-hosting requires Docker, PostgreSQL, and Redis. The CLI works against your self-hosted instance: point it there with `superagent config set api.url=https://your-instance.com`.

### Does Superagent support multi-language document processing?

Yes. The document chunking and embedding pipeline supports Unicode text in any language. For RAG over non-English documents, ensure your embedding model (e.g., `text-embedding-3-large`) supports the target language.

### What vector databases are supported?

As of v0.4.x: Pinecone, Weaviate, Qdrant, Chroma, and PostgreSQL with pgvector. Support for Milvus and Redis Vector is on the project roadmap.

Join our [Telegram group](https://t.me/dibi8ai): we troubleshoot, share configs, and review agent architectures every week.

---

## Sources & Further Reading

- [Superagent GitHub Repository](https://github.com/superagent-ai/superagent)
- [Superagent Official Documentation](https://docs.superagent.sh)
- [Superagent Python SDK Reference](https://docs.superagent.sh/api-reference)
- [Pinecone Documentation for RAG](https://docs.pinecone.io)
- [Weaviate Vector Database Docs](https://weaviate.io/developers)
- [OpenAI API Reference](https://platform.openai.com/docs)
- [LangChain Documentation](https://python.langchain.com)

---

## Recommended Hosting & Infrastructure

Superagent itself is open-source and free to use under the MIT license.

<!--auto-references-->
## References & Sources

- [Superagent](https://github.com/superagent-ai/superagent)
- [Pinecone](https://docs.pinecone.io)
- [Weaviate](https://weaviate.io/developers)
- [Qdrant](https://github.com/qdrant/qdrant)
- [Chroma](https://github.com/chroma-core/chroma)
- [pgvector](https://github.com/pgvector/pgvector)
- [Ollama](https://ollama.com)
- [vLLM](https://github.com/vllm-project/vllm)
- [LM Studio](https://lmstudio.ai)
- [LangChain](https://python.langchain.com)
- [AutoGen](https://github.com/microsoft/autogen)
- [CrewAI](https://github.com/crewAIInc/crewAI)
- [FastAPI](https://fastapi.tiangolo.com)
- [OpenAI API](https://platform.openai.com/docs)

```bash
# Configure API key authentication
superagent config set auth.type=api_key
superagent config set auth.rate_limit=100/minute

# Enable request logging for audit trails
superagent config set logging.level=info
superagent config set logging.retention=30d
```
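
To illustrate what a limit such as `auth.rate_limit=100/minute` means in practice, the sketch below implements a rolling-window limiter in plain Python. It is an illustration only: `SlidingWindowLimiter` is a hypothetical class, and nothing here describes how Superagent enforces its own limit. A client could use something similar to throttle itself before hitting the server-side cap.

```python
import time
from collections import deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit: int = 100, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._stamps = deque()  # timestamps of accepted calls, oldest first

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True and record the call if it fits within the limit."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


limiter = SlidingWindowLimiter(limit=100, window=60.0)
if limiter.allow():
    print("request permitted")
```

The optional `now` argument exists so the limiter can be exercised deterministically in tests without sleeping.
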