LangChain: 3 Ways to Deploy Production-Ready AI Agents with
LangChain (LC) is a Python/JS framework for building LLM applications, with 700+ integrations. Learn how to install LangChain, deploy it with Docker, integrate it with OpenAI, Anthropic, and Ollama, and scale it in production with LangSmith, LangGraph agents, and Kubernetes.
- MIT
- Updated 2026-05-19
```yaml
apiVersion: v1
kind: Service
metadata:
  name: langchain-service
spec:
  selector:
    app: langchain-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: ClusterIP
```
```bash
# Deploy to Kubernetes
kubectl apply -f k8s-deployment.yaml
kubectl get pods -l app=langchain-app
kubectl logs -f deployment/langchain-app
```
Redis Caching for Frequent Queries #
```python
import hashlib
import json

import redis
from langchain_core.globals import set_llm_cache
from langchain_community.cache import RedisCache

# Connect to Redis and register it as LangChain's global LLM cache
redis_client = redis.Redis.from_url("redis://localhost:6379")
set_llm_cache(RedisCache(redis_=redis_client))  # RedisCache takes the client as redis_


# Cache key based on input hash (MD5 is used for key derivation only, not security)
def get_cache_key(prefix: str, text: str) -> str:
    hash_val = hashlib.md5(text.encode()).hexdigest()
    return f"{prefix}:{hash_val}"


# Check cache before expensive LLM call; hits and misses return the same shape
def cached_invoke(chain, inputs: dict, ttl: int = 3600) -> dict:
    cache_key = get_cache_key("llm", json.dumps(inputs, sort_keys=True))
    cached = redis_client.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    result = chain.invoke(inputs)
    # Chat models return a message with .content; chains ending in a string parser return str
    payload = {"output": getattr(result, "content", result)}
    redis_client.setex(cache_key, ttl, json.dumps(payload))
    return payload
```
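The key derivation above relies on `json.dumps(..., sort_keys=True)` so that logically equal inputs map to the same Redis key. Here is a minimal, stdlib-only sketch of that property. The `llm` prefix and the sample inputs are illustrative, and no Redis connection is needed to run it:

```python
import hashlib
import json


def get_cache_key(prefix: str, text: str) -> str:
    """Same derivation as above: prefix plus the MD5 hex digest of the text."""
    return f"{prefix}:{hashlib.md5(text.encode()).hexdigest()}"


# Two dicts with the same content but different insertion order.
a = {"question": "What is LangChain?", "lang": "en"}
b = {"lang": "en", "question": "What is LangChain?"}

# With sort_keys=True both dicts serialize identically, so they share one key.
key_a = get_cache_key("llm", json.dumps(a, sort_keys=True))
key_b = get_cache_key("llm", json.dumps(b, sort_keys=True))

# Without sort_keys the serialization follows insertion order.
raw_a = get_cache_key("llm", json.dumps(a))
raw_b = get_cache_key("llm", json.dumps(b))

print(key_a == key_b)  # True: one cache entry serves both
print(raw_a == raw_b)  # False: the same request would miss the cache
```

If callers build input dicts in different places, dropping `sort_keys=True` quietly lowers the hit rate without producing any error.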
Comparison with Alternatives #
| Feature | LangChain | LlamaIndex | Haystack | Semantic Kernel |
|---|---|---|---|---|
| Primary Focus | Multi-step workflows, agent orchestration | Document indexing, retrieval optimization | Semantic search, RAG pipelines | Enterprise integration, Microsoft ecosystem |
| Language Support | Python, TypeScript | Python, TypeScript | Python | C#, Python, Java |
| GitHub Stars | 137,165 | 39,200 | 17,900 | 26,300 |
| License | MIT | MIT | Apache-2.0 | MIT |
| Integrations | 700+ | 300+ | 150+ | 80+ |
| RAG Performance | Good | Excellent | Good | Moderate |
| Agent Capabilities | Advanced (LangGraph) | Basic | Limited | Moderate (Planner) |
| Observability | LangSmith (native) | Basic callbacks + LangSmith | Built-in pipeline viz | Azure Monitor |
| Time to RAG | 2-3 days | 1-2 days | 3-5 days | 4-7 days |
| Production Maturity | LTS 1.0 (Oct 2025) | Pre-1.0, stable | 1.0+ stable | 1.0+ stable |
| Cost (Framework) | Free | Free | Free | Free |
| Managed Cloud | LangSmith $39/user/mo | LlamaCloud usage-based | deepset Cloud custom | Azure AI Services |
| Learning Curve | Moderate | Gentle | Moderate | Moderate |
| Human-in-the-Loop | Native (LangGraph) | Limited | Basic | Via Azure Logic Apps |
| Best For | Complex agents, multi-tool workflows | Document Q&A, knowledge bases | Enterprise search, compliance | Microsoft shops, .NET teams |
Limitations / Honest Assessment #
Not the fastest for pure retrieval. If your use case is exclusively document search and retrieval, LlamaIndex or Haystack will outperform LangChain on latency and accuracy benchmarks. LangChain’s strength is orchestration, not raw retrieval speed.
Steeper learning curve for simple use cases. A basic “chat with PDF” app requires understanding loaders, splitters, embeddings, vector stores, and chains. Tools like RAGFlow or Verba offer faster paths for non-developers.
Rapid evolution creates version drift. Despite the 1.0 LTS promise, the ecosystem moves fast. Community integrations (langchain-community) can introduce breaking changes on minor releases. Pin exact versions in production.
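One way to catch drift early is to compare installed package versions against your pins when the service starts. The sketch below uses only the standard library. The `PINS` mapping and its version numbers are placeholders rather than recommendations, and `find_version_drift` is a hypothetical helper name:

```python
from importlib import metadata

# Placeholder pins; in practice these mirror your lock file.
PINS = {
    "langchain": "1.0.0",
    "langchain-community": "0.4.0",
}


def find_version_drift(pins: dict, installed=metadata.version) -> dict:
    """Return {package: (pinned, installed)} for every package that differs from its pin."""
    drift = {}
    for package, pinned in pins.items():
        try:
            found = installed(package)
        except metadata.PackageNotFoundError:
            found = None  # the package is not installed at all
        if found != pinned:
            drift[package] = (pinned, found)
    return drift


for package, (pinned, found) in find_version_drift(PINS).items():
    print(f"{package}: pinned {pinned}, installed {found}")
```

Failing the container start (or at least logging loudly) when the result is non-empty keeps an unpinned transitive upgrade from reaching production unnoticed.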
LangSmith costs scale with usage. The free tier covers 5,000 traces monthly — enough for prototyping but not production. A 5-person team processing 100,000 traces monthly pays approximately $220/month for LangSmith alone, excluding LLM API costs.
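To see where that figure comes from, it helps to split it into seats and usage. The arithmetic below uses only numbers quoted in this article. It assumes the quoted total is seat fees plus trace usage, which is an inference rather than a published price breakdown:

```python
SEAT_PRICE_USD = 39     # per user per month, as quoted in this article
TEAM_SIZE = 5
QUOTED_TOTAL_USD = 220  # approximate monthly total quoted above

seat_cost = SEAT_PRICE_USD * TEAM_SIZE
# Assumption: whatever is not seat cost is trace usage.
implied_usage_cost = QUOTED_TOTAL_USD - seat_cost

print(f"Seats: ${seat_cost}/month")                  # Seats: $195/month
print(f"Implied usage: ${implied_usage_cost}/month")  # Implied usage: $25/month
```

On these numbers the seat fees dominate, so team size moves the bill more than trace volume does at this scale.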
Over-engineering risk. LangChain’s flexibility tempts developers to build complex agent graphs where a simple prompt + API call would suffice. Start simple, add complexity only when justified by metrics.
Limited C# and Java ecosystem. Teams in Microsoft-centric environments may find Semantic Kernel’s first-class .NET support more natural than LangChain’s Python-first approach.
Frequently Asked Questions #
What is the difference between LangChain and LangGraph? #
LangChain is the core framework for building LLM applications with chains, prompts, and model integrations. LangGraph is an extension library that adds graph-based orchestration for complex agent workflows with cycles, branching, and human-in-the-loop approvals. Think of LangChain as the component library and LangGraph as the workflow engine. Both are maintained by LangChain Inc and share the same release cycle.
How do I switch between LLM providers in LangChain? #
Change the model class import. LangChain’s standardized BaseChatModel interface means code written for OpenAI works with Anthropic, Google, Ollama, or any supported provider with minimal changes. The .content_blocks property in 1.0+ standardizes message formats across all providers, eliminating provider-specific parsing code.
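As an alternative to swapping imports by hand, the provider can be chosen from configuration. The sketch below assumes `init_chat_model` from `langchain.chat_models` and its `provider:model` identifier format. The model names are placeholders to adjust, `MODEL_IDS`, `model_id`, and `build_chat_model` are hypothetical names, and each provider still needs its own integration package and credentials:

```python
# Provider-prefixed model identifiers; the model names are placeholders.
MODEL_IDS = {
    "openai": "openai:gpt-4o-mini",
    "anthropic": "anthropic:claude-3-5-sonnet-latest",
    "ollama": "ollama:llama3.1",
}


def model_id(provider: str) -> str:
    """Look up the identifier for a provider, failing loudly on unknown names."""
    try:
        return MODEL_IDS[provider]
    except KeyError:
        raise ValueError(f"Unsupported provider: {provider!r}") from None


def build_chat_model(provider: str, **kwargs):
    """Create a chat model; needs langchain plus the provider package (e.g. langchain-openai)."""
    # Imported lazily so this module loads even where LangChain is not installed.
    from langchain.chat_models import init_chat_model

    return init_chat_model(model_id(provider), **kwargs)
```

With the packages and API keys in place, `build_chat_model("anthropic", temperature=0).invoke("Hello")` and the same call with `"openai"` should return messages of the same type, so downstream code does not need to branch on the provider.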
Is LangChain free for commercial use? #
Yes. LangChain is MIT licensed and free for commercial and personal use. The core framework, LangGraph, and all community integrations carry no licensing fees. LangSmith (the observability platform) offers a free tier with 5,000 traces monthly; paid plans start at $39 per user per month. LLM API costs from OpenAI, Anthropic, or other providers are billed separately.
What is the recommended deployment stack for LangChain in production? #
For production deployments, use Docker containers running an ASGI server (Uvicorn, optionally with Gunicorn as the process manager), Redis for caching and session state, a vector store (Chroma for small scale, Pinecone or Weaviate for large scale), and LangSmith for observability. Deploy on Kubernetes for horizontal scaling. Set resource limits, health checks, and rate limiting. Pin all dependency versions and run evaluations before each deployment.
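Rate limiting is the item in that list most often left abstract, so here is one common way to do it: a token bucket. This is a stdlib-only sketch, not a LangChain API. `TokenBucket` is a hypothetical class, and the injectable `clock` exists only to make it testable:

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=5, capacity=10)
print(bucket.allow())  # True: the bucket starts full
```

A request handler would return HTTP 429 when `allow()` is false. This version is per-process. Behind several replicas, the counter would need to live in shared storage such as Redis. For throttling outgoing model calls rather than incoming requests, `langchain_core.rate_limiters.InMemoryRateLimiter` is worth a look.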
How does LangChain handle errors and retries? #
LangChain provides built-in retry logic with exponential backoff through the max_retries parameter on model classes. For production, wrap critical paths with Tenacity for fine-grained control over retry policies. Use structured exception handling to distinguish between retriable errors (rate limits, timeouts) and terminal errors (invalid inputs, authentication failures). Log all failures to LangSmith for post-incident analysis.
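The split between retriable and terminal errors can be sketched without any dependencies. The code below illustrates the pattern, not Tenacity's or LangChain's API. `RetriableError`, `TerminalError`, and `call_with_backoff` are hypothetical names, and in real code you would map provider exceptions onto the two categories:

```python
import random
import time


class RetriableError(Exception):
    """Hypothetical marker for rate limits and timeouts."""


class TerminalError(Exception):
    """Hypothetical marker for invalid input or authentication failures."""


def call_with_backoff(fn, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying RetriableError with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RetriableError:
            if attempt == max_retries:
                raise  # retry budget exhausted, surface the last error
            delay = base_delay * (2 ** attempt)  # 0.5s, 1s, 2s, ...
            sleep(delay + random.uniform(0, delay / 2))
        # TerminalError and any other exception propagate immediately.
```

A call site might look like `call_with_backoff(lambda: chain.invoke(inputs))`. The jitter keeps many workers that hit the same rate limit from retrying at the same instant.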
Can I self-host LangSmith? #
Self-hosted LangSmith is available only on Enterprise plans with custom pricing. For teams requiring on-premises observability, open-source alternatives include Langfuse (MIT license), Phoenix by Arize (free), and Helicone (open source). These integrate with LangChain via OpenTelemetry or direct callbacks.
How do I scale LangChain agents to handle 1000+ concurrent users? #
Scale horizontally by running multiple container instances behind a load balancer. Use async patterns (ainvoke, astream) to maximize throughput per worker. Implement Redis caching for frequently asked queries. Set up connection pooling for databases and external APIs. Monitor token usage and costs per request via LangSmith. Consider using a queue system (Celery, RQ) for long-running agent tasks rather than synchronous HTTP requests.
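The async advice can be made concrete with a bounded fan-out: many requests in flight, but never more than a fixed number at once. In this sketch `fake_ainvoke` is a stand-in for a real `chain.ainvoke` call, and `run_bounded` and the limit values are illustrative:

```python
import asyncio


async def fake_ainvoke(prompt: str) -> str:
    """Stand-in for chain.ainvoke(); replace with a real async call."""
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"


async def run_bounded(prompts, ainvoke=fake_ainvoke, limit=10):
    """Run all prompts concurrently, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def one(prompt):
        async with semaphore:
            return await ainvoke(prompt)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(one(p) for p in prompts))


results = asyncio.run(run_bounded([f"q{i}" for i in range(25)], limit=5))
print(len(results))  # 25
```

The limit protects the upstream provider's rate limits and your own connection pool. Tune it per worker, since each replica applies its own cap.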
Conclusion #
LangChain’s 137,000 GitHub stars reflect its position as the default framework for production LLM applications. The 1.0 LTS release brought the stability that enterprise deployments demand: semantic versioning, standardized interfaces, and guaranteed backward compatibility. This guide covered how to install LangChain, containerize with Docker, deploy on Kubernetes, and harden with observability and error handling.
Your next steps:
- Clone the LangChain repository and run the quickstart
- Deploy a Docker container with your first agent using the Dockerfile and compose file above
- Set up LangSmith tracing to establish observability baselines before going live
- Join the LangChain community on Discord and the Telegram AI Dev Group for production deployment discussions