LangChain: 3 Ways to Deploy Production-Ready AI Agents with

LangChain (LC) is a Python/JS framework for building LLM applications, with 700+ integrations. Learn how to install LangChain, deploy it with Docker, integrate it with OpenAI, Anthropic, and Ollama, and scale to production with LangSmith, LangGraph agents, and Kubernetes.

  • MIT
  • Updated 2026-05-19

```yaml
apiVersion: v1
kind: Service
metadata:
  name: langchain-service
spec:
  selector:
    app: langchain-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: ClusterIP
```


```bash
# Deploy to Kubernetes
kubectl apply -f k8s-deployment.yaml
kubectl get pods -l app=langchain-app
kubectl logs -f deployment/langchain-app
```

Redis Caching for Frequent Queries #

```python
import hashlib
import json

import redis
from langchain_community.cache import RedisCache
from langchain_core.globals import set_llm_cache

# Connect to Redis and register it as LangChain's global LLM cache
redis_client = redis.Redis.from_url("redis://localhost:6379")
set_llm_cache(RedisCache(redis_client))  # the client is the first positional argument


# Cache key based on input hash
def get_cache_key(prefix: str, text: str) -> str:
    # MD5 is only a fast fingerprint here, not a security measure
    hash_val = hashlib.md5(text.encode()).hexdigest()
    return f"{prefix}:{hash_val}"


# Check cache before expensive LLM call
def cached_invoke(chain, inputs: dict, ttl: int = 3600) -> dict:
    # sort_keys=True keeps the key stable regardless of dict ordering
    cache_key = get_cache_key("llm", json.dumps(inputs, sort_keys=True))
    cached = redis_client.get(cache_key)
    if cached is not None:
        return json.loads(cached)

    result = chain.invoke(inputs)
    # Assumes the chain ends in a chat model, so the result exposes .content
    payload = {"output": result.content}
    redis_client.setex(cache_key, ttl, json.dumps(payload))
    return payload  # same shape on a cache hit and on a miss
```
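The cache key is built from a hash of the serialized inputs, so two requests share a cached answer only if they serialize to the same text. The standard-library sketch below illustrates why sorted keys matter. `make_cache_key` is a hypothetical helper written for this example, and SHA-256 is an assumption of the sketch rather than something the snippet above requires.

```python
import hashlib
import json


def make_cache_key(prefix: str, inputs: dict) -> str:
    """Build a deterministic cache key from a dict of chain inputs."""
    # sort_keys=True makes the JSON text independent of dict insertion order.
    canonical = json.dumps(inputs, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"


a = make_cache_key("llm", {"question": "What is RAG?", "lang": "en"})
b = make_cache_key("llm", {"lang": "en", "question": "What is RAG?"})
c = make_cache_key("llm", {"question": "What is RAG", "lang": "en"})

print(a == b)  # True: same inputs, different key order
print(a == c)  # False: the question text differs by one character
```

Note that the key is an exact match on the input text. Paraphrased questions produce different keys and will miss the cache.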

Comparison with Alternatives #

| Feature | LangChain | LlamaIndex | Haystack | Semantic Kernel |
|---|---|---|---|---|
| Primary Focus | Multi-step workflows, agent orchestration | Document indexing, retrieval optimization | Semantic search, RAG pipelines | Enterprise integration, Microsoft ecosystem |
| Language Support | Python, TypeScript | Python, TypeScript | Python | C#, Python, Java |
| GitHub Stars | 137,165 | 39,200 | 17,900 | 26,300 |
| License | MIT | MIT | Apache-2.0 | MIT |
| Integrations | 700+ | 300+ | 150+ | 80+ |
| RAG Performance | Good | Excellent | Good | Moderate |
| Agent Capabilities | Advanced (LangGraph) | Basic | Limited | Moderate (Planner) |
| Observability | LangSmith (native) | Basic callbacks + LangSmith | Built-in pipeline viz | Azure Monitor |
| Time to RAG | 2-3 days | 1-2 days | 3-5 days | 4-7 days |
| Production Maturity | LTS 1.0 (Oct 2025) | Pre-1.0, stable | 1.0+ stable | 1.0+ stable |
| Cost (Framework) | Free | Free | Free | Free |
| Managed Cloud | LangSmith $39/user/mo | LlamaCloud usage-based | deepset Cloud custom | Azure AI Services |
| Learning Curve | Moderate | Gentle | Moderate | Moderate |
| Human-in-the-Loop | Native (LangGraph) | Limited | Basic | Via Azure Logic Apps |
| Best For | Complex agents, multi-tool workflows | Document Q&A, knowledge bases | Enterprise search, compliance | Microsoft shops, .NET teams |

Limitations / Honest Assessment #

Not the fastest for pure retrieval. If your use case is exclusively document search and retrieval, LlamaIndex or Haystack will outperform LangChain on latency and accuracy benchmarks. LangChain’s strength is orchestration, not raw retrieval speed.

Steeper learning curve for simple use cases. A basic “chat with PDF” app requires understanding loaders, splitters, embeddings, vector stores, and chains. Tools like RAGFlow or Verba offer faster paths for non-developers.

Rapid evolution creates version drift. Despite the 1.0 LTS promise, the ecosystem moves fast. Community integrations (langchain-community) can introduce breaking changes on minor releases. Pin exact versions in production.
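One way to enforce exact pinning is a small check that runs in CI before each build. The sketch below is a simplified illustration using only the standard library. `find_unpinned` is a hypothetical helper, the package names and versions are placeholders, and the pattern does not handle environment markers or URL requirements.

```python
import re

# Matches "name==version" with optional extras, e.g. "pkg[extra]==1.2.3".
PINNED = re.compile(r"^[A-Za-z0-9_.\-]+(\[[^\]]+\])?==[^=\s]+$")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Placeholder package names and versions, for illustration only.
sample = """
example-core==1.0.0
example-community>=0.4   # a range: may drift between builds
example-extras
"""

print(find_unpinned(sample))  # ['example-community>=0.4', 'example-extras']
```

A non-empty result can be used to fail the build, so that a version range never reaches production unnoticed.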

LangSmith costs scale with usage. The free tier covers 5,000 traces monthly — enough for prototyping but not production. A 5-person team processing 100,000 traces monthly pays approximately $220/month for LangSmith alone, excluding LLM API costs.
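The $220 estimate can be broken down with simple arithmetic. The sketch below uses only figures quoted in this article (5 seats, $39 per user per month, roughly $220 in total). The split between seats and traces is inferred from those numbers and is not taken from an official price list.

```python
SEATS = 5
PRICE_PER_SEAT = 39   # USD per user per month, as quoted in this article
QUOTED_TOTAL = 220    # USD per month, the approximate figure quoted above
TRACES = 100_000      # traces per month in the example

seat_cost = SEATS * PRICE_PER_SEAT
trace_cost = QUOTED_TOTAL - seat_cost  # remainder implied by the quoted total

print(seat_cost)   # 195
print(trace_cost)  # 25
print(round(QUOTED_TOTAL / TRACES * 1000, 2))  # 2.2 USD per 1,000 traces overall
```

Under these assumptions, seats account for most of the bill at this volume, and trace charges become the larger share only as volume grows.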

Over-engineering risk. LangChain’s flexibility tempts developers to build complex agent graphs where a simple prompt + API call would suffice. Start simple, add complexity only when justified by metrics.

Limited C# and Java ecosystem. Teams in Microsoft-centric environments may find Semantic Kernel’s first-class .NET support more natural than LangChain’s Python-first approach.

Frequently Asked Questions #

What is the difference between LangChain and LangGraph? #

LangChain is the core framework for building LLM applications with chains, prompts, and model integrations. LangGraph is an extension library that adds graph-based orchestration for complex agent workflows with cycles, branching, and human-in-the-loop approvals. Think of LangChain as the component library and LangGraph as the workflow engine. Both are maintained by LangChain Inc and share the same release cycle.

How do I switch between LLM providers in LangChain? #

Change the model class import. LangChain’s standardized BaseChatModel interface means code written for OpenAI works with Anthropic, Google, Ollama, or any supported provider with minimal changes. The .content_blocks property in 1.0+ standardizes message formats across all providers, eliminating provider-specific parsing code.
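A minimal sketch of that idea is a small registry that maps a provider name to its integration package and chat model class. `PROVIDERS`, `resolve_provider`, and `build_chat_model` are hypothetical names for this example. The sketch assumes the relevant integration package (`langchain-openai`, `langchain-anthropic`, or `langchain-ollama`) is installed before a model is built.

```python
import importlib

# Provider name -> (package, chat model class). Each integration package
# is installed separately, e.g. `pip install langchain-openai`.
PROVIDERS = {
    "openai": ("langchain_openai", "ChatOpenAI"),
    "anthropic": ("langchain_anthropic", "ChatAnthropic"),
    "ollama": ("langchain_ollama", "ChatOllama"),
}


def resolve_provider(provider: str) -> tuple[str, str]:
    """Look up the module and class name for a provider key."""
    try:
        return PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}") from None


def build_chat_model(provider: str, model: str, **kwargs):
    """Import the provider's chat class lazily and instantiate it."""
    module_name, class_name = resolve_provider(provider)
    module = importlib.import_module(module_name)  # needs the package installed
    return getattr(module, class_name)(model=model, **kwargs)


print(resolve_provider("anthropic"))  # ('langchain_anthropic', 'ChatAnthropic')

# Usage (needs the integration package and credentials; the model name is a placeholder):
# llm = build_chat_model("openai", "<model-name>")
# print(llm.invoke("Hello").content)
```

Because the rest of the application only calls `invoke` on the returned object, the provider can then be chosen from configuration rather than hard-coded.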

Is LangChain free for commercial use? #

Yes. LangChain is MIT licensed and free for commercial and personal use. The core framework, LangGraph, and all community integrations carry no licensing fees. LangSmith (the observability platform) offers a free tier with 5,000 traces monthly; paid plans start at $39 per user per month. LLM API costs from OpenAI, Anthropic, or other providers are billed separately.

What is the recommended production setup for LangChain? #

For production deployments, use Docker containers with an ASGI/WSGI server (Uvicorn or Gunicorn), Redis for caching and session state, a vector store (Chroma for small scale, Pinecone or Weaviate for large scale), and LangSmith for observability. Deploy on Kubernetes for horizontal scaling. Set resource limits, health checks, and rate limiting. Pin all dependency versions and run evaluations before each deployment.
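Rate limiting is often handled at the gateway, but the idea can be illustrated in process with a token bucket. The sketch below is a standard-library illustration, not part of LangChain. `TokenBucket` is a hypothetical class, and a fake clock is injected so the demo is deterministic.

```python
import time


class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; False means the caller is throttled."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Deterministic demo with a fake clock instead of real time.
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=3, clock=lambda: fake_now[0])

burst = [bucket.allow() for _ in range(4)]
fake_now[0] += 1.0  # one second later, two tokens have been refilled
after_wait = [bucket.allow() for _ in range(3)]

print(burst)       # [True, True, True, False]
print(after_wait)  # [True, True, False]
```

With several replicas behind a load balancer, the bucket state would need to live in a shared store such as Redis. This sketch keeps it in memory for clarity.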

How does LangChain handle errors and retries? #

LangChain provides built-in retry logic with exponential backoff through the max_retries parameter on model classes. For production, wrap critical paths with Tenacity for fine-grained control over retry policies. Use structured exception handling to distinguish between retriable errors (rate limits, timeouts) and terminal errors (invalid inputs, authentication failures). Log all failures to LangSmith for post-incident analysis.
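The split between retriable and terminal errors can be sketched without any third-party dependency. The example below is a standard-library stand-in for what Tenacity provides, not Tenacity's API. `RetriableError`, `TerminalError`, and `call_with_retries` are hypothetical names, and in a real application the provider's exceptions would be mapped onto these two categories.

```python
import random
import time


class RetriableError(Exception):
    """Transient failure such as a rate limit or timeout."""


class TerminalError(Exception):
    """Permanent failure such as invalid input or bad credentials."""


def call_with_retries(func, *, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying RetriableError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RetriableError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last transient error
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
        # TerminalError (and anything else) propagates immediately.


calls = {"n": 0}


def flaky():
    """Fail twice with a transient error, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetriableError("rate limited")
    return "ok"


result = call_with_retries(flaky, base_delay=0.0)
print(result, calls["n"])  # ok 3
```

Keeping the retry wrapper around the whole critical path, rather than around each model call, makes it easier to cap the total time a request can spend retrying.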

Can I self-host LangSmith? #

Self-hosted LangSmith is available only on Enterprise plans with custom pricing. For teams requiring on-premises observability, open-source alternatives include Langfuse (MIT license), Phoenix by Arize (free), and Helicone (open source). These integrate with LangChain via OpenTelemetry or direct callbacks.

How do I scale LangChain agents to handle 1000+ concurrent users? #

Scale horizontally by running multiple container instances behind a load balancer. Use async patterns (ainvoke, astream) to maximize throughput per worker. Implement Redis caching for frequently asked queries. Set up connection pooling for databases and external APIs. Monitor token usage and costs per request via LangSmith. Consider using a queue system (Celery, RQ) for long-running agent tasks rather than synchronous HTTP requests.
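The async pattern mentioned above can be sketched with `asyncio` alone. In the example below, `FakeChain` is a stand-in that only mimics the `ainvoke` coroutine of a LangChain runnable, and `run_bounded` is a hypothetical helper. The concurrency limit of 5 is an arbitrary value chosen for the demo.

```python
import asyncio


class FakeChain:
    """Stand-in for a LangChain runnable; only mimics the ainvoke() coroutine."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0

    async def ainvoke(self, inputs: dict) -> str:
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.01)  # simulate network latency to the LLM
        self.in_flight -= 1
        return f"answer to {inputs['question']}"


async def run_bounded(chain, batch: list[dict], limit: int = 5) -> list[str]:
    """Run chain.ainvoke over a batch with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def one(inputs: dict) -> str:
        async with semaphore:
            return await chain.ainvoke(inputs)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(one(item) for item in batch))


chain = FakeChain()
questions = [{"question": f"q{i}"} for i in range(20)]
answers = asyncio.run(run_bounded(chain, questions, limit=5))
print(len(answers), answers[0], chain.peak)
```

Bounding concurrency per worker keeps a burst of requests from exhausting provider rate limits, while horizontal scaling raises the total ceiling.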

Conclusion #

LangChain Integration Map

LangChain’s 137,000 GitHub stars reflect its position as the default framework for production LLM applications. The 1.0 LTS release brought the stability that enterprise deployments demand: semantic versioning, standardized interfaces, and guaranteed backward compatibility. This guide covered how to install LangChain, containerize with Docker, deploy on Kubernetes, and harden with observability and error handling.

Your next steps:

  1. Clone the LangChain repository and run the quickstart
  2. Deploy a Docker container with your first agent using the Dockerfile and compose file above
  3. Set up LangSmith tracing to establish observability baselines before going live
  4. Join the LangChain community on Discord and Telegram AI Dev Group for production deployment discussions

