Request rate by model

rate(litellm_request_total_requests[5m])

Error rate

rate(litellm_requests_total_failed[5m])

Remaining budget per key

litellm_remaining_requests
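
The queries in this section can also be evaluated outside the Prometheus UI through the Prometheus HTTP API's instant-query endpoint (`/api/v1/query`). The following is a minimal sketch, not a definitive client: the base URL `http://localhost:9090` and the helper names `build_query_url` and `run_query` are assumptions made for illustration, and the metric name is taken from the text above as-is.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus is reachable at this address; adjust for your setup.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(promql, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def run_query(promql, base_url=PROMETHEUS_URL, timeout=5.0):
    """Evaluate a PromQL expression and return the list of result series."""
    url = build_query_url(promql, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    return payload["data"]["result"]


if __name__ == "__main__":
    # Only builds the URL; call run_query(...) against a live Prometheus server.
    print(build_query_url("litellm_remaining_requests"))
```

Keeping URL construction separate from the network call makes the encoding step easy to check without a running server.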

Gateway overhead histogram

histogram_quantile(0.95, sum by (le) (rate(litellm_overhead_latency_ms_bucket[5m])))
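
To make the p95 figure easier to interpret: `histogram_quantile` estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The sketch below is a simplified illustration of that idea, not Prometheus's actual implementation. The bucket boundaries and counts in `SAMPLE_BUCKETS` are made up for the example and do not come from LiteLLM.

```python
import math

# Hypothetical cumulative buckets: (upper bound in ms, cumulative count).
# These values are invented for illustration only.
SAMPLE_BUCKETS = [(50.0, 10), (100.0, 60), (250.0, 90), (math.inf, 100)]


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least two buckets, the last one being +Inf")

    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate

    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            if count == prev_count:
                return prev_bound  # guard against division by zero
            # Linear interpolation inside the matching bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return buckets[-2][0]


if __name__ == "__main__":
    print(histogram_quantile(0.5, SAMPLE_BUCKETS))
    print(histogram_quantile(0.95, SAMPLE_BUCKETS))
```

Because the estimate is interpolated, its accuracy depends on how finely the buckets are spaced around the quantile of interest. A quantile that lands in the `+Inf` bucket is reported as the largest finite bucket bound.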

- [LiteLLM Documentation](https://docs.litellm.ai/docs/) — Complete proxy and SDK reference
- [LiteLLM Docker Quick Start](https://docs.litellm.ai/docs/proxy/docker_quick_start) — Official Docker setup guide
- [LiteLLM Config Reference](https://docs.litellm.ai/docs/proxy/configs) — All config.yaml options
- [LiteLLM Helm Deployment](https://docs.litellm.ai/docs/proxy/deploy) — Kubernetes and Helm charts
- [LiteLLM Admin UI Docs](https://docs.litellm.ai/docs/proxy/ui) — Virtual key and team management
- [LiteLLM Caching Guide](https://docs.litellm.ai/docs/caching/all_caches) — Redis, semantic, and disk caching
- [Portkey vs LiteLLM Comparison](https://portkey.ai/lp/portkey-vs-litellm) — Vendor comparison page
- [OpenRouter Documentation](https://openrouter.ai/docs) — Alternative gateway reference
- [Helicone Documentation](https://docs.helicone.ai) — Observability-focused alternative
