📄️ Quick Start
Quick start with the CLI, Config, and Docker
📄️ Proxy Config.yaml
Set the model list, api_base, api_key, temperature & proxy server settings (master_key) in config.yaml.
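For example, a minimal config.yaml might look like the sketch below; the model names, endpoints, and keys are placeholders for your own deployment.

```yaml
model_list:
  - model_name: gpt-3.5-turbo              # alias that clients request
    litellm_params:                        # params passed through to litellm.completion()
      model: azure/my-gpt-35-deployment    # placeholder provider/deployment name
      api_base: https://my-endpoint.openai.azure.com/
      api_key: "os.environ/AZURE_API_KEY"  # read the key from an environment variable
      temperature: 0.2

general_settings:
  master_key: sk-1234                      # proxy master key (placeholder)
```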
📄️ Use with Langchain, OpenAI SDK, Curl
Input, Output, Exceptions are mapped to the OpenAI format for all supported models
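As a sketch, pointing the OpenAI Python SDK at the proxy looks roughly like this; the base URL, port, and key are placeholders for your own setup.

```python
import openai

# Point the OpenAI SDK at the LiteLLM proxy instead of api.openai.com.
client = openai.OpenAI(
    api_key="sk-1234",               # a proxy virtual key or the master key (placeholder)
    base_url="http://0.0.0.0:4000",  # wherever your proxy is running (placeholder)
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",           # a model_name from your config.yaml
    messages=[{"role": "user", "content": "Hello, which model are you?"}],
)
print(response.choices[0].message.content)
```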
📄️ Load Balancing - Multiple Instances of 1 model
Load balance multiple instances of the same model
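As illustrated below, this is done by listing several deployments under the same model_name; the endpoints and keys are placeholders.

```yaml
model_list:
  # Two deployments behind one alias; the proxy routes requests across them.
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-35-turbo-eu
      api_base: https://my-endpoint-eu.openai.azure.com/
      api_key: "os.environ/AZURE_API_KEY_EU"
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-35-turbo-us
      api_base: https://my-endpoint-us.openai.azure.com/
      api_key: "os.environ/AZURE_API_KEY_US"
```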
📄️ Key Management
Track spend, set budgets, and create virtual keys for the proxy
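For instance, a virtual key can be created by calling the proxy's /key/generate endpoint; the sketch below is illustrative, and the exact request fields are documented on the Key Management page.

```python
import requests

# Create a virtual key via the proxy's /key/generate endpoint.
# URL, master key, and the body fields shown are placeholders/assumptions.
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # proxy master key
    json={
        "models": ["gpt-3.5-turbo"],  # models this key is allowed to call
        "max_budget": 10,             # spend cap for the key (USD)
        "duration": "30d",            # key lifetime
    },
)
print(resp.json())  # response includes the generated key
```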
📄️ 💰 Budgets, Rate Limits per user
Set budgets and rate limits per user on the proxy
📄️ [BETA] Self-serve UI
Allow your users to create their own keys through a UI
📄️ Model Management
Add new models and get model info without restarting the proxy.
📄️ Fallbacks, Retries, Timeouts, Cooldowns
If a call fails after num_retries, fall back to another model group.
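A rough config.yaml sketch of retries and fallbacks is shown below; treat the setting names as assumptions and check that page for the full list of supported keys.

```yaml
litellm_settings:
  num_retries: 3       # retry a failing deployment up to 3 times
  request_timeout: 10  # seconds before a call is timed out
  fallbacks: [{"gpt-3.5-turbo": ["gpt-4"]}]  # after retries fail, try another model group
```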
📄️ Caching
Cache LLM Responses
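Caching is enabled from config.yaml; the Redis settings below are placeholders, and the exact parameter names are covered on the Caching page.

```yaml
litellm_settings:
  cache: true              # enable response caching
  cache_params:            # Redis connection details (placeholders)
    type: redis
    host: os.environ/REDIS_HOST
    port: os.environ/REDIS_PORT
    password: os.environ/REDIS_PASSWORD
```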
📄️ 🔎 Logging - Custom Callbacks, Langfuse, s3 Bucket, Sentry, OpenTelemetry
Log Proxy Input, Output, Exceptions using Custom Callbacks, Langfuse, OpenTelemetry, DynamoDB, s3 Bucket
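As a sketch, logging integrations are switched on via callbacks in config.yaml; the callback names below are examples, and each integration's credentials are supplied as environment variables.

```yaml
litellm_settings:
  success_callback: ["langfuse"]   # e.g. langfuse, s3, or a custom callback
  failure_callback: ["sentry"]     # where to send failed-call logs
```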
📄️ Health Checks
Use this to health check all LLMs defined in your config.yaml
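Checks can be run on demand against the proxy's /health endpoint or scheduled in the background; the setting names in the sketch below are assumptions, so confirm them on the Health Checks page.

```yaml
general_settings:
  background_health_checks: true   # assumed setting: periodically ping every model
  health_check_interval: 300       # assumed setting: seconds between checks
```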
📄️ Modify Incoming Data
Modify data just before making litellm completion calls on the proxy
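One way to do this, sketched below, is a custom handler with a pre-call hook that the proxy invokes before forwarding the request; the class and hook names follow litellm's CustomLogger plugin interface, but treat the exact signature as an assumption and see that page for the authoritative example.

```python
# custom_callbacks.py -- hypothetical module referenced from config.yaml
from litellm.integrations.custom_logger import CustomLogger


class MyProxyHandler(CustomLogger):
    # Assumed hook: called by the proxy before litellm.completion() runs,
    # with `data` holding the incoming request body.
    async def async_pre_call_hook(self, user_api_key_dict, cache, data, call_type):
        # Example edit: prepend a system message to every chat request.
        data["messages"] = [{"role": "system", "content": "Be concise."}] + data.get("messages", [])
        return data


proxy_handler_instance = MyProxyHandler()
```

The handler instance would then be referenced from config.yaml (e.g. under litellm_settings callbacks), as described on that page.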
📄️ Post-Call Rules
Use this to fail a request based on the output of an LLM API call.
📄️ Slack Alerting
Get alerts for failed DB reads/writes, hanging API calls, and failed API calls.
📄️ Track Token Usage
Track token usage by creating your own custom litellm callback class
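A minimal sketch of such a callback class, assuming litellm's CustomLogger interface, is below; the usage fields read from the response object are illustrative, so verify them against that page.

```python
# custom_callbacks.py -- hypothetical module with a usage-tracking callback
from litellm.integrations.custom_logger import CustomLogger


class TokenUsageTracker(CustomLogger):
    # Called by litellm after a successful call; response_obj is the model response.
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        usage = getattr(response_obj, "usage", None)
        if usage is not None:
            print(
                f"model={kwargs.get('model')} "
                f"prompt_tokens={usage.prompt_tokens} "
                f"completion_tokens={usage.completion_tokens} "
                f"total_tokens={usage.total_tokens}"
            )


token_tracker = TokenUsageTracker()
```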
📄️ 🐳 Docker, Deploying LiteLLM Proxy
You can find the Dockerfile to build the litellm proxy here
📄️ CLI Arguments
CLI arguments: --host, --port, --num_workers