Commercial features
- Intelligent routing
- Context window compression
- Cluster mode
Roadmap to 0.2.0
- UI visibility for prompt caching and local cache usage
- Broader provider support, including Cohere, Command A, and Operational
- Full support for the OpenAI
/conversationslifecycle - Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
- Provider-native passthrough for all providers, beyond the current beta coverage
- Budget management with limits per
user_pathand/or API key - Editable model pricing for accurate cost tracking and budgeting
- Full support for the OpenAI
/responseslifecycle - Anthropic-compatible
/messagesingress and/messages/count_tokens - Prompt cache visibility showing how much of each prompt was cached by the provider
- Fix failover charts in the dashboard