Auto Routing
Pass model: "auto" and Auraon will select the best model for your request based on task type, cost, and latency.
How it works
Auraon uses a lightweight classifier to analyze your prompt, then scores all available models based on:
- Task complexity
Simple Q&A vs. complex reasoning vs. code generation
- Cost target
Your account's per-request cost preference
- Latency
Whether you need fast responses or can wait for higher quality
- Context length
Long documents route to models with large context windows
- Availability
Automatic fallback when provider error rates spike
Routing modes
autoBalanced: best quality/cost ratio. Default.
auto:fastPrioritize lowest latency. Ideal for chat UIs.
auto:qualityPrioritize highest quality. Ideal for document analysis.
auto:cheapPrioritize lowest cost. Ideal for batch processing.
Routing metadata
Every response includes a blockrun field showing which model was used, latency, and cost:
{
"choices": [...],
"auraon": {
"routed_to": "claude-opus-4",
"routing_reason": "complex_reasoning",
"latency_ms": 821,
"cost_usd": 0.00038
}
}Custom routing rules
Pro and Enterprise plans can define custom routing rules in the dashboard. For example: “Always use deepseek-r1 for math problems.”
// Custom rule example (set in dashboard)
{
"rule": "regex_match",
"pattern": "(python|javascript|typescript|code|function)",
"route_to": "claude-sonnet-4-6"
}