Features

Auto Routing

Pass model: "auto" and Auraon will select the best model for your request based on task type, cost, and latency.

How it works

Auraon uses a lightweight classifier to analyze your prompt, then scores all available models based on:

Task complexity
Simple Q&A vs. complex reasoning vs. code generation
Cost target
Your account's per-request cost preference
Latency
Whether you need fast responses or can wait for higher quality
Context length
Long documents route to models with large context windows
Availability
Automatic fallback when provider error rates spike

Routing modes

auto

Balanced: best quality/cost ratio. Default.

auto:fast

Prioritize lowest latency. Ideal for chat UIs.

auto:quality

Prioritize highest quality. Ideal for document analysis.

auto:cheap

Prioritize lowest cost. Ideal for batch processing.

Routing metadata

Every response includes a blockrun field showing which model was used, latency, and cost:

json

{
  "choices": [...],
  "auraon": {
    "routed_to": "claude-opus-4",
    "routing_reason": "complex_reasoning",
    "latency_ms": 821,
    "cost_usd": 0.00038
  }
}

Custom routing rules

Pro and Enterprise plans can define custom routing rules in the dashboard. For example: “Always use deepseek-r1 for math problems.”

json

// Custom rule example (set in dashboard)
{
  "rule": "regex_match",
  "pattern": "(python|javascript|typescript|code|function)",
  "route_to": "claude-sonnet-4-6"
}