A
Auraon
/Docs
Features

Auto Routing

Pass model: "auto" and Auraon will select the best model for your request based on task type, cost, and latency.

How it works

Auraon uses a lightweight classifier to analyze your prompt, then scores all available models based on:

  • Task complexity

    Simple Q&A vs. complex reasoning vs. code generation

  • Cost target

    Your account's per-request cost preference

  • Latency

    Whether you need fast responses or can wait for higher quality

  • Context length

    Long documents route to models with large context windows

  • Availability

    Automatic fallback when provider error rates spike

Routing modes

auto

Balanced: best quality/cost ratio. Default.

auto:fast

Prioritize lowest latency. Ideal for chat UIs.

auto:quality

Prioritize highest quality. Ideal for document analysis.

auto:cheap

Prioritize lowest cost. Ideal for batch processing.

Routing metadata

Every response includes a blockrun field showing which model was used, latency, and cost:

json
{
  "choices": [...],
  "auraon": {
    "routed_to": "claude-opus-4",
    "routing_reason": "complex_reasoning",
    "latency_ms": 821,
    "cost_usd": 0.00038
  }
}

Custom routing rules

Pro and Enterprise plans can define custom routing rules in the dashboard. For example: “Always use deepseek-r1 for math problems.”

json
// Custom rule example (set in dashboard)
{
  "rule": "regex_match",
  "pattern": "(python|javascript|typescript|code|function)",
  "route_to": "claude-sonnet-4-6"
}