API Reference

Auraon's API is fully compatible with the OpenAI REST API. Base URL: https://api.auraon.ai/v1

Authentication

Pass your Auraon API key as a Bearer token in the Authorization header. API keys start with br-.

bash
Authorization: Bearer br-your-api-key
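
If you are building requests by hand, the headers can be assembled like this (a minimal sketch; the helper name and the AURAON_API_KEY environment variable are illustrative, not part of the API):

```python
import os

def auth_headers(api_key: str) -> dict:
    # Auraon keys start with br- and are sent as a standard Bearer token.
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = auth_headers(os.environ.get("AURAON_API_KEY", "br-your-api-key"))
```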

POST /v1/chat/completions

Create a chat completion. Compatible with OpenAI's chat.completions.create.

bash
curl https://api.auraon.ai/v1/chat/completions \
  -H "Authorization: Bearer br-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'
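
The same request in Python, using only the standard library (a sketch of the curl call above; error handling omitted):

```python
import json
import urllib.request

def chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    # Builds the POST request shown in the curl example above.
    body = json.dumps({"model": model, "messages": messages, "stream": False}).encode()
    return urllib.request.Request(
        "https://api.auraon.ai/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_request("br-your-key", "auto", [{"role": "user", "content": "Hello!"}])
# resp = json.load(urllib.request.urlopen(req))  # uncomment to actually send
```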

Request parameters

model (string, required)

Model ID (e.g. "gpt-4o", "claude-opus-4") or "auto" for smart routing.

messages (array, required)

Array of message objects with role (system/user/assistant) and content.

stream (boolean)

If true, returns an SSE event stream. Default: false.

max_tokens (integer)

Maximum number of tokens to generate.

temperature (number)

Sampling temperature, 0 to 2. Default: 1.

top_p (number)

Nucleus sampling. Default: 1.

stop (string | array)

Up to 4 sequences where generation stops.

tools (array)

List of tools (functions) the model can call.
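
Each tools entry follows the OpenAI function-calling schema. A sketch (the get_weather function and its parameters are illustrative, not built-ins):

```python
# Illustrative tool definition in the OpenAI function-calling format.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"],
        },
    },
}

payload = {
    "model": "auto",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [get_weather_tool],
}
```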

Response

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "claude-opus-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 10,
    "total_tokens": 22
  },
  "blockrun": {
    "routed_to": "claude-opus-4",
    "latency_ms": 423,
    "cost_usd": 0.00024
  }
}
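
Reading the response in Python (this parses the example above; note that the blockrun object is an Auraon extension, so the code uses .get() to stay compatible with responses that omit it):

```python
import json

# The example response from above, verbatim.
raw = """{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "claude-opus-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 10,
    "total_tokens": 22
  },
  "blockrun": {
    "routed_to": "claude-opus-4",
    "latency_ms": 423,
    "cost_usd": 0.00024
  }
}"""

resp = json.loads(raw)
reply = resp["choices"][0]["message"]["content"]
total_tokens = resp["usage"]["total_tokens"]
# Extension fields: absent from standard OpenAI responses.
cost = resp.get("blockrun", {}).get("cost_usd")
```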

GET /v1/models

List all available models. Returns a paginated list compatible with the OpenAI models endpoint.

bash
curl https://api.auraon.ai/v1/models \
  -H "Authorization: Bearer br-your-key"
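
Since the response is OpenAI-compatible, each model appears as an object with an id field inside a data array. A small helper to pull out the ids (a sketch; the sample payload below is abbreviated):

```python
def model_ids(models_response: dict) -> list:
    # The endpoint returns {"object": "list", "data": [{"id": ...}, ...]}.
    return [m["id"] for m in models_response.get("data", [])]

sample = {"object": "list", "data": [{"id": "gpt-4o"}, {"id": "claude-opus-4"}]}
ids = model_ids(sample)
```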

Streaming

Set stream: true to receive Server-Sent Events (SSE). Each event is a data: line containing a JSON delta; the stream ends with a final data: [DONE] event.

python
# Assumes an OpenAI-compatible client configured with the Auraon base URL.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:  # the final chunk carries no content
        print(delta, end="", flush=True)

Rate limits

| Plan | Requests/min | Tokens/min |
|---|---|---|
| Starter | 60 | 100K |
| Pro | 1,000 | 2M |
| Enterprise | Unlimited | Unlimited |

Error codes

| Code | Status | Description |
|---|---|---|
| 400 | Bad Request | Invalid request body or missing required fields. |
| 401 | Unauthorized | Invalid or missing API key. |
| 429 | Too Many Requests | Rate limit exceeded. Check the Retry-After header. |
| 503 | Service Unavailable | Selected model is temporarily unavailable. Enable fallbacks. |
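
A 429 can be retried after the Retry-After delay. A sketch of that loop (the send callable is a placeholder for your HTTP transport; it falls back to exponential backoff when the header is missing):

```python
import time

def with_retries(send, max_attempts=5):
    # send() returns (status_code, headers_dict, body).
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        # Honor Retry-After if present, else back off exponentially.
        time.sleep(float(headers.get("Retry-After", 2 ** attempt)))
    return status, body

# Usage with a stand-in transport that rate-limits the first call:
calls = {"n": 0}
def fake_send():
    calls["n"] += 1
    if calls["n"] == 1:
        return 429, {"Retry-After": "0"}, ""
    return 200, {}, "ok"

status, body = with_retries(fake_send)
```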