Getting started
Overview
TokenRouter wraps the OpenAI Responses API with an intelligent routing layer. Send a standard Responses request to https://api.tokenrouter.io/v1/responses and we will score eligible providers, enforce your firewall rules, and return a fully compatible payload.
The router monitors live provider pricing, historical latency, and your account limits so you always get the best trade-off for each prompt. You can adjust the strategy per-request, define deterministic rules in the console, or pin a provider when compliance requires it.
Authentication
Generate an API key from the Console → API Keys page. Pass it as a bearer token: Authorization: Bearer TOKENROUTER_API_KEY. Keys inherit your workspace quota and firewall settings, so rotate them regularly and keep them scoped to the environments you deploy to.
All requests must be made over HTTPS. We automatically issue request IDs, idempotency keys, and latency telemetry so you can correlate traffic in the Logs view.
OpenAI compatibility
TokenRouter mirrors the Responses schema, including tool calls, multimodal inputs, and SSE streaming. If you already use the official OpenAI SDKs, point them at https://api.tokenrouter.io/v1 and add OpenAI-Beta: responses=v1 to your default headers.
The router is drop-in compatible with OpenAI’s retry helpers, so you can keep your existing middleware. You’ll just gain routing decisions, provider telemetry, and cost controls without touching your prompts.
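For example, a minimal sketch with the official openai Python package (the base URL, header, and key placeholder come from this page; everything else is standard SDK usage):

from openai import OpenAI

# Point the official SDK at TokenRouter instead of api.openai.com.
client = OpenAI(
    api_key="TOKENROUTER_API_KEY",            # a TokenRouter key, not an OpenAI key
    base_url="https://api.tokenrouter.io/v1",
    default_headers={"OpenAI-Beta": "responses=v1"},
)

The SDK sketches below reuse this client.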
Responses API
The /v1/responses endpoint accepts the same payloads as OpenAI. Below are the most common request patterns, each shown with cURL and a matching OpenAI Python SDK sketch; the TokenRouter SDKs accept the same payloads.
Creating a standard request
Send a synchronous request to the Responses API. TokenRouter mirrors the OpenAI Responses schema, so anything that works against OpenAI works here.

curl https://api.tokenrouter.io/v1/responses \
  -H "Authorization: Bearer $TOKENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: responses=v1" \
  -d '{
    "model": "auto:balance",
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Summarize what TokenRouter does in one sentence." }
        ]
      }
    ]
  }'
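The equivalent call through the OpenAI Python SDK, reusing the client configured in the compatibility section (a sketch; output_text is the SDK's convenience accessor for the text output):

response = client.responses.create(
    model="auto:balance",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Summarize what TokenRouter does in one sentence."}
            ],
        }
    ],
)
print(response.output_text)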
Auto routing
Let TokenRouter pick the best provider for every request. Use metadata to give the router more context, or pin a preferred vendor when you need to.

curl https://api.tokenrouter.io/v1/responses \
  -H "Authorization: Bearer $TOKENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: responses=v1" \
  -d '{
    "model": "auto:quality",
    "metadata": {
      "workspace": "docs-demo",
      "priority": "launch"
    },
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Draft an onboarding email for new customers." }
        ]
      }
    ]
  }'
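Through the SDK, metadata is a first-class Responses parameter, so the same request can be sketched like this (the plain-string input is OpenAI's shorthand for a single user message):

response = client.responses.create(
    model="auto:quality",
    metadata={"workspace": "docs-demo", "priority": "launch"},
    input="Draft an onboarding email for new customers.",
)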
Modes
Balance cost, latency, or quality for each call. Use the model shortcut (auto:latency) or pass router_mode explicitly.

curl https://api.tokenrouter.io/v1/responses \
  -H "Authorization: Bearer $TOKENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: responses=v1" \
  -d '{
    "model": "auto:latency",
    "router_mode": "latency",
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Respond quickly with a short acknowledgement." }
        ]
      }
    ]
  }'
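Note that router_mode is a TokenRouter extension rather than a standard OpenAI parameter, so when calling through the OpenAI SDK it has to travel in extra_body; a sketch under that assumption:

response = client.responses.create(
    model="auto:latency",
    input="Respond quickly with a short acknowledgement.",
    # router_mode is TokenRouter-specific, so the OpenAI SDK passes it via extra_body
    extra_body={"router_mode": "latency"},
)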
Creating a stream request
Stream responses over Server-Sent Events, just as you would with OpenAI. TokenRouter emits the event types metadata, message.start, content.delta, usage, and done.

curl https://api.tokenrouter.io/v1/responses \
  -H "Authorization: Bearer $TOKENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -H "OpenAI-Beta: responses=v1" \
  -d '{
    "model": "auto:balance",
    "stream": true,
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Stream a short haiku about resilient infrastructure." }
        ]
      }
    ]
  }'
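If you prefer to consume the raw event stream yourself rather than going through an SDK, here is a minimal sketch with httpx (the URL, headers, and body mirror the cURL example above; the data-line handling is generic SSE parsing, not a TokenRouter-specific API):

import os
import httpx

headers = {
    "Authorization": f"Bearer {os.environ['TOKENROUTER_API_KEY']}",
    "Content-Type": "application/json",
    "Accept": "text/event-stream",
    "OpenAI-Beta": "responses=v1",
}
body = {
    "model": "auto:balance",
    "stream": True,
    "input": "Stream a short haiku about resilient infrastructure.",
}

with httpx.stream("POST", "https://api.tokenrouter.io/v1/responses",
                  headers=headers, json=body, timeout=None) as resp:
    for line in resp.iter_lines():
        # SSE frames arrive as "event: <type>" and "data: <payload>" lines.
        if line.startswith("data:"):
            print(line[5:].strip())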
Creating a tool request
TokenRouter normalises tool definitions so every provider receives compatible JSON. Keep tool schemas tight to get the best results across vendors.

curl https://api.tokenrouter.io/v1/responses \
  -H "Authorization: Bearer $TOKENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: responses=v1" \
  -d '{
    "model": "auto:quality",
    "tools": [
      {
        "type": "function",
        "name": "lookup_customer",
        "description": "Fetch CRM profile by email",
        "parameters": {
          "type": "object",
          "properties": {
            "email": { "type": "string", "format": "email" }
          },
          "required": ["email"],
          "additionalProperties": false
        }
      }
    ],
    "tool_choice": "auto",
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Find the CRM record for sara@acme.dev and suggest next steps." }
        ]
      }
    ]
  }'
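Handled through the SDK, tool invocations come back as function_call items in the response output; a sketch assuming the standard OpenAI Responses output shape:

import json

tools = [{
    "type": "function",
    "name": "lookup_customer",
    "description": "Fetch CRM profile by email",
    "parameters": {
        "type": "object",
        "properties": {"email": {"type": "string", "format": "email"}},
        "required": ["email"],
        "additionalProperties": False,
    },
}]

response = client.responses.create(
    model="auto:quality",
    tools=tools,
    tool_choice="auto",
    input="Find the CRM record for sara@acme.dev and suggest next steps.",
)

# Tool invocations surface as function_call items alongside any text output.
for item in response.output:
    if item.type == "function_call":
        args = json.loads(item.arguments)
        print(item.name, args)  # e.g. lookup_customer {'email': 'sara@acme.dev'}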
Provider keys
Include via headers
Bring your own provider keys per request to test new vendors without storing credentials server-side. Add the appropriate header below and the router will prefer that provider when routing candidates are otherwise equivalent.
X-OpenAI-Key
X-Anthropic-Key
X-Gemini-Key
X-DeepSeek-Key
We automatically decrypt stored keys and compare them against the headers you send, so you can rotate credentials without breaking traffic.
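Through the OpenAI SDK, a per-request provider key rides along as an extra header; a sketch using the Anthropic header from the list above (the environment variable name is illustrative):

import os

response = client.responses.create(
    model="auto:quality",
    input="Draft a short release note.",
    # Per-request provider key; see the header names listed above.
    extra_headers={"X-Anthropic-Key": os.environ["ANTHROPIC_API_KEY"]},
)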
Configure in console
Navigate to Console → Provider Keys to add encrypted provider secrets. Choose a primary key per provider and mark fallbacks. Saved keys are surfaced to the router so calls succeed even when you do not send headers.
Enterprise plans can scope keys to projects and restrict traffic to specific regions for compliance.
Errors
Viewing errors
Failed calls return an error object with a normalised structure:
{
  "error": {
    "type": "routing_error",
    "message": "No eligible providers available for this request.",
    "http_status": 422,
    "provider": null,
    "raw": null
  }
}
Every response also includes request_id so you can inspect the full trace inside Console → Logs.
Handling errors
SDKs raise typed exceptions (AuthenticationError, QuotaExceededError, APIStatusError) so you can branch on retryable errors. When calling the API directly, check for 429 and back off using Retry-After headers.
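A retry sketch using the openai package's own exception types (the TokenRouter SDKs' names, such as QuotaExceededError, may differ; RateLimitError is the openai package's exception for HTTP 429):

import time
from openai import AuthenticationError, RateLimitError

for attempt in range(5):
    try:
        response = client.responses.create(model="auto:balance", input="Ping.")
        break
    except RateLimitError as err:  # HTTP 429
        # Honour Retry-After when present; otherwise back off exponentially.
        delay = float(err.response.headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay)
    except AuthenticationError:
        raise  # not retryable: fix the key instead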
Routing and firewall errors surface the offending rule in the rule_trace metadata so you can tune matchers without guessing.
Rules (Console)
Routing rules run before the classifier to enforce business logic. Head to Console → Routing Rules to create weighted overrides—pin models for VIP customers, fall back to Anthropic for long contexts, or redirect traffic that triggers specific metadata.
Rules are ordered by priority. Each condition (input contains, metadata equals, mode equals, etc.) is logged in the rule trace so you can see exactly why a rule executed. Use the “Preview decision” drawer in the console to simulate routing without sending live traffic.
Firewall (Console)
The firewall inspects prompts before they reach any provider. Create policies under Console → Firewall to block secrets, mask sensitive data, or warn analysts when compliance filters match.
Rules support substring and regex matching across prompt, response, or metadata scopes. Actions include warn, mask, and block. Every applied rule is recorded alongside the response so auditors can review exactly what happened.