Reference

#Error handling

The error format is aligned with OpenAI's: every error response carries a single `error` object with three fields, `message`, `type`, and `code`. Common codes are `invalid_api_key` (401), `insufficient_balance` (402), `rate_limit_exceeded` (429), `tenant_monthly_quota_exceeded` (429), and `upstream_error` (502). We retry upstream 5xx errors transparently; you see the final error only if every retry fails.

Example error response (JSON):

{
  "error": {
    "message": "Account balance depleted. Please top up to continue.",
    "type": "insufficient_balance",
    "code": "account_suspended"
  }
}
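
As a minimal sketch of how a client might consume this error object, the Python below parses the body into an exception that keeps all three fields. The names `ApiError` and `parse_error` are hypothetical, and the choice of which codes count as retryable is an assumption, not documented behavior.

```python
import json

# Assumption: only these codes are worth retrying from the client side.
# The codes themselves come from the list above; the policy does not.
RETRYABLE_CODES = {"rate_limit_exceeded", "upstream_error"}


class ApiError(Exception):
    """Hypothetical exception type carrying the three documented error fields."""

    def __init__(self, status, message, type_, code):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.message = message
        self.type = type_
        self.code = code

    @property
    def retryable(self):
        # True only for codes in the assumed retry policy above.
        return self.code in RETRYABLE_CODES


def parse_error(status, body):
    """Build an ApiError from an HTTP status and a JSON response body."""
    err = json.loads(body).get("error", {})
    return ApiError(status, err.get("message", ""), err.get("type"), err.get("code"))
```

A caller could branch on `err.code` for handling and log `err.message` for humans, since the message text is more likely to change than the code.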

#Rate limits

The default limit is 60 requests per minute (RPM) per key. Requests over the limit receive a 429 response with the `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers attached. Enterprise plans can raise the limit; contact us to arrange a custom quota.
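
One way to respect the limit is to pause for the number of seconds given in `X-RateLimit-Reset` before retrying a 429. The sketch below assumes a caller-supplied `send` function (hypothetical) that returns a `(status, headers, body)` tuple; `seconds_to_wait`, `call_with_retry`, and the 1-second fallback are illustrative choices, not part of the API.

```python
import time


def seconds_to_wait(status, headers, default=1.0):
    """Return how long to pause before retrying, or 0.0 if no pause is needed."""
    if status != 429:
        return 0.0
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return max(0.0, float(lowered["x-ratelimit-reset"]))
    except (KeyError, ValueError):
        # Assumption: fall back to a short fixed pause if the header is unusable.
        return default


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        sleep(seconds_to_wait(status, headers))
```

Note that `tenant_monthly_quota_exceeded` also arrives as a 429; waiting for the window to reset will not clear it, so a fuller client would check the error `code` before retrying.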

#Response headers

Each response comes with useful metadata headers:

| Header | Description |
| --- | --- |
| `X-Trace-ID` | Unique request ID; include it in support tickets |
| `X-Usage-Input-Tokens` | Input tokens counted for billing |
| `X-Usage-Output-Tokens` | Output tokens counted for billing |
| `X-RateLimit-Remaining` | Requests remaining in the current window |
| `X-RateLimit-Reset` | Seconds until the window resets |
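
The headers above can be collected into one typed record for logging or billing checks. This is a sketch only: `ResponseMeta` and `read_meta` are hypothetical names, and treating the numeric headers as integers is an assumption based on their descriptions.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ResponseMeta:
    trace_id: Optional[str]
    input_tokens: Optional[int]
    output_tokens: Optional[int]
    rate_limit_remaining: Optional[int]
    rate_limit_reset_s: Optional[int]


def _to_int(value) -> Optional[int]:
    """Convert a header value to int, returning None if missing or malformed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def read_meta(headers: Mapping[str, str]) -> ResponseMeta:
    """Extract the documented metadata headers from a response header mapping."""
    # Header names are case-insensitive, so compare in lowercase.
    h = {k.lower(): v for k, v in headers.items()}
    return ResponseMeta(
        trace_id=h.get("x-trace-id"),
        input_tokens=_to_int(h.get("x-usage-input-tokens")),
        output_tokens=_to_int(h.get("x-usage-output-tokens")),
        rate_limit_remaining=_to_int(h.get("x-ratelimit-remaining")),
        rate_limit_reset_s=_to_int(h.get("x-ratelimit-reset")),
    )
```

Missing or malformed headers come back as `None` rather than raising, so the record can be logged even for error responses.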

#Pricing

All internal models share one flat price: $3.00 per million input tokens and $12.00 per million output tokens. Cache hits (exact and semantic) are billed at 25% of the regular price. We absorb the cost of retries and hedged requests; you pay only for the responses you actually receive.
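
The arithmetic can be sketched as follows. The function name `request_cost_usd` is hypothetical, and the sketch assumes a cache hit discounts both the input and output tokens of that request, which the text above does not spell out.

```python
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 12.00
CACHE_HIT_MULTIPLIER = 0.25  # cache hits are billed at 25% of the regular price


def request_cost_usd(input_tokens, output_tokens, cache_hit=False):
    """Estimate the billed cost of one request in US dollars."""
    cost = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000
    # Assumption: the cache discount applies to the whole request.
    return cost * CACHE_HIT_MULTIPLIER if cache_hit else cost
```

For example, a request with 10,000 input tokens and 2,000 output tokens comes to $0.03 + $0.024 = $0.054 at the regular price, or $0.0135 under the assumed cache-hit discount. The token counts can be taken from the `X-Usage-Input-Tokens` and `X-Usage-Output-Tokens` response headers.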

Pricing page

Next steps

Get API Key

Browse the model library

Cookbook example