One API for your coding agents

Point Codex, Cursor, and any OpenAI-compatible client at one endpoint — backed by your own ChatGPT/Codex subscription, with per-request token accounting.

  • Routes across upstream accounts
  • Per-request token and cost accounting
  • OpenAI-compatible · Chat Completions + Responses

# Same request your agent already sends.
# Just point it at Relay.
curl https://relay.adxztech.com/v1/chat/completions \
  -H "Authorization: Bearer $RELAY_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role":"user","content":"Refactor this handler"}],
    "reasoning_effort": "high",
    "stream": true
  }'

Point the OpenAI-compatible agents your team already runs at one base URL.

Reliability

Stays up when an upstream doesn't.

Requests route across multiple upstream accounts. When one degrades, Relay fails over without dropping your stream. No fabricated uptime badge, just the mechanics that keep calls flowing.

Smart routing across upstream accounts

Every call is scored and sent to a healthy account. Pool your own subscriptions and let Relay pick the best path per request.
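The page does not say how calls are scored, so the following is only a minimal sketch of score-based selection. The `Account` fields, the `score` weights, and the pool contents are all hypothetical and are not Relay's actual routing logic.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    healthy: bool
    error_rate: float  # hypothetical: recent fraction of failed calls, 0.0 to 1.0
    in_flight: int     # hypothetical: requests currently running on this account


def score(acct: Account) -> float:
    # Lower is better. The weights are illustrative: recent errors count
    # far more than current load.
    return acct.error_rate * 100 + acct.in_flight


def pick_account(accounts: list[Account]) -> Account:
    # Only healthy accounts are candidates; the lowest score wins.
    candidates = [a for a in accounts if a.healthy]
    if not candidates:
        raise RuntimeError("no healthy upstream account")
    return min(candidates, key=score)


pool = [
    Account("acct-a", healthy=True, error_rate=0.20, in_flight=1),
    Account("acct-b", healthy=True, error_rate=0.00, in_flight=3),
    Account("acct-c", healthy=False, error_rate=0.00, in_flight=0),
]
chosen = pick_account(pool)
print(chosen.name)
```

With this sample pool, `acct-b` is chosen: `acct-c` is filtered out as unhealthy, and `acct-a` scores worse because of its recent errors.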

Automatic failover

A 429 or 5xx on one account reroutes to the next without a client-side retry.
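As a rough illustration of the failover rule above, the sketch below tries upstreams in order and moves on whenever one answers with a 429 or a 5xx. The upstream callables and the `send_with_failover` helper are hypothetical stand-ins, and the sketch only covers failures that happen before a response starts streaming.

```python
def should_fail_over(status: int) -> bool:
    # 429 (rate limited) and any 5xx are treated as "try the next account".
    return status == 429 or 500 <= status <= 599


def send_with_failover(upstreams, request):
    """Try each (name, send) pair in order until one returns a usable status."""
    last_status = None
    for name, send in upstreams:
        status, body = send(request)
        if not should_fail_over(status):
            return name, status, body
        last_status = status
    raise RuntimeError(f"all upstreams failed, last status {last_status}")


# Fake upstreams standing in for real accounts.
def rate_limited(_request):
    return 429, ""


def broken(_request):
    return 503, ""


def ok(request):
    return 200, f"echo: {request}"


result = send_with_failover(
    [("acct-1", rate_limited), ("acct-2", broken), ("acct-3", ok)],
    "hi",
)
print(result)
```

The client sees a single successful response from `acct-3` and never observes the 429 or the 503.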

Streaming responses

Server-sent events pass straight through. Time-to-first-token stays low.

Live service status

A public status view for the gateway and its subsystems.

Sticky sessions

Pin a conversation to one upstream so multi-turn context stays coherent.
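One common way to pin a conversation is to hash a stable conversation identifier onto the list of upstreams. The sketch below assumes such an identifier exists. `pick_sticky` and the upstream names are hypothetical, and Relay may well use a different mechanism, such as a lookup table.

```python
import hashlib


def pick_sticky(conversation_id: str, upstreams: list[str]) -> str:
    # A stable hash (not Python's per-process randomized hash()) so the same
    # conversation maps to the same upstream across processes and restarts.
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(upstreams)
    return upstreams[index]


upstreams = ["acct-1", "acct-2", "acct-3"]
first_turn = pick_sticky("conv-42", upstreams)
second_turn = pick_sticky("conv-42", upstreams)
print(first_turn == second_turn)
```

A plain modulo mapping reshuffles conversations when the pool changes size. A pin table or consistent hashing would avoid that, at the cost of more state.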

Concurrency control

Per-key limits smooth bursts and keep one runaway job from starving the rest.
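A per-key limit can be pictured as a counter of in-flight requests for each API key. The sketch below is a hypothetical `KeyLimiter` that rejects work over the limit. A real gateway might queue the request instead, and nothing here reflects Relay's internals.

```python
import threading
from collections import defaultdict


class KeyLimiter:
    """Caps the number of concurrent requests per API key."""

    def __init__(self, max_concurrent: int):
        self._max = max_concurrent
        self._lock = threading.Lock()
        self._active = defaultdict(int)  # key -> requests currently in flight

    def try_acquire(self, key: str) -> bool:
        # Returns False instead of blocking when the key is at its limit.
        with self._lock:
            if self._active[key] >= self._max:
                return False
            self._active[key] += 1
            return True

    def release(self, key: str) -> None:
        with self._lock:
            if self._active[key] > 0:
                self._active[key] -= 1


limiter = KeyLimiter(max_concurrent=2)
burst = [limiter.try_acquire("key-a") for _ in range(3)]
other_key = limiter.try_acquire("key-b")
limiter.release("key-a")
after_release = limiter.try_acquire("key-a")
print(burst, other_key, after_release)
```

The third request on `key-a` is refused while `key-b` is unaffected, which is the property that keeps one runaway job from starving the rest.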

Request retries

Transient failures retry with backoff before an error ever reaches your agent.
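The page does not give the backoff schedule, so the sketch below assumes exponential backoff with full jitter. `TransientError`, `retry_with_backoff`, and every default value are hypothetical.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a failure worth retrying (timeout, 429, 5xx, ...)."""


def retry_with_backoff(call, max_attempts=4, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles each attempt; jitter spreads retries out.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


calls = {"n": 0}


def flaky():
    # Fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("upstream hiccup")
    return "ok"


delays = []  # record delays instead of actually sleeping
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, calls["n"], len(delays))
```

Injecting `sleep` keeps the example fast to run. In real use the default `time.sleep` applies.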

Transparency

Know exactly where every token goes.

Every request is logged with the model that actually ran, token counts, cost, and latency. Click any request to open the full breakdown. Same data in the dashboard and the API.
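To make the per-request record concrete, here is a minimal sketch that turns a response's `usage` counts into a logged row. The `RequestRecord` shape, the `example-model` name, and the prices are placeholders for illustration, not Relay's schema or real pricing.

```python
from dataclasses import dataclass

# Placeholder prices in USD per million tokens. NOT real pricing.
PRICES = {"example-model": {"input": 1.00, "output": 4.00}}


@dataclass
class RequestRecord:
    model: str
    prompt_tokens: int
    completion_tokens: int
    cost_usd: float
    latency_ms: int


def build_record(model: str, usage: dict, latency_ms: int) -> RequestRecord:
    # `usage` mirrors the prompt_tokens / completion_tokens counts that
    # Chat Completions responses report.
    price = PRICES[model]
    cost = (
        usage["prompt_tokens"] * price["input"]
        + usage["completion_tokens"] * price["output"]
    ) / 1_000_000
    return RequestRecord(
        model=model,
        prompt_tokens=usage["prompt_tokens"],
        completion_tokens=usage["completion_tokens"],
        cost_usd=round(cost, 6),
        latency_ms=latency_ms,
    )


record = build_record(
    "example-model",
    {"prompt_tokens": 1200, "completion_tokens": 300},
    latency_ms=850,
)
print(record)
```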

Drop-in for the SDKs you already use.

Relay speaks both OpenAI protocols: Chat Completions and Responses. Change the base URL and the API key, keep everything else. The base_url and api_key lines are the only edits.

OpenAI-compatible /v1/chat/completions

from openai import OpenAI

client = OpenAI(
    base_url="https://relay.adxztech.com/v1",
    api_key="sk-...",
)

client.chat.completions.create(
    model="gpt-5.4",
    reasoning_effort="high",
    messages=[{"role": "user", "content": "Add a null check"}],
)

OpenAI Responses /v1/responses

from openai import OpenAI

client = OpenAI(
    base_url="https://relay.adxztech.com/v1",
    api_key="sk-...",
)

client.responses.create(
    model="gpt-5.5",
    reasoning={"effort": "high"},
    input="Add a null check",
)

From zero to first token in 60 seconds.

Create a key, point your base URL, send a request. The three steps below use the Python SDK.

  1. Create a key

    # relay.adxztech.com > Keys > Create
    export RELAY_KEY="sk-..."
  2. Point your base URL

    import os
    from openai import OpenAI

    client = OpenAI(base_url="https://relay.adxztech.com/v1",
                    api_key=os.environ["RELAY_KEY"])  # key exported in step 1
  3. Send a request

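    resp = client.chat.completions.create(
        model="gpt-5.4",
        messages=[{"role": "user", "content": "Ship it"}],
        stream=True,
    )
    # Print tokens as they arrive; some chunks carry no content.
    for chunk in resp:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)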

API key management

Scoped keys, rotation, and per-key limits.

Usage dashboard

Spend, tokens, and latency over any window.

Error logs

Every failure with status, upstream, and reason.

Team quotas

Budgets per member and per project.

Clear docs

Reference, guides, and copy-paste recipes.

Python, JS, and cURL

Runnable examples for every endpoint.

Point your agents at one endpoint.

Get a key, change your base URL, and keep shipping.