OpenAI /v1/responses

POST /v1/responses is the OpenAI Responses-compatible entry point, available since cc-router v2.3. It lets Codex CLI and any other OpenAI Responses-style client reuse every upstream subscription through cc-router without a second adapter.

Applies to cc-router v3.0.0 and later.

Protocol position and translation flow

Client (OpenAI Responses)                            cc-router                         Upstream
────────────────────────                  ─────────────────────────────────         ─────────────
POST /v1/responses ─────────────────────►│ handler::responses                  │
  body: OpenAI Responses format          │   ↓ request_to_anthropic            │
                                         │ Anthropic Messages format           │
                                         │   ↓ pipeline::dispatch              │──► pick sub / rewrite model
                                         │ pipeline returns Anthropic SSE/JSON │◄── upstream response
                                         │   ↓ translate_*_to_responses        │
                                         │ OpenAI Responses format             │
◄──────────────────── HTTP response──────│                                     │

The dispatch pipeline is untouched: every upstream provider path (including the codex / openai / gemini / kiro providers that already do protocol translation) is reused as-is. /v1/responses only wraps a request translator and a response translator around the same pipeline.

Request

POST /v1/responses
Content-Type: application/json

Header	Required	Notes
`Content-Type: application/json`	Yes	The body must be JSON
`x-api-key` or `Authorization: Bearer ...`	Per auth settings	`/v1/responses` is not on the auth allowlist, so one of these is required once auth is enabled
Other OpenAI SDK headers	No	Not consumed; not forwarded upstream (upstream protocol is Anthropic)

Auth caveat: /v1/responses uses the same token check as /v1/messages, but the 401 error body is still Anthropic-shaped because auth_layer runs before the handler. Only 4xx/5xx errors produced inside the responses handler use the OpenAI shape.

The request body is a subset of the OpenAI Responses API. cc-router consumes and translates these fields:

Field	Type	Required	Behavior
`model`	string	Yes	Resolved to a virtual model; supports the `gpt-5.5`, `gpt-5.4`, and `gpt-5.4-mini` aliases. A request without `model` returns `400`
`stream`	boolean	No (defaults to `false`)	`true` runs SSE translation; `false` runs JSON translation
`instructions`	string	No	Translated to Anthropic `system`; if `input` also carries developer/system role text, the two are concatenated
`input`	array	Yes	Flat item stream (`message` / `function_call` / `function_call_output` / `reasoning`), translated into Anthropic `messages`
`max_output_tokens`	integer	No	Translated to Anthropic `max_tokens` (defaults to `4096` when omitted — Anthropic requires it)
`reasoning.effort`	string	No	Translated to Anthropic `thinking { type: enabled, budget_tokens }` with budget mapping: `minimal → 0` / `low → 1024` / `medium → 8192` / `high → 16384`
`tools`	array	No	Translated to Anthropic tool schema
`tool_choice`	object	No	Translated to Anthropic `tool_choice`
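
The field table above can be read as a small translation function. The following is a minimal sketch, not cc-router's actual `request_to_anthropic`: the function name, the `"\n\n"` separator used when concatenating system text, and the handling of only `message` items are assumptions made for illustration. The budget values and the `4096` default come from the table.

```python
from typing import Any

# Budget mapping and default copied from the field table above.
EFFORT_TO_BUDGET = {"minimal": 0, "low": 1024, "medium": 8192, "high": 16384}
DEFAULT_MAX_TOKENS = 4096


def request_to_anthropic_sketch(body: dict[str, Any]) -> dict[str, Any]:
    """Illustrative translation of a Responses request (hypothetical helper)."""
    if "model" not in body:
        # The real handler answers 400 here; the sketch just raises.
        raise ValueError("missing `model`")

    system_parts = [body["instructions"]] if body.get("instructions") else []
    messages = []
    for item in body.get("input", []):
        if item.get("type") != "message":
            # function_call / function_call_output / reasoning are left out of this sketch.
            continue
        text = "".join(
            part.get("text", "")
            for part in item.get("content", [])
            if part.get("type") in ("input_text", "output_text")
        )
        if item.get("role") in ("system", "developer"):
            system_parts.append(text)  # merged into Anthropic `system`
        else:
            messages.append(
                {"role": item.get("role"), "content": [{"type": "text", "text": text}]}
            )

    out: dict[str, Any] = {
        "model": body["model"],  # virtual model resolution happens later, in dispatch
        "max_tokens": body.get("max_output_tokens", DEFAULT_MAX_TOKENS),
        "messages": messages,
        "stream": bool(body.get("stream", False)),
    }
    if system_parts:
        out["system"] = "\n\n".join(system_parts)  # separator is an assumption
    effort = (body.get("reasoning") or {}).get("effort")
    if effort in EFFORT_TO_BUDGET:
        out["thinking"] = {"type": "enabled", "budget_tokens": EFFORT_TO_BUDGET[effort]}
    return out
```

Calling it with the non-streaming example body would yield `max_tokens: 256` and `system: "You are concise."`; omitting `max_output_tokens` falls back to `4096`.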

Non-streaming request example

curl http://127.0.0.1:23456/v1/responses \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-5.4",
    "max_output_tokens": 256,
    "instructions": "You are concise.",
    "input": [
      { "type": "message", "role": "user",
        "content": [{ "type": "input_text", "text": "Explain cc-router in one sentence" }] }
    ]
  }' | jq

Response (non-streaming)

200 OK, Content-Type: application/json.
The body is the standard OpenAI Responses response JSON, built as follows:
- cc-router translates the Anthropic message returned by the pipeline into OpenAI Responses form
- output[] is flattened: text content_block → message item; tool_use → function_call item; thinking → reasoning item (encrypted_content round-trips through the Anthropic signature)
- usage.input_tokens / output_tokens are translated to OpenAI field names
- stop_reason is translated to status / incomplete_details (e.g. max_tokens → incomplete_details.reason = max_output_tokens)

{
  "id": "resp_xxx",
  "object": "response",
  "created_at": 1767225600,
  "status": "completed",
  "model": "gpt-5.4",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{ "type": "output_text", "text": "..." }]
    }
  ],
  "usage": { "input_tokens": 42, "output_tokens": 128, "total_tokens": 170 }
}
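
The flattening rules listed above can be sketched as follows. This is an illustration under stated assumptions, not the real `translate_json_to_responses`: the function name is hypothetical, `id` and `created_at` are omitted, the item field names (`call_id`, `arguments`, `summary`) follow the public OpenAI Responses shapes, and carrying the Anthropic `signature` directly in `encrypted_content` is one reading of the round-trip note. Treating every stop reason other than `max_tokens` as `completed` is also an assumption.

```python
import json
from typing import Any


def anthropic_to_responses_sketch(msg: dict[str, Any]) -> dict[str, Any]:
    """Flatten an Anthropic message into a Responses-style body (hypothetical helper)."""
    output = []
    for block in msg.get("content", []):
        kind = block.get("type")
        if kind == "text":
            output.append({
                "type": "message",
                "role": "assistant",
                "content": [{"type": "output_text", "text": block["text"]}],
            })
        elif kind == "tool_use":
            output.append({
                "type": "function_call",
                "call_id": block["id"],
                "name": block["name"],
                # Responses carries arguments as a JSON string, not an object.
                "arguments": json.dumps(block.get("input", {})),
            })
        elif kind == "thinking":
            output.append({
                "type": "reasoning",
                "summary": [{"type": "summary_text", "text": block.get("thinking", "")}],
                "encrypted_content": block.get("signature"),  # assumed carrier
            })

    usage = msg.get("usage", {})
    input_tokens = usage.get("input_tokens", 0)
    output_tokens = usage.get("output_tokens", 0)
    result: dict[str, Any] = {
        "object": "response",
        "status": "completed",
        "model": msg.get("model"),
        "output": output,
        "usage": {
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_tokens": input_tokens + output_tokens,
        },
    }
    if msg.get("stop_reason") == "max_tokens":
        result["status"] = "incomplete"
        result["incomplete_details"] = {"reason": "max_output_tokens"}
    return result
```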

Response (streaming SSE)

200 OK, Content-Type: text/event-stream. Only this header is set: cache-control and transfer-encoding are deliberately left unset so that axum manages chunked encoding itself, which avoids IncompleteMessage errors on HTTPS+rustls paths.
The stream is the OpenAI Responses SSE protocol, produced by an internal converter translating from upstream Anthropic SSE in real time.

Event mapping

Anthropic event	Translated to OpenAI Responses event
`message_start`	`response.created` + `response.in_progress`
`content_block_start (text)`	`response.output_item.added` + `response.content_part.added`
`content_block_delta (text_delta)`	`response.output_text.delta`
`content_block_start (thinking)`	`response.output_item.added` (reasoning item)
`content_block_delta (thinking_delta)`	`response.reasoning_summary_text.delta`
`content_block_start (tool_use)`	`response.output_item.added` (function_call item)
`content_block_delta (input_json_delta)`	`response.function_call_arguments.delta`
`content_block_stop`	`response.content_part.done` + `response.output_item.done`
`message_delta`	(usage extracted side-channel; no direct frame emitted)
`message_stop`	`response.completed`
Upstream disconnect / missing `message_stop`	A best-effort `response.completed` is emitted
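
The mapping table can be expressed as a small generator. The sketch below tracks event names only and is not the internal converter: the function name is hypothetical, and an upstream disconnect is modeled simply as the input iterator ending before `message_stop` (a real transport error would surface differently). The Anthropic event shapes (`content_block.type`, `delta.type`) follow the public Messages streaming format.

```python
from typing import Any, Iterable, Iterator

# Event names copied from the mapping table above.
BLOCK_START = {
    "text": ["response.output_item.added", "response.content_part.added"],
    "thinking": ["response.output_item.added"],
    "tool_use": ["response.output_item.added"],
}
BLOCK_DELTA = {
    "text_delta": "response.output_text.delta",
    "thinking_delta": "response.reasoning_summary_text.delta",
    "input_json_delta": "response.function_call_arguments.delta",
}


def translate_event_names(events: Iterable[dict[str, Any]]) -> Iterator[str]:
    """Yield Responses event names for a stream of Anthropic events (sketch)."""
    completed = False
    for ev in events:
        kind = ev.get("type")
        if kind == "message_start":
            yield "response.created"
            yield "response.in_progress"
        elif kind == "content_block_start":
            yield from BLOCK_START.get(ev.get("content_block", {}).get("type"), [])
        elif kind == "content_block_delta":
            name = BLOCK_DELTA.get(ev.get("delta", {}).get("type"))
            if name:
                yield name
        elif kind == "content_block_stop":
            yield "response.content_part.done"
            yield "response.output_item.done"
        elif kind == "message_stop":
            completed = True
            yield "response.completed"
        # message_delta: usage is extracted, but no frame is emitted.
    if not completed:
        # Best-effort terminal event when message_stop never arrived.
        yield "response.completed"
```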

Key differences from Anthropic SSE:

- There is no data: [DONE] sentinel. OpenAI Responses clients terminate on response.completed
- Disconnect safety net: when the upstream transport drops or the upstream omits message_stop, cc-router still emits a response.completed so that the OpenAI SDK does not hang waiting for a terminal event

Streaming request example

curl -N http://127.0.0.1:23456/v1/responses \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-5.4",
    "stream": true,
    "max_output_tokens": 256,
    "input": [
      { "type": "message", "role": "user",
        "content": [{ "type": "input_text", "text": "ping" }] }
    ]
  }'

The stream terminates with event: response.completed (no data: [DONE]).
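
On the client side, that termination rule can be handled with a few lines. This is a minimal sketch of a consumer, assuming each `data:` payload is a JSON object whose `type` field mirrors the event name and whose text deltas arrive in a `delta` field (the public OpenAI Responses wire format). The function name is hypothetical, and multi-line `data:` fields are not handled.

```python
import json
from typing import Iterable


def collect_output_text(lines: Iterable[str]) -> str:
    """Accumulate text deltas from SSE lines until response.completed (sketch)."""
    chunks = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip `event:` lines, comments, and blank separators
        payload_text = line[len("data:"):].strip()
        if not payload_text:
            continue
        payload = json.loads(payload_text)
        kind = payload.get("type")
        if kind == "response.output_text.delta":
            chunks.append(payload.get("delta", ""))
        elif kind == "response.completed":
            break  # terminal event: stop reading, do not wait for [DONE]
    return "".join(chunks)
```

The function accepts any iterable of decoded lines, so it can be fed from a file, a test fixture, or the line iterator of whichever HTTP client is in use.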

Dispatch

Identical to /v1/messages — /v1/responses resolves the client’s model field to a virtual model and runs the same dispatch + retry logic.

That means:

- gpt-5.5, gpt-5.4, gpt-5.4-mini, model-sonnet, claude-sonnet-4-6, and any other virtual model alias can all enter via /v1/responses
- Upstreams can be any of the 9 Anthropic providers or codex / openai / gemini / kiro; the client cannot tell which one served the request
- /v1/responses cannot bind the OAuth-only codex provider as fallback (same constraint as /v1/messages)

See the full table at Anthropic /v1/messages → Virtual model mapping.

Error responses

Errors produced inside /v1/responses follow the OpenAI Responses shape:

{
  "error": {
    "message": "...",
    "type": "<kind>",
    "code": null
  }
}

Status	Trigger	Source
`400`	Request body is not valid JSON	`handler::responses`
`400`	Missing `model` field	`handler::responses`
`400`	OpenAI → Anthropic translation failed (e.g. malformed `input`)	`request_to_anthropic`
`401`	Auth failure	Anthropic-shaped (auth_layer runs before the handler)
`500`	Internal pipeline error	`handler::responses`
`500`	Failed to read pipeline JSON body / failed to parse upstream response	`translate_json_to_responses`
Upstream status passthrough	The pipeline’s Anthropic error body is auto-translated to OpenAI shape	`translate_json_to_responses`
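
Because the 401 body is Anthropic-shaped while handler errors are OpenAI-shaped, a client may want to accept both. The helper below is a sketch of one way to do that; its name is hypothetical, and the Anthropic envelope it checks for (`{"type": "error", "error": {"type", "message"}}`) is the public Anthropic error format, assumed here to be what auth_layer returns.

```python
from typing import Any, Optional, Tuple


def error_summary(body: dict[str, Any]) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (shape, error type, message) for either error envelope (sketch)."""
    err = body.get("error")
    if not isinstance(err, dict):
        return ("unknown", None, None)
    # Anthropic wraps the error object in a top-level {"type": "error", ...}.
    shape = "anthropic" if body.get("type") == "error" else "openai"
    return (shape, err.get("type"), err.get("message"))
```

Both envelopes keep `type` and `message` inside the `error` object, so only the top-level `type` field is needed to tell them apart.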

Unimplemented OpenAI endpoints

The following official OpenAI endpoints are not implemented in cc-router by design:

- POST /v1/chat/completions (Chat Completions API): cc-router’s OpenAI compatibility layer only implements the Responses API
- GET /v1/responses/{id}, POST /v1/responses/{id}/cancel, and other Responses state endpoints: cc-router is a stateless proxy that does not store responses, so clients rely on the streaming response.completed event as the terminal signal
- OpenAI Assistants / Threads / Files API

cc-router’s intended clients are Codex CLI / Claude Code style stateless conversation clients that only need POST /v1/responses or POST /v1/messages plus GET /v1/models.