Chat Completions

POST /v1/chat/completions is the core endpoint. It takes a list of messages and returns a model completion. Recovea proxies OpenAI's request and response shapes unchanged (the same fields, in the same order, with the same types), so an unmodified OpenAI SDK works as-is. Optimization and metering run underneath and fail open: if that layer fails, your request still goes through.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.recovea.ai/v1",
    api_key="rcv_live_…",
)

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "Name the capital of France."},
    ],
)
print(resp.choices[0].message.content)

Request

Field | Type | Notes
--- | --- | ---
model | string | Required. A bare OpenAI id (e.g. gpt-4o) goes native to OpenAI; an OpenRouter vendor/model id (e.g. google/gemini-2.5-pro, exactly as listed by GET /v1/models) rides your OpenRouter key for the long tail. A bare non-OpenAI id returns a 404 naming the vendor/model id to use. For Claude models the native Anthropic-shaped /anthropic surface is also available. Echoed back in the response.
messages | array | Required. Conversation so far. Each item has a role and content (see below).
temperature | number | Sampling temperature, 0 to 2. Default 1.
top_p | number | Nucleus sampling, 0 to 1. Use this or temperature, not both.
max_tokens | integer | Cap on tokens generated in the completion.
n | integer | Number of choices to return. Default 1.
stop | string or array | Up to four sequences that halt generation.
tools | array | Function/tool definitions the model may call. Passes through unchanged.
tool_choice | string or object | auto, none, required, or a named function.
response_format | object | {"type": "json_object"} or a JSON schema for structured output.
stream | boolean | When true, tokens arrive as server-sent events. See Streaming.

This is a subset; any other field the OpenAI Chat Completions API accepts (presence_penalty, frequency_penalty, seed, logprobs, logit_bias, stream_options, user, …) is forwarded verbatim.
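
The constraints in the table can be checked client-side before a request is sent. The sketch below is only an illustration: build_request is a hypothetical helper (not part of Recovea or the OpenAI SDK), and it treats the "temperature or top_p, not both" advice as a hard rule, which is stricter than the API itself.

```python
def build_request(model, messages, **params):
    """Assemble keyword arguments for a chat completion call.

    Hypothetical helper: it enforces the ranges from the table above and
    passes every other field through untouched.
    """
    temperature = params.get("temperature")
    top_p = params.get("top_p")
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    stop = params.get("stop")
    if isinstance(stop, (list, tuple)) and len(stop) > 4:
        raise ValueError("stop accepts at most four sequences")
    # Fields not listed in the table (seed, user, ...) are kept as given.
    return {"model": model, "messages": messages, **params}


request = build_request(
    "gpt-4o",
    [{"role": "user", "content": "Name the capital of France."}],
    temperature=0.2,
    max_tokens=16,
    seed=7,
)
print(sorted(request))
```

With a client configured as in the first example, the result could be passed as client.chat.completions.create(**request).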

Two surfaces, one account. This OpenAI-shaped /v1 endpoint selects the provider from the model id: a bare OpenAI id goes native to OpenAI on your connected key, and an OpenRouter vendor/model id rides your OpenRouter key for the long tail. Anthropic is served natively on its own Anthropic-shaped /anthropic surface (same rcv_ key, same ledger). Cross-provider cascading (falling back to a cheaper model that still passes the quality gate) is planned but not yet active on the hot path. See How Recovea works.
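
To make the routing rule concrete, here is a minimal sketch of how a client might tell the two id shapes apart. classify_model_id is a hypothetical helper that only looks for the vendor/model slash. It cannot know whether a bare id is really an OpenAI model; the server decides that and answers with a 404 otherwise.

```python
def classify_model_id(model: str) -> str:
    """Return "openrouter" for vendor/model ids and "bare" for anything else.

    Hypothetical client-side check; the actual routing happens server-side.
    """
    vendor, sep, name = model.partition("/")
    if sep and vendor and name:
        return "openrouter"
    return "bare"


for model_id in ("gpt-4o", "google/gemini-2.5-pro"):
    print(model_id, "->", classify_model_id(model_id))
```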

Messages and roles

Each message carries a role and content:

  • system: instructions that steer the assistant.
  • user: end-user input.
  • assistant: a prior model turn. May include tool_calls.
  • tool: the result of a tool call, keyed by tool_call_id.

content is a string, or an array of content parts ({"type": "text", …}, {"type": "image_url", …}) for multimodal input.
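
As an illustration of the array form, the sketch below builds a user message with one text part and one image part. The field names follow OpenAI's documented content-part shape; user_message_with_image is a hypothetical helper and the URL is a placeholder.

```python
def user_message_with_image(text: str, image_url: str) -> dict:
    """Build a user message whose content is a list of content parts."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            # The image part nests the URL under an "image_url" object.
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


message = user_message_with_image(
    "What is in this picture?",
    "https://example.com/cat.png",  # placeholder URL
)
print([part["type"] for part in message["content"]])
```

The resulting dict goes into the messages list alongside plain string messages.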

Response

A non-streamed call returns a chat.completion object that is field-for-field identical to OpenAI's:

{
  "id": "chatcmpl-9x8a7b6c5d",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Paris.",
        "refusal": null
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 19,
    "completion_tokens": 2,
    "total_tokens": 21
  }
}

Field | Notes
--- | ---
id | Stable chatcmpl-… id for the response.
object | Always the literal "chat.completion".
created | Unix timestamp in seconds.
model | The model that served the request: the id you sent, echoed back.
choices[] | One entry per n, each with index, message, logprobs, finish_reason.
usage | Token accounting: prompt_tokens, completion_tokens, total_tokens.
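
Because usage has the same three counters on every response, totals across several calls can be summed directly. The sketch below assumes responses are available as plain dicts (for example, parsed JSON bodies); sum_usage is a hypothetical helper.

```python
def sum_usage(responses):
    """Add up the usage counters of several chat.completion dicts."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for response in responses:
        usage = response.get("usage") or {}
        for key in totals:
            # A missing counter is treated as zero.
            totals[key] += usage.get(key, 0)
    return totals


sample = {"usage": {"prompt_tokens": 19, "completion_tokens": 2, "total_tokens": 21}}
print(sum_usage([sample, sample]))
```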

finish_reason

The reason the model stopped, preserved exactly:

Value | Meaning
--- | ---
stop | Natural end, or hit a stop sequence.
length | Reached max_tokens or the context limit.
tool_calls | The model is calling one or more tools.
content_filter | Content was flagged and omitted.
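
A caller will typically branch on these values. The sketch below maps each one to a follow-up action; next_step and its return labels are hypothetical names chosen for illustration, and the choice is assumed to be a plain dict.

```python
def next_step(choice: dict) -> str:
    """Map a choice's finish_reason to an illustrative follow-up action."""
    reason = choice.get("finish_reason")
    if reason == "stop":
        return "done"
    if reason == "length":
        # Output was cut off; the caller may raise max_tokens or continue.
        return "truncated"
    if reason == "tool_calls":
        return "run_tools"
    if reason == "content_filter":
        return "filtered"
    return "unknown"


print(next_step({"index": 0, "finish_reason": "length"}))
```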

Tool / function calling

tools, tool_choice, and the legacy functions / function_call fields pass through unchanged in both directions. When the model decides to call a tool, finish_reason is tool_calls and message.tool_calls is populated:

{
  "message": {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"city\":\"Paris\"}"
        }
      }
    ]
  },
  "finish_reason": "tool_calls"
}

You then append a tool message keyed by tool_call_id and call again. It's the standard OpenAI loop, with no Recovea-specific changes.
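
One turn of that loop might look like the sketch below. It works on plain dicts shaped like the example above; get_weather is a stub, and TOOLS and run_tool_calls are hypothetical names rather than SDK or Recovea APIs.

```python
import json


def get_weather(city: str) -> str:
    # Stub tool; a real implementation would query a weather service.
    return f"Weather for {city}: unknown (stub)"


# Registry of local functions the model is allowed to call.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and return the matching tool messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = TOOLS[call["function"]["name"]]
        # arguments arrives as a JSON-encoded string, not an object.
        arguments = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": function(**arguments),
        })
    return results


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\":\"Paris\"}"},
    }],
}
print(run_tool_calls(assistant_message))
```

To continue the conversation, append the assistant message and then the returned tool messages to messages, and call the endpoint again.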

Headers

Every response carries two Recovea-added ids:

x-request-id:       req_…          (Recovea-generated; also your metering correlation id)
x-recovea-trace-id: trc_3f9a…c21   (correlates the call to your cost ledger)

Both are safe to log or ignore. On an exact-cache hit Recovea also adds x-recovea-cache: hit (absent otherwise). For rate limits, Recovea mints the six x-ratelimit-* headers and Retry-After / retry-after-ms on its own throttle responses. On a response proxied from your provider, the provider's own rate-limit headers pass through verbatim; Recovea fills them in only when the upstream sent none and never overwrites them. Beyond these additive headers, the response is byte-for-byte your provider's.
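
A minimal sketch of reading these headers follows. It assumes the headers are available as a mapping of names to string values; recovea_ids and retry_delay_seconds are hypothetical helpers, and only a numeric Retry-After is handled (an HTTP-date value yields None).

```python
def _lowered(headers):
    # Header names are case-insensitive, so normalize before lookup.
    return {name.lower(): value for name, value in headers.items()}


def recovea_ids(headers) -> dict:
    """Pick out the Recovea-added ids and the cache marker."""
    h = _lowered(headers)
    return {
        "request_id": h.get("x-request-id"),
        "trace_id": h.get("x-recovea-trace-id"),
        "cache_hit": h.get("x-recovea-cache") == "hit",
    }


def retry_delay_seconds(headers):
    """Return a retry delay in seconds, preferring retry-after-ms."""
    h = _lowered(headers)
    for name, scale in (("retry-after-ms", 1000.0), ("retry-after", 1.0)):
        value = h.get(name)
        if value is None:
            continue
        try:
            return float(value) / scale
        except ValueError:
            continue  # non-numeric value, e.g. an HTTP date
    return None


print(retry_delay_seconds({"Retry-After": "2", "retry-after-ms": "1500"}))
```

With recent versions of the OpenAI Python SDK, response headers should be reachable through client.chat.completions.with_raw_response.create(...), which returns an object exposing headers and a parse() method.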

Streaming and errors

  • Set stream: true to receive tokens as server-sent events. The chunk format, usage handling, and data: [DONE] terminator are documented in Streaming.
  • Failures return the standard OpenAI error envelope with matching status codes. See Errors.