Create Chat Completion
Generates a model response for a conversation. This endpoint is fully compatible with the OpenAI Chat Completions API format.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The model ID to use (e.g., meta-llama/Meta-Llama-3.1-8B-Instruct). |
| messages | array | Yes | An array of message objects representing the conversation. |
| max_tokens | integer | No | Maximum number of tokens to generate in the response. |
| temperature | number | No | Sampling temperature between 0 and 2. Higher values produce more random output. Default: 0.7. |
| top_p | number | No | Nucleus sampling parameter. Only tokens with cumulative probability up to top_p are considered. Default: 1. |
| stream | boolean | No | If true, partial responses are streamed as server-sent events. Default: false. |
Messages Format
Each message object must include a role and a content field:
| Role | Description |
|---|---|
| system | Sets the behavior and persona of the assistant. |
| user | The user's message or question. |
| assistant | A previous response from the assistant (for multi-turn context). |
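Putting the three roles together, a multi-turn messages array might look like the following sketch (the wording of each turn is purely illustrative):

```python
# System persona first, then alternating user and assistant turns;
# the assistant turn supplies prior context for the follow-up question.
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "What is nucleus sampling?"},
    {
        "role": "assistant",
        "content": "It samples only from the smallest set of tokens whose "
        "cumulative probability exceeds top_p.",
    },
    {"role": "user", "content": "How does it differ from temperature?"},
]

# Every message carries exactly the two required fields.
assert all(set(m) == {"role", "content"} for m in messages)
```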
Vision Input
For models that support vision, you can pass images by providing content as an array of content parts:
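A sketch of such a message, assuming the OpenAI-style text and image_url part types; the image URL is a placeholder:

```python
# A user message mixing text and an image. Each element of the content
# array is a content part with a "type" discriminator.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is shown in this image?"},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/photo.jpg"},  # placeholder
        },
    ],
}
```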
Example Request
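A minimal request sketch using Python's standard library; the base URL and API key are placeholders, and the /v1/chat/completions path is assumed from the OpenAI-compatible convention:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder base URL
API_KEY = "YOUR_API_KEY"              # placeholder API key

payload = {
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain top_p in one sentence."},
    ],
    "max_tokens": 128,
    "temperature": 0.7,
}

def create_chat_completion(payload: dict) -> dict:
    """POST the payload to the chat completions endpoint and
    return the parsed JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```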
Response
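A successful non-streaming response follows the OpenAI chat.completion shape; all field values below are illustrative:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1723456789,
  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Top-p restricts sampling to the most probable tokens."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 12,
    "total_tokens": 36
  }
}
```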
Streaming
When stream is set to true, the response is delivered as server-sent events (SSE). Each event contains a JSON chunk with a partial response:
Each data: line contains a JSON object. The stream ends with data: [DONE].
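A small sketch of consuming such a stream; the chunk shape (choices[0].delta.content) is assumed from the OpenAI streaming convention:

```python
import json

def parse_sse_lines(lines):
    """Yield the JSON chunk carried by each data: line,
    stopping at the [DONE] sentinel."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)

# Two illustrative chunks followed by the terminator.
stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]

# Concatenate the deltas to reassemble the full completion text.
text = "".join(
    chunk["choices"][0]["delta"]["content"] for chunk in parse_sse_lines(stream)
)
# text == "Hello"
```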