Runcrate Models API

The Runcrate Models API provides OpenAI-compatible endpoints for text generation, image creation, video generation, and audio processing. Use your existing OpenAI SDK or any HTTP client to get started immediately.

Base URL

https://api.runcrate.ai/v1

Authentication

All requests require a Bearer token in the Authorization header. You can generate an API key from the Dashboard.

curl https://api.runcrate.ai/v1/chat/completions \
  -H "Authorization: Bearer rc_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Meta-Llama-3.1-8B-Instruct", "messages": [{"role": "user", "content": "Hello"}]}'
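
The same request can be assembled with any HTTP client. Below is a minimal sketch using only Python's standard library. The helper name `build_chat_request` is illustrative and not part of the Runcrate API, and the request is built but not sent:

```python
import json
import urllib.request

BASE_URL = "https://api.runcrate.ai/v1"


def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) an authenticated chat completion request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            # The API key travels as a Bearer token, as in the curl example.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "rc_live_YOUR_API_KEY",
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [{"role": "user", "content": "Hello"}],
)
# To send it, pass the object to urllib.request.urlopen(request).
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without touching the network.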

Endpoints

Method  Path                      Description
POST    /v1/chat/completions      Create a chat completion
POST    /v1/images/generations    Generate an image
POST    /v1/videos                Submit a video generation job
GET     /v1/videos/{id}           Get video generation status
GET     /v1/videos/{id}/download  Download a completed video
POST    /v1/audio/speech          Generate speech from text
POST    /v1/audio/transcriptions  Transcribe audio to text
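
The video status and download endpoints are both keyed by the job id. Below is a minimal sketch of how the URLs for one job might be derived. `video_urls` is a hypothetical helper, and the id `vid_123` is made up, since the format of job ids is not specified here:

```python
from urllib.parse import quote

BASE_URL = "https://api.runcrate.ai/v1"


def video_urls(video_id: str) -> dict:
    """Return the status and download URLs for a submitted video job."""
    # Percent-encode the id so unexpected characters cannot alter the path.
    safe_id = quote(video_id, safe="")
    return {
        "status": f"{BASE_URL}/videos/{safe_id}",
        "download": f"{BASE_URL}/videos/{safe_id}/download",
    }


urls = video_urls("vid_123")  # "vid_123" is an illustrative id, not a real job
```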

HTTP Status Codes

Code  Description
200   Request succeeded
400   Bad request: invalid or missing parameters
401   Unauthorized: invalid or missing API key
429   Rate limit exceeded: too many requests
500   Internal server error

Error Response Format

All errors return a consistent JSON structure:

{
  "error": {
    "message": "A human-readable description of the error.",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
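
Because every error shares this shape, a client can convert error bodies into a single exception type. Below is a minimal sketch, assuming the three fields shown above. `RuncrateError` and `parse_error` are hypothetical names, and pairing the sample body with a 401 status is an assumption made for illustration:

```python
import json


class RuncrateError(Exception):
    """Hypothetical exception type wrapping the documented error body."""

    def __init__(self, status: int, message: str, type_: str, code: str):
        super().__init__(f"{status} {type_}/{code}: {message}")
        self.status = status
        self.message = message
        self.type = type_
        self.code = code


def parse_error(status: int, body: bytes) -> RuncrateError:
    """Turn an error response body into a RuncrateError."""
    error = json.loads(body).get("error", {})
    # Missing fields fall back to empty strings rather than raising KeyError.
    return RuncrateError(
        status,
        error.get("message", ""),
        error.get("type", ""),
        error.get("code", ""),
    )


sample = (
    b'{"error": {"message": "A human-readable description of the error.", '
    b'"type": "invalid_request_error", "code": "invalid_api_key"}}'
)
err = parse_error(401, sample)  # 401 is assumed here for illustration
```

Callers can then branch on `err.code` or `err.status` instead of matching on message text.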

Rate Limiting

API requests are rate-limited per API key. If you exceed the limit, you will receive a 429 response. The response includes the following headers:

Header                 Description
X-RateLimit-Limit      Maximum requests allowed per minute
X-RateLimit-Remaining  Requests remaining in the current window
X-RateLimit-Reset      Unix timestamp when the rate limit resets

If you need higher rate limits, contact us at support@runcrate.ai.
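
After a 429 response, a client can use the X-RateLimit-Reset header to decide how long to pause before retrying. Below is a minimal sketch, assuming the timestamp is expressed in seconds. `seconds_until_reset` is a hypothetical helper, and the header values are made up:

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Seconds to wait before retrying, based on X-RateLimit-Reset."""
    now = time.time() if now is None else now
    # Assumes the header holds a Unix timestamp in seconds.
    reset_at = float(headers["X-RateLimit-Reset"])
    # Never return a negative wait if the window has already reset.
    return max(0.0, reset_at - now)


# Illustrative header values, not real API output.
headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000030",
}
wait = seconds_until_reset(headers, now=1700000000.0)  # 30.0
```

A plain dict lookup is case-sensitive, so with a real HTTP client it may be safer to read the header through the client's own case-insensitive header object.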