Models FAQ
How do I call the API?
The Runcrate Models API is OpenAI-compatible. Point your existing OpenAI SDK or HTTP client to https://api.runcrate.ai/v1 and use your Runcrate API key. See the Quickstart for examples.
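As a minimal sketch using only the Python standard library (the API key and model name below are placeholders, and the request body follows the usual OpenAI-compatible chat completions shape):

```python
import json
import urllib.request

BASE_URL = "https://api.runcrate.ai/v1"

def build_chat_request(api_key, model, messages):
    """Build a POST request for the chat completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "YOUR_RUNCRATE_API_KEY",  # placeholder
    "gpt-4o",
    [{"role": "user", "content": "Hello!"}],
)
# urllib.request.urlopen(req) would send it; omitted here since it needs a live key.
```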
Can I use the OpenAI SDK?
Yes. Both the Python and JavaScript OpenAI SDKs work with Runcrate. Just change the base_url (Python) or baseURL (JavaScript) to https://api.runcrate.ai/v1 and use your Runcrate API key.
Which models are best for my use case?
| Use Case | Recommended Models |
|---|---|
| General chat and assistants | GPT-4o, Claude 4 Sonnet, DeepSeek-V3 |
| Complex reasoning and math | DeepSeek-R1, QwQ |
| Code generation | Codestral, DeepSeek-Coder, Qwen-Coder |
| Image generation | FLUX.1, FLUX.2, Stable Diffusion |
| Video generation | Sora 2, Veo 3, Kling |
| Voice synthesis | Kokoro, Orpheus |
| Transcription | Whisper |
Is streaming supported?
Yes. Streaming is supported for all chat completion models. Set stream: true in your request to receive responses as server-sent events.
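Concretely, a streamed response body is a series of data: lines ending with a [DONE] sentinel, each carrying a JSON chunk. A minimal stdlib sketch of parsing them (the sample chunks below are illustrative, not real API output):

```python
import json

def parse_sse_chunks(lines):
    """Yield parsed JSON payloads from server-sent event lines,
    stopping at the [DONE] sentinel used by OpenAI-compatible streams."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            return
        yield json.loads(payload)

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in parse_sse_chunks(sample))
# text == "Hello"
```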
How is model usage billed?
- Text models (chat, code, reasoning) — billed per token (input + output)
- Image generation — billed per image
- Video generation — billed per video
- Text-to-speech — billed per generation
- Speech-to-text — billed per audio minute
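For text models, the per-token math is straightforward. A sketch with hypothetical rates (the real per-million-token prices are listed in the Model Catalog):

```python
# Hypothetical rates in USD per token, stated here as USD per 1M tokens.
# Replace with the actual rates from the Model Catalog.
INPUT_RATE = 2.50 / 1_000_000
OUTPUT_RATE = 10.00 / 1_000_000

def chat_cost(input_tokens, output_tokens):
    """Cost of one chat completion: input and output tokens are billed separately."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

cost = chat_cost(1_200, 350)  # 1,200 prompt tokens + 350 completion tokens
# cost == 0.0065 (USD) at these illustrative rates
```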
What is the Playground?
The Playground is a web-based interface in the dashboard where you can test any model interactively without writing code. Go to Dashboard → Playground to try it. It uses your project’s default API key.
Are there rate limits?
Yes, rate limits apply to prevent abuse. Limits vary by model and account. If you hit a rate limit, you will receive a 429 Too Many Requests response. Wait briefly and retry.
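A common pattern for the retry is exponential backoff. A small sketch (RateLimitError is a stand-in name for whatever 429 error your HTTP client raises):

```python
import time

class RateLimitError(Exception):
    """Stand-in for the error your HTTP client raises on a 429 response."""

def with_retries(call, max_attempts=5, base_delay=1.0):
    """Invoke `call`, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```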
How do I check model pricing?
Go to the Model Catalog in the dashboard or the Pricing page in the docs. Each model shows its per-token or per-generation rate.