Documentation Index
Fetch the complete documentation index at: https://runcrate.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
For Agents
Fetch the complete documentation index at: https://runcrate.ai/docs/llms.txt Use this file to discover all available pages before exploring further.
Introducing Runcrate
Runcrate is the complete platform for AI teams to access open-source models and GPU compute. One account gives you production inference for 140+ models, on-demand GPU instances, dedicated clusters, and the SDKs to build with all of it.Quickstart
Make your first API call in under 60 seconds.
Model Catalog
Browse 140+ open-source models across text, image, video, and audio.
SDKs
Python and TypeScript clients. Drop-in OpenAI SDK replacements.
API Reference
Full REST API documentation for inference and infrastructure.
The Runcrate Platform
Everything your AI team needs: production inference, GPU compute, and dedicated clusters — all under one account and one bill.Inference Engine
OpenAI-compatible API for 140+ open-source models. Chat, image, video, TTS, ASR — billed per token or per generation.
GPU Compute
On-demand instances and dedicated clusters. H100, H200, B200, B300 with root SSH access.
Models API
Chat completions, image generation, video, TTS, and transcription endpoints.
GPU Instances
Deploy containers or VMs with dedicated NVIDIA GPUs in 60 seconds.
Storage
Persistent volumes with a built-in file explorer. Data survives instance termination.
Dedicated Clusters
Reserved bare-metal clusters from 16 to 128+ nodes with InfiniBand.
Explore use cases
See how teams use Runcrate to build AI products, run inference at scale, train models, and deploy custom servers.AI SaaS Backend
Build a production AI backend with chat, image generation, and RAG.
RAG Pipeline
Build retrieval-augmented generation with embeddings and vector search.
Fine-tune LLMs
Fine-tune Llama, Mistral, or Qwen on your own data with GPU instances.
Video Generation
Generate videos with Kling, Veo, Sora, and Seedance APIs.
Start building
Python SDK
Official Python client. Drop-in replacement for the OpenAI SDK.
TypeScript SDK
Official TypeScript client for Node.js and edge runtimes.
Vercel AI SDK
First-class Runcrate provider for the Vercel AI SDK.
MCP Server
Control Runcrate from Claude, Cursor, or any MCP-compatible AI assistant. Deploy instances, manage storage, and monitor usage with natural language.
CLI Overview
Deploy instances, SSH in, transfer files, and manage volumes from your terminal.
CLI Installation
Install on macOS, Linux, or Windows and authenticate in 30 seconds.
Which product do you need?
| Inference Engine | Compute | |
|---|---|---|
| Best for | Building AI features on open-source models | Training, fine-tuning, custom inference servers, reserved capacity |
| Billing | Per token / per generation | Per hour (instances) · Monthly (dedicated) |
| Setup time | 60 seconds | 60 seconds (instances) · 1–2 weeks (dedicated) |
| Commitment | None | None (instances) · 12–24 months (dedicated) |
| Access | Self-serve · API key | Self-serve (instances) · Contact sales (dedicated) |
| GPUs | Managed for you | H100, H200, B200, B300, A100, L40S, RTX 4090 |