Pricing
Runcrate offers three billable products: the Models API, GPU Instances, and Cloud Storage. All usage is deducted from your prepaid credit balance.
Models API
Models are billed per usage:
| Type | Billing Unit | Notes |
|---|---|---|
| Text models (Chat, Code, Reasoning) | Per token (input + output) | Varies by model |
| Image generation | Per image generated | Varies by model and resolution |
| Video generation | Per video generated | Varies by model and duration |
| Text-to-speech | Per generation | Varies by model and length |
| Speech-to-text | Per audio minute | Varies by model |
GPU Instances
Instances are billed per hour while running. Approximate rates:
| GPU | VRAM | Approx. Price/Hour |
|---|---|---|
| RTX 4090 | 24 GB | ~$0.30 |
| L40S | 48 GB | ~$0.80 |
| A100 40GB | 40 GB | ~$1.20 |
| A100 80GB | 80 GB | ~$1.80 |
| H100 | 80 GB | ~$2.50 |
Prices are approximate and may vary based on availability. Check the deployment page for current pricing when you configure an instance.
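
To compare GPUs before deploying, the approximate rates above can be turned into a quick estimate. The following is a minimal sketch, not an official calculator: the `APPROX_RATES` dictionary and the `estimate_instance_cost` function are hypothetical names, the rates are copied from the table and may differ from current pricing, and the calculation assumes cost scales linearly with running time (how partial hours are billed is not stated here).

```python
# Approximate hourly rates in USD, copied from the table above.
# Actual rates may vary with availability.
APPROX_RATES = {
    "RTX 4090": 0.30,
    "L40S": 0.80,
    "A100 40GB": 1.20,
    "A100 80GB": 1.80,
    "H100": 2.50,
}


def estimate_instance_cost(gpu: str, hours: float) -> float:
    """Return the approximate cost in USD of running `gpu` for `hours` hours.

    Assumption: cost is rate * hours, with no rounding of partial hours.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return round(APPROX_RATES[gpu] * hours, 2)


if __name__ == "__main__":
    for gpu in ("RTX 4090", "H100"):
        print(f"{gpu}: ~${estimate_instance_cost(gpu, 8):.2f} for 8 hours")
```

Under these assumptions, an 8-hour session costs roughly $2.40 on an RTX 4090 and roughly $20.00 on an H100.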
Cloud Storage
| Resource | Price |
|---|---|
| Storage volume | $0.03/GB/month (billed weekly) |
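
As a rough illustration of the storage arithmetic, the sketch below computes a monthly cost from the listed price and an approximate weekly charge. The function names are hypothetical, and the weekly figure assumes the yearly cost is split evenly over 52 weeks; the exact proration Runcrate uses for weekly billing is not stated here.

```python
STORAGE_PRICE_PER_GB_MONTH = 0.03  # USD, from the table above


def monthly_storage_cost(size_gb: float) -> float:
    """Return the monthly cost in USD of a volume of `size_gb` gigabytes."""
    return round(STORAGE_PRICE_PER_GB_MONTH * size_gb, 2)


def approx_weekly_storage_charge(size_gb: float) -> float:
    """Return an approximate weekly charge in USD.

    Assumption: 12 months of cost spread evenly over 52 weeks.
    """
    return round(STORAGE_PRICE_PER_GB_MONTH * size_gb * 12 / 52, 2)


if __name__ == "__main__":
    size = 500
    print(f"{size} GB: ${monthly_storage_cost(size):.2f}/month, "
          f"about ${approx_weekly_storage_charge(size):.2f}/week")
```

A 500 GB volume, for example, comes to $15.00 per month at the listed price.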
Billing Rules
| Rule | Details |
|---|---|
| Billing interval | Hourly for instances, per-request for API, weekly for storage |
| Minimum spend | None — pay only for what you use |
| Minimum top-up | $5 |
| Terminate anytime | Terminate an instance at any time to stop being charged for it |
| Prepaid credits | All usage is deducted from your credit balance |
| Egress fees | None |
Cost Optimization Tips
Terminate Idle Instances
Instances are billed while running, even if idle. Terminate instances you are not actively using.
Use the Right GPU
Do not pay for an H100 when an RTX 4090 handles your workload. Start small and scale up only if needed.
Enable Auto-Recharge
Auto-recharge helps you avoid losing work to an unexpected termination when your credit balance runs low. Set a recharge threshold that leaves you enough buffer time.
Monitor Burn Rate
Check your billing dashboard regularly to track spend and adjust usage patterns.
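
Burn rate also tells you how long your current balance will last and what auto-recharge threshold gives a chosen buffer. The sketch below is a back-of-the-envelope calculation, not part of any Runcrate tooling: `runway_hours` and `recharge_threshold` are hypothetical helpers, and both assume a constant hourly burn rate.

```python
def runway_hours(balance_usd: float, burn_rate_per_hour: float) -> float:
    """Hours until the balance reaches zero at a constant hourly burn rate."""
    if burn_rate_per_hour <= 0:
        raise ValueError("burn_rate_per_hour must be positive")
    return balance_usd / burn_rate_per_hour


def recharge_threshold(burn_rate_per_hour: float, buffer_hours: float) -> float:
    """Balance in USD at which a top-up should trigger to keep `buffer_hours` of runway."""
    if burn_rate_per_hour < 0 or buffer_hours < 0:
        raise ValueError("inputs must be non-negative")
    return round(burn_rate_per_hour * buffer_hours, 2)


if __name__ == "__main__":
    burn = 2.50  # e.g. one H100 at the approximate rate listed earlier
    print(f"Runway on $25: {runway_hours(25.0, burn):.1f} hours")
    print(f"Threshold for a 4-hour buffer: ${recharge_threshold(burn, 4):.2f}")
```

These figures cover instance spend only; Models API requests and storage charges draw from the same balance and shorten the runway.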