Use the DeepSeek V4 API

DeepSeek V4 Pro offers a 1M token context window — enough to process entire codebases, full-length books, or hundreds of pages of legal contracts in a single request. Access it through the Runcrate Models API with no GPU management and no waitlists.

Why DeepSeek V4 Pro

Feature	Detail
Context window	1,000,000 tokens
Architecture	Mixture-of-Experts
Best for	Long document analysis, code generation, multi-step reasoning
API compatibility	OpenAI-compatible chat completions

Basic chat completion

curl https://api.runcrate.ai/v1/chat/completions \
  -H "Authorization: Bearer rc_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-V4-Pro",
    "messages": [
      {"role": "user", "content": "Explain the mixture-of-experts architecture in three sentences."}
    ],
    "max_tokens": 512
  }'
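
For readers who prefer Python over curl, the same request can be sketched with only the standard library. This is an illustrative sketch, not an official client: the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the response parsing assumes the endpoint returns the usual OpenAI-style `choices[0].message.content` shape implied by the compatibility note above.

```python
import json
import urllib.request

API_URL = "https://api.runcrate.ai/v1/chat/completions"
MODEL = "deepseek-ai/DeepSeek-V4-Pro"


def build_chat_request(api_key: str, prompt: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the reply text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `send_chat_request(build_chat_request("rc_live_YOUR_API_KEY", "Explain the mixture-of-experts architecture in three sentences."))` would perform the same call as the curl command. Keeping the request construction separate from the network call makes the payload easy to inspect before anything is sent.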

Long document analysis (1M context)

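With a 1M-token context window, you can pass an entire document, such as a legal contract, research paper, or codebase, directly in the prompt. As long as the document and the requested output together fit within the window, there is no need for chunking or a RAG pipeline, and no context is lost between pieces.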

from openai import OpenAI
from pathlib import Path

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

# Load a full document (a 200-page contract, a novel, a codebase dump).
# An explicit encoding avoids platform-dependent defaults.
document = Path("contract.txt").read_text(encoding="utf-8")

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[
        {"role": "system", "content": "You are a legal analyst. Read the full contract and answer questions precisely, citing section numbers."},
        {"role": "user", "content": f"Here is the contract:\n\n{document}\n\nList every clause that limits liability, with the exact section number and a one-sentence summary of each."},
    ],
    max_tokens=4096,
)

print(response.choices[0].message.content)
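
Before sending a very large file, it can help to check roughly whether it fits in the window. The sketch below is a rough pre-flight estimate under stated assumptions: the four-characters-per-token ratio is a common heuristic for English text, not DeepSeek's actual tokenizer, and it assumes the prompt and `max_tokens` share the same 1,000,000-token budget. The function names and the `overhead_tokens` allowance are hypothetical.

```python
CONTEXT_WINDOW = 1_000_000  # context window size from the feature table
CHARS_PER_TOKEN = 4         # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate the token count as ceil(len(text) / chars_per_token)."""
    return -(-len(text) // chars_per_token)


def fits_in_context(
    document: str,
    max_tokens: int,
    overhead_tokens: int = 1_000,
    context_window: int = CONTEXT_WINDOW,
) -> bool:
    """Return True if the estimated prompt plus the requested output fits.

    overhead_tokens is a hypothetical allowance for the system prompt
    and the question wrapped around the document.
    """
    needed = estimate_tokens(document) + overhead_tokens + max_tokens
    return needed <= context_window
```

For example, `fits_in_context(document, max_tokens=4096)` could gate the request above. Because the estimate is approximate, leave a generous margin, or fall back to chunking when the check fails.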

Streaming

Set stream to true on any request (stream=True in the Python SDK) to receive the reply incrementally, then iterate over the chunks as they arrive. This works with the OpenAI SDK in both Python (for chunk in stream) and TypeScript (for await ... of stream).
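
A minimal Python sketch of the streaming loop follows. It assumes `client` is an `OpenAI` client configured as in the long-document example and that chunks follow the standard OpenAI streaming shape, with text in `choices[0].delta.content`. The helper names `iter_text` and `stream_reply` are hypothetical.

```python
from typing import Iterable, Iterator


def iter_text(stream: Iterable) -> Iterator[str]:
    """Yield text fragments from an OpenAI-style stream of chunks."""
    for chunk in stream:
        # Some chunks (for example a trailing usage chunk) carry no choices.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # delta.content is None on role-only and finish chunks.
        if delta.content:
            yield delta.content


def stream_reply(client, prompt: str, model: str = "deepseek-ai/DeepSeek-V4-Pro") -> str:
    """Print fragments as they arrive and return the complete reply."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=512,
        stream=True,
    )
    parts = []
    for text in iter_text(stream):
        print(text, end="", flush=True)
        parts.append(text)
    print()
    return "".join(parts)
```

Calling `stream_reply(client, "Summarize the mixture-of-experts idea.")` prints the answer incrementally and returns the full text once the stream ends. Skipping chunks with empty `choices` or `None` content keeps the loop robust to metadata-only chunks.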

Next steps

Analyze long documents with AI — compare 1M-context models side by side.
Extract structured data — combine DeepSeek V4 with schema-based extraction.
Model catalog — browse all available models and pricing.
