Alibaba’s Qwen family covers chat, code generation, vision understanding, and text-to-speech — all available through the Runcrate API with a single API key.
Available Qwen models
| Model | Category | Context | Strengths |
|---|---|---|---|
| Qwen3-Max | Chat | 128K | Flagship reasoning and instruction following |
| Qwen3.5-397B-A17B | Chat | 128K | MoE architecture, high throughput |
| Qwen3-Coder-480B-A35B-Instruct-Turbo | Code | 256K | Code generation, debugging, refactoring |
| Qwen3-VL-235B-A22B-Instruct | Vision | 128K | Image understanding, OCR, diagram analysis |
| Qwen3-TTS | TTS | — | Natural-sounding speech synthesis |
Chat — Qwen3-Max
```python
from runcrate import Runcrate

client = Runcrate(api_key="rc_live_YOUR_API_KEY")

response = client.models.chat_completion(
    model="Qwen/Qwen3-Max",
    messages=[
        {"role": "system", "content": "You are a helpful research assistant."},
        {"role": "user", "content": "Compare microservices vs monolith for a team of 5 engineers."},
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)
```
Code — Qwen3-Coder
Purpose-built for code generation with 256K context:
```python
from runcrate import Runcrate

client = Runcrate(api_key="rc_live_YOUR_API_KEY")

response = client.models.chat_completion(
    model="Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": "Review code for bugs, style issues, and performance."},
        {"role": "user", "content": "Review this:\n\n```python\ndef process(data):\n    result = []\n    for i in range(len(data)):\n        if data[i] != None:\n            result.append(data[i] * 2)\n    return result\n```"},
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)
```
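For reviews that span several files, the long context makes it practical to send them in a single request. The sketch below is one possible way to assemble such a prompt. It is not part of the Runcrate SDK: `pack_files` is a hypothetical helper, and the four-characters-per-token estimate is a rough assumption, not the model's real tokenizer.

```python
# Assumption: ~4 characters per token is a crude heuristic, not Qwen's tokenizer.
CHARS_PER_TOKEN = 4


def pack_files(files: dict[str, str], max_tokens: int = 200_000) -> str:
    """Join named source files into one prompt, stopping before the rough budget is exceeded."""
    parts = []
    used = 0
    for name, source in files.items():
        block = f"# File: {name}\n{source}\n"
        cost = len(block) // CHARS_PER_TOKEN + 1  # estimated tokens for this file
        if used + cost > max_tokens:
            break  # keep headroom for the instructions and the reply
        parts.append(block)
        used += cost
    return "\n".join(parts)
```

The returned string can be used as the `content` of the user message in the request above. The default budget is deliberately set below 256K so the system prompt and the response still fit.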
Vision — Qwen3-VL
Analyze images, extract text, and understand diagrams:
```python
from runcrate import Runcrate

client = Runcrate(api_key="rc_live_YOUR_API_KEY")

response = client.models.chat_completion(
    model="Qwen/Qwen3-VL-235B-A22B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this chart show? Summarize the key trends."},
            {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
        ],
    }],
    max_tokens=512,
)
print(response.choices[0].message.content)
```
TTS — Qwen3-TTS
```python
from runcrate import Runcrate

client = Runcrate(api_key="rc_live_YOUR_API_KEY")

response = client.models.text_to_speech(
    model="Qwen/Qwen3-TTS",
    input="Welcome to Runcrate. Your GPU instances are ready.",
    voice="alloy",
)
with open("welcome.mp3", "wb") as f:
    f.write(response.content)
```
Choosing the right Qwen model
| Task | Model | Why |
|---|---|---|
| General chat, reasoning | Qwen3-Max | Best overall quality |
| High-throughput chat | Qwen3.5-397B-A17B | MoE — fast and cheap per token |
| Code generation, review | Qwen3-Coder-480B | 256K context, code-specialized |
| Image analysis, OCR | Qwen3-VL-235B | Vision-language understanding |
| Speech synthesis | Qwen3-TTS | Natural TTS output |
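The table above can be encoded as a small lookup so application code selects a model by task label. This is only an illustrative sketch: `MODEL_BY_TASK` and `pick_model` are hypothetical names, and the Qwen3.5 model ID is an assumption that follows the `Qwen/<name>` pattern of the other examples.

```python
# IDs for Max, Coder, VL, and TTS come from the examples in this guide.
# Assumption: the Qwen3.5 ID follows the same "Qwen/<name>" pattern; verify before use.
MODEL_BY_TASK = {
    "chat": "Qwen/Qwen3-Max",
    "high_throughput_chat": "Qwen/Qwen3.5-397B-A17B",
    "code": "Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo",
    "vision": "Qwen/Qwen3-VL-235B-A22B-Instruct",
    "tts": "Qwen/Qwen3-TTS",
}


def pick_model(task: str) -> str:
    """Return the model ID for a task label, falling back to Qwen3-Max."""
    return MODEL_BY_TASK.get(task, MODEL_BY_TASK["chat"])
```

Falling back to Qwen3-Max for unknown labels is a design choice made here for convenience. Raising an error instead may be safer if a wrong model would be costly.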
Tips
- Qwen3-Max is the safe default for most chat tasks.
- Qwen3.5-397B-A17B is a mixture-of-experts (MoE) model that activates only 17B of its 397B parameters per token; use it when you need speed at scale.
- Qwen3-Coder handles 256K context for cross-file refactoring.
- Qwen3-VL supports multiple images in a single request.
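
To illustrate the multi-image tip, the sketch below builds a single user message containing several images. It assumes that multiple images are passed as repeated `image_url` parts in one `content` list, mirroring the single-image Vision example. This guide does not confirm that format, and `build_multi_image_message` is a hypothetical helper.

```python
def build_multi_image_message(prompt: str, image_urls: list[str]) -> dict:
    """Build one user message: a text part followed by one image_url part per URL."""
    content = [{"type": "text", "text": prompt}]
    for url in image_urls:
        # Assumption: each image is its own image_url part in the same content list.
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}
```

The result can be passed as `messages=[build_multi_image_message(...)]` to `client.models.chat_completion` with the Qwen3-VL model, in the same way as the Vision example.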
Next steps