ggml-org/gpt-oss-120b-GGUF


gpt-oss-120b

Detailed guide for using this model with llama.cpp:

https://github.com/ggml-org/llama.cpp/discussions/15396

Quick start:

```sh
# -c 0 uses the model's full trained context length; --jinja enables the
# model's built-in chat template.
llama-server -hf ggml-org/gpt-oss-120b-GGUF -c 0 --jinja

# Then access http://localhost:8080
```
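Besides the built-in web UI, `llama-server` exposes an OpenAI-compatible HTTP API. A minimal sketch of a chat-completion request, assuming the server above is running on the default port 8080 (the prompt and `max_tokens` value are illustrative):

```python
import json
import urllib.request

# Illustrative chat request against llama-server's OpenAI-compatible
# /v1/chat/completions endpoint (default port 8080).
payload = {
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())
#     print(reply["choices"][0]["message"]["content"])
```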