This repository provides the AWQ 4-bit quantized version of meta-llama/Llama-3.3-70B-Instruct, originally developed by Meta AI.
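Below is a minimal usage sketch, assuming the quantized weights load through the standard `transformers` API with the `autoawq` backend installed; the repository id shown is a placeholder for this repo's actual id.

```python
# Minimal sketch: loading an AWQ 4-bit quantized Llama-3.3-70B-Instruct checkpoint
# with the Hugging Face transformers API. Requires: pip install transformers autoawq
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/Llama-3.3-70B-Instruct-AWQ"  # placeholder: substitute this repo's id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread the 4-bit weights across available GPUs
    torch_dtype="auto",
)

# Build a chat prompt with the Llama 3.3 chat template and generate a reply.
messages = [{"role": "user", "content": "Summarize what AWQ quantization does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The AWQ 4-bit weights substantially reduce the memory footprint relative to the full-precision 70B model, though multi-GPU or a large single GPU is still typically required.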