Run Llama 3.3 70B Speculative Decoding on your data

Llama 3.3 70B Speculative Decoding is an LLM that excels in a wide variety of tasks, such as question answering, reasoning, and code generation.

Some noteworthy use cases of Llama 3.3 70B Speculative Decoding include synthetic data generation and judging outputs from smaller models.

*Quantization is specific to the inference provider, and other providers may offer the model at different quantization levels.

Model	Inference provider	Input modality	Output modality	Input price (1M tokens)	Output price (1M tokens)
Llama 3.1 405B Instruct	Fireworks AI	text	text	$3.00	$3.00
Llama 3.1 405B Instruct Turbo	Together.ai	text	text	$3.50	$3.50
Llama 3.1 70B Instruct	Fireworks AI	text	text	$0.90	$0.90
Llama 3.1 70B Instruct Turbo	Together.ai	text	text	$0.88	$0.88
Llama 3.1 8B	Groq	text	text	$0.05	$0.08
Llama 3.1 8B Instruct	Fireworks AI	text	text	$0.20	$0.20
Llama 3.1 8B Instruct Turbo	Together.ai	text	text	$0.18	$0.18
Llama 3.1 Nemotron 70B Instruct	Together.ai	text	text	$0.90	$0.90
Llama 3.2 11B Vision	Groq	image	text	$0.18	$0.18
Llama 3.2 1B	Groq	text	text	$0.04	$0.04
Llama 3.2 3B	Groq	text	text	$0.06	$0.06
Llama 3.2 3B Instruct Turbo	Together.ai	text	text	$0.06	$0.06
Llama 3.2 90B Vision (Preview)	Groq	image	text	$0.90	$0.90
Llama 3.3 70B Instruct	Fireworks AI	text	text	$0.90	$0.90
Llama 3.3 70B Instruct Turbo	Together.ai	text	text	$0.88	$0.88
Llama 3.3 70B Speculative Decoding	Groq	text	text	$0.59	$0.59
Llama 3.3 70B Versatile 128k	Groq	text	text	$0.59	$0.79
Llama 4 Maverick	Fireworks AI	text, image	text	$0.22	$0.88
Llama 4 Maverick	Together.ai	text, image	text	$0.27	$0.85
Llama 4 Scout	Together.ai	text, image	text	$0.18	$0.59
Llama 4 Scout	Fireworks AI	text, image	text	$0.15	$0.60
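
To make the per-1M-token prices in the table easier to compare, here is a minimal sketch of the arithmetic. It assumes the two price columns are USD per 1M input tokens and per 1M output tokens, in that order. The function name `estimate_cost` and the workload sizes are hypothetical and not part of any provider's API.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the cost in USD of a workload from per-1M-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# (input price, output price) copied from the table, assumed USD per 1M tokens.
SPECULATIVE_DECODING_GROQ = (0.59, 0.59)  # Llama 3.3 70B Speculative Decoding
VERSATILE_128K_GROQ = (0.59, 0.79)        # Llama 3.3 70B Versatile 128k

# Hypothetical workload: 1M input tokens and 1M output tokens.
specdec_cost = estimate_cost(1_000_000, 1_000_000, *SPECULATIVE_DECODING_GROQ)
versatile_cost = estimate_cost(1_000_000, 1_000_000, *VERSATILE_128K_GROQ)

print(f"Speculative Decoding: ${specdec_cost:.2f}")
print(f"Versatile 128k:       ${versatile_cost:.2f}")
```

Because input and output tokens are priced separately for some rows, the ratio of prompt length to generated length in your own data determines which row is cheapest.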