Run Models on Your Data
Choose the right model, get to the perfect prompt, kick off the data flywheel.
Oxen makes it easy to improve your use of state-of-the-art AI.
51 models across 7 inference providers. New models added every week.
DeepSeek, Google, Meta, Mistral AI, Nous Research, OpenAI, Qwen
DeepSeek R1 (FP8)
Jan 2025
text → text
DeepSeek-R1 is an advanced large language model (LLM) that achieves performance comparable to OpenAI's o1 across math, code, and reasoning tasks.
Inference by Together.ai
Input: $7.00 / Output: $7.00
DeepSeek R1
Jan 2025
text → text
DeepSeek R1 demonstrates emergent behaviors like chain-of-thought reasoning, self-verification, and error correction.
Inference by Fireworks AI
Input: $8.00 / Output: $8.00
DeepSeek V3 (FP8)
Dec 2024
text → text
DeepSeek-V3 is a 671B-parameter Mixture-of-Experts (MoE) language model with 37B active parameters per token.
Inference by Together.ai
Input: $1.25 / Output: $1.25
DeepSeek V3
Dec 2024
text → text
DeepSeek V3 is a state-of-the-art open-source large language model (LLM) with 671 billion parameters, of which 37 billion are activated per token.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Hermes 3 70B
Dec 2024
text → text
Hermes 3 70B is the latest large language model in the Hermes series developed by Nous Research.
Inference by Lambda Labs
Input: $0.20 / Output: $0.20
Hermes 3 8B
Dec 2024
text → text
Hermes 3 8B is the latest iteration in the Hermes series of large language models, built on Llama 3.1 8B.
Inference by Lambda Labs
Input: $0.03 / Output: $0.03
Llama 3.3 70B Instruct
Dec 2024
text → text
Llama 3.3 70B Instruct is a powerful large language model developed by Meta AI.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Llama 3.3 70B Speculative Decoding
Dec 2024
text → text
Llama 3.3 70B Instruct with Speculative Decoding is Meta's latest large language model, optimized for efficient inference.
Inference by Groq
Input: $0.59 / Output: $0.59
Llama 3.3 70B Instruct Turbo
Dec 2024
text → text
Llama 3.3 70B Instruct Turbo is a state-of-the-art large language model released by Meta on December 6, 2024.
Inference by Together.ai
Input: $0.88 / Output: $0.88
QwQ 32B Preview
Nov 2024
text → text
QwQ-32B-Preview is an experimental research model developed by the Qwen Team, focused on advancing AI reasoning capabilities.
Inference by Together.ai
Input: $1.20 / Output: $1.20
QwQ 32B Preview
Nov 2024
text → text
QwQ-32B-Preview is an experimental research model developed by the Qwen Team at Alibaba, focused on advancing AI reasoning capabilities.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Qwen 2.5 Coder 32B Instruct
Nov 2024
text → text
Qwen2.5-Coder-32B-Instruct is a state-of-the-art open-source large language model for code generation, reasoning, and fixing.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Qwen 2.5 Coder 32B Instruct
Nov 2024
text → text
Qwen2.5-Coder-32B-Instruct is a state-of-the-art open-source large language model for code generation, reasoning, and fixing.
Inference by Together.ai
Input: $0.80 / Output: $0.80
Llama 3.1 8B
Oct 2024
text → text
Llama 3.1 8B is a flexible model from Meta that is very capable for its size. It has a context length of 128k tokens and can generate up to 8,192 output tokens.
Inference by Groq
Input: $0.05 / Output: $0.08
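When input and output prices differ, as in the listing above, a rough per-request cost helps when comparing models. The sketch below assumes the listed prices are US dollars per 1 million tokens, which this page does not state; `estimate_cost`, its parameters, and the request size are hypothetical and for illustration only.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float,
                  tokens_per_unit: int = 1_000_000) -> float:
    """Estimate the cost of one request in dollars.

    Assumption: prices are quoted per `tokens_per_unit` tokens
    (1M by default); the listing itself does not state the unit.
    """
    input_cost = input_tokens / tokens_per_unit * input_price
    output_cost = output_tokens / tokens_per_unit * output_price
    return input_cost + output_cost


# Llama 3.1 8B on Groq, as listed: Input $0.05 / Output $0.08.
# Hypothetical request: 100,000 prompt tokens, 8,192 generated tokens.
cost = estimate_cost(100_000, 8_192, input_price=0.05, output_price=0.08)
print(f"${cost:.6f}")
```

If the billing unit turns out to be different, only `tokens_per_unit` needs to change.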
Llama 3.2 1B
Oct 2024
text → text
Llama 3.2 1B is an extremely small, fast, and cheap model from Meta that is very capable for its size.
Inference by Groq
Input: $0.04 / Output: $0.04
Ministral 8B
Oct 2024
text → text
Ministral 8B is an 8 billion parameter large language model developed by Mistral AI, featuring a unique interleaved sliding-window attention pattern for faster, memory-efficient inference.
Inference by Mistral AI
Input: $0.10 / Output: $0.10
Llama 3.1 Nemotron 70B Instruct
Oct 2024
text → text
Llama 3.1 Nemotron 70B Instruct is a large language model developed by NVIDIA, based on Meta's Llama 3.1 70B architecture.
Inference by Together.ai
Input: $0.90 / Output: $0.90
Llama 3.2 11B Vision
Sep 2024
image → text
Llama 3.2 11B Vision (Preview) is a powerful multimodal large language model developed by Meta.
Inference by Groq
Input: $0.18 / Output: $0.18
Llama 3.2 3B
Sep 2024
text → text
Llama 3.2 3B is a lightweight, text-only large language model developed by Meta. It features 3 billion parameters and a 131,072-token context window.
Inference by Groq
Input: $0.06 / Output: $0.06
Llama 3.2 90B Vision (Preview)
Sep 2024
image → text
Llama 3.2 90B Vision (Preview) is a large multimodal language model developed by Meta, designed for image-in, text-out tasks.
Inference by Groq
Input: $0.90 / Output: $0.90
Llama 3.2 3B Instruct Turbo
Sep 2024
text → text
Llama 3.2 3B Instruct Turbo is a compact and efficient large language model developed by Meta AI.
Inference by Together.ai
Input: $0.06 / Output: $0.06
Gemini 1.5 Flash - 8B
Sep 2024
text → text
Gemini 1.5 Flash-8B is a compact and efficient large language model developed by Google DeepMind.
Inference by Google
Input: $0.038 / Output: $0.15
Qwen2.5 72B Instruct
Sep 2024
text → text
Qwen2.5-72B-Instruct is a large language model with 72 billion parameters, part of the Qwen2.5 series.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Qwen2 VL 72B Instruct
Sep 2024
image → text
Qwen2-VL-72B-Instruct is a large vision-language model that combines advanced visual understanding with instruction-tuned language capabilities.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Mistral Small
Sep 2024
text → text
Mistral Small 3 is a cutting-edge 24B-parameter language model released in January 2025 by Mistral AI.
Inference by Mistral AI
Input: $0.20 / Output: $0.60
o1 mini
Sep 2024
text → text
OpenAI's o1-mini is a smaller, faster, and more cost-effective version of the o1 model series, designed for complex reasoning tasks.
Inference by OpenAI
Input: $3.00 / Output: $12.00
o1 preview
Sep 2024
text → text
OpenAI's o1-preview is an advanced large language model designed to excel at complex reasoning tasks.
Inference by OpenAI
Input: $15.00 / Output: $60.00
Pixtral 12B
Sep 2024
text → text
Pixtral 12B is an advanced multimodal AI model developed by Mistral AI. It combines a 12-billion-parameter text decoder with a vision encoder, enabling it to process and understand both text and images simultaneously.
Inference by Mistral AI
Input: $0.15 / Output: $0.15
Hermes 3 405B
Aug 2024
text → text
Hermes 3 405B is a frontier-level large language model developed by Nous Research, representing a full-parameter fine-tune of the Llama 3.1 405B foundation model.
Inference by Lambda Labs
Input: $0.90 / Output: $0.90
Llama 3.3 70B Versatile 128k
Jul 2024
text → text
Llama 3.3 70B Instruct is the December 2024 update of Llama 3.1 70B. It delivers performance similar to Llama 3.1 405B with significant speed and cost improvements.
Inference by Groq
Input: $0.59 / Output: $0.79
Llama 3.1 70B Instruct
Jul 2024
text → text
Llama 3.1 70B Instruct is a large language model developed by Meta with 70 billion parameters.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Llama 3.1 8B Instruct Turbo
Jul 2024
text → text
Meta Llama 3.1 8B Instruct Turbo is a multilingual large language model developed by Meta. It features a context window of 128k tokens.
Inference by Together.ai
Input: $0.18 / Output: $0.18
Llama 3.1 405B Instruct
Jul 2024
text → text
Llama 3.1 405B Instruct is Meta's largest open-source language model, released on July 23, 2024.
Inference by Fireworks AI
Input: $3.00 / Output: $3.00
Llama 3.1 8B Instruct
Jul 2024
text → text
The Llama 3.1 8B Instruct model is an auto-regressive language model developed by Meta, featuring 8 billion parameters.
Inference by Fireworks AI
Input: $0.20 / Output: $0.20
Llama 3.1 70B Instruct Turbo
Jul 2024
text → text
Meta Llama 3.1 70B Instruct Turbo is a large language model developed by Meta, released on July 23, 2024.
Inference by Together.ai
Input: $0.88 / Output: $0.88
Llama 3.1 405B Instruct Turbo
Jul 2024
text → text
Meta Llama 3.1 405B Instruct Turbo is a state-of-the-art large language model developed by Meta.
Inference by Together.ai
Input: $3.50 / Output: $3.50
Mistral Nemo
Jul 2024
text → text
Mistral NeMo is a 12-billion-parameter large language model (LLM) developed collaboratively by Mistral AI and NVIDIA.
Inference by Mistral AI
Input: $0.15 / Output: $0.15
GPT 4o mini
Jul 2024
text → text, image → text
GPT-4o Mini is OpenAI's latest cost-efficient small model, offering advanced AI capabilities at a fraction of the cost of larger models.
Inference by OpenAI
Input: $0.15 / Output: $0.60
Gemma 2 9B Instruct
Jun 2024
text → text
Gemma 2 9B Instruct is an open-source large language model developed by Google, released in 2024.
Inference by Groq
Input: $0.20 / Output: $0.20
Codestral
May 2024
text → text
Codestral 25.01 is Mistral AI's latest code generation model.
Inference by Mistral AI
Input: $0.20 / Output: $0.60
Gemini 1.5 Pro
May 2024
text → text
Gemini 1.5 Pro is Google DeepMind's advanced multimodal large language model.
Inference by Google
Input: $1.25 / Output: $5.00
Text Embedding 004
May 2024
text → embeddings
Text Embedding 004 is an embedding model designed for generating high-quality text embeddings.
Inference by Google
Input: $0.02 / Output: $0.02
Gemini 1.5 Flash
May 2024
text → text
Gemini 1.5 Flash is a multimodal LLM designed for high-volume, rapid processing tasks. It handles multimodal inputs (text, images, audio, and video) efficiently while balancing performance against computational cost.
Some other noteworthy features of Gemini 1.5 Flash include its ability to process up to 1 million tokens in context and its optimization for speed and efficiency in tasks such as summarization, chat, image and video captioning, and data extraction.
| Metric | Value |
|--------------------|--------------------|
| Parameter Count | Unknown |
| Mixture of Experts | Yes |
| Context Length | 1,000,000 tokens |
| Multilingual | Yes |
| Quantized* | Unknown |
\*_Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers._
Inference by Google
Input: $0.075 / Output: $0.30
GPT 4o
May 2024
text → text, image → text
GPT-4o is OpenAI's latest large language model, released in May 2024.
Inference by OpenAI
Input: $2.50 / Output: $10.00
Mixtral 8x22B
Apr 2024
text → text
Mixtral 8x22B is Mistral AI's most advanced open-source large language model as of April 2024.
Inference by Mistral AI
Input: $2.00 / Output: $6.00
Mistral Large
Feb 2024
text → text
Mistral Large is an LLM that excels in complex multilingual reasoning tasks, including text understanding, transformation, and code generation. It is particularly proficient in English, French, Spanish, German, and Italian, with a nuanced understanding of grammar and cultural context.
Some other noteworthy features of Mistral Large include its ability to handle long-form content with a 32K token context window and its native support for function calling and JSON output.
| Metric | Value |
|--------------------|--------------------|
| Parameter Count | 123 billion |
| Mixture of Experts | No |
| Context Length | 32,768 tokens |
| Multilingual | Yes |
| Quantized* | Unknown |
\*_Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers._
Inference by Mistral AI
Input: $2.00 / Output: $6.00
Text Embedding 3 - Small
Jan 2024
text → embeddings
Text Embedding 3 - Small is an embedding model designed for generating high-quality vector representations of text.
It excels in producing compact and meaningful embeddings for various natural language processing tasks, offering improved performance over its predecessor while maintaining efficiency in terms of latency and storage.
Some other noteworthy features of Text Embedding 3 - Small include multilingual support and the ability to adjust embedding dimensions.
| Metric | Value |
|--------------------|--------------------|
| Parameter Count | Unknown |
| Mixture of Experts | No |
| Context Length | Unknown |
| Multilingual | Yes |
| Quantized* | Unknown |
\*_Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers._
Inference by OpenAI
Input: $0.02 / Output: $0.02
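The page does not show how embedding dimensions are adjusted; providers typically expose this as a request option. As a client-side illustration only, the sketch below shortens a vector by keeping its leading components and rescaling the result to unit length. Whether truncated vectors stay useful depends on how the model was trained, so treat that as an assumption; `shorten_embedding` is a hypothetical helper and the input is a toy vector, not real model output.

```python
import math


def shorten_embedding(vector: list[float], dims: int) -> list[float]:
    """Keep the first `dims` components and rescale to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    truncated = vector[:dims]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        return truncated  # all-zero prefix: nothing to rescale
    return [x / norm for x in truncated]


# Toy 4-dimensional vector standing in for a real embedding.
full = [3.0, 4.0, 0.0, 12.0]
short = shorten_embedding(full, 2)
print(short)
```

Shorter vectors reduce storage and speed up similarity search, at some cost in retrieval quality.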
Text Embedding 3 - Large
Jan 2024
text → embeddings
Text Embedding 3 - Large is an advanced embedding model that converts text into high-dimensional vector representations.
Inference by OpenAI
Input: $0.13 / Output: $0.13
Mixtral 8x7B
Dec 2023
text → text
Mixtral 8x7B is an open-source large language model developed by Mistral AI. It utilizes a Sparse Mixture of Experts (SMoE) architecture, allowing access to 47 billion total parameters while only using 13 billion active parameters per token.
Inference by Mistral AI
Input: $0.70 / Output: $0.70
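The gap between total and active parameters in the description above can be expressed as a simple ratio. This is a minimal sketch using only the two figures quoted there; `active_fraction` is a hypothetical helper name, and the ratio says nothing about memory use, since all experts still have to be loaded.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a sparse MoE model's parameters used for each token."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active_params_b / total_params_b


# Figures from the Mixtral 8x7B description: 47B total, 13B active.
share = active_fraction(47, 13)
print(f"{share:.1%} of parameters are active per token")
```

The same function applies to any other Mixture-of-Experts entry that lists both counts.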
Mistral 7B
Sep 2023
text → text
Mistral 7B is an open-source large language model released by Mistral AI in September 2023.
Inference by Mistral AI
Input: $0.25 / Output: $0.25
Ministral 3B
text → text
High-quality edge model with only 3B parameters. Test this model through our UI, then download the weights and run it on edge devices.
Inference by Mistral AI
Input: $0.04 / Output: $0.04