Run Models on Your Data
Choose the right model, get to the perfect prompt, kick off the data flywheel.
Oxen makes it easy to get more out of state-of-the-art AI.
51 models, from 7 providers. New models added every week.
Mistral AI
Ministral 3B
High-quality edge model with only 3B parameters. Test this model through our UI, then download the weights and run it on edge devices.
Input: $0.04 / Output: $0.04
text → text
Ministral 8B
Powerful edge model with extremely high performance/price ratio.
Input: $0.10 / Output: $0.10
text → text
Pixtral 12B
A 12B model with image understanding capabilities in addition to text.
Input: $0.15 / Output: $0.15
text → text
Mistral Nemo
Multilingual open source model.
Input: $0.15 / Output: $0.15
text → text
Open Mistral 7B
Decoder-only transformer for chat-based use.
Input: $0.25 / Output: $0.25
text → text
Mistral Small
Enterprise-grade 22B parameter small model. The latest version, v2, was released in September 2024.
Input: $0.20 / Output: $0.60
text → text
Codestral
Cutting-edge language model for coding.
Input: $0.20 / Output: $0.60
text → text
Open Mixtral 8x7B
Sparse Mixture-of-Experts (MoE) model with a total of 45 billion parameters. Best model overall regarding cost/performance trade-offs.
Input: $0.70 / Output: $0.70
text → text
Mistral Large
Reasoning model for high-complexity tasks. The latest version, v2, was released in July 2024.
Input: $2.00 / Output: $6.00
text → text
Open Mixtral 8x22B
Sparse Mixture-of-Experts (SMoE) model that uses only 39B active parameters out of 141B, offering unparalleled cost efficiency for its size.
Input: $2.00 / Output: $6.00
text → text
OpenAI
Text Embedding 3 - Small
Smaller text embeddings with a dimension of
Input: $0.02 / Output: $0.02
text → embeddings
Text Embedding 3 - Large
Larger text embeddings
Input: $0.13 / Output: $0.13
text → embeddings
GPT-4o mini
Affordable and intelligent small model for fast, lightweight tasks.
Input: $0.15 / Output: $0.60
text → text, image → text
GPT-4o
High-intelligence flagship model for complex, multistep tasks.
Input: $2.50 / Output: $10.00
text → text, image → text
o1 mini
More efficient than o1-preview with similar reasoning capabilities.
Input: $3.00 / Output: $12.00
text → text
o1 preview
Language model trained with reinforcement learning to perform complex reasoning.
Input: $15.00 / Output: $60.00
text → text
Fireworks AI
Llama v3.1 8B Instruct
The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models with 8B
Input: $0.20 / Output: $0.20
text → text
Llama v3.1 70B Instruct
The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models with 70B
Input: $0.90 / Output: $0.90
text → text
Qwen 2.5 Coder 32B Instruct
Qwen 2.5 Coder is the latest series of code-specific Qwen large language models.
Input: $0.90 / Output: $0.90
text → text
Qwen2.5 72B Instruct
Qwen2.5 is a series of decoder-only language models developed by the Qwen team at Alibaba.
Input: $0.90 / Output: $0.90
text → text
Qwen2 VL 72B Instruct
The 72B variant of the latest iteration of the Qwen-VL model from Alibaba, representing nearly a year of innovation.
Input: $0.90 / Output: $0.90
image → text
Deepseek V3
A strong Mixture-of-Experts (MoE) language model from DeepSeek with 671B total parameters, of which 37B are activated for each token.
Input: $0.90 / Output: $0.90
text → text
Llama v3.3 70B Instruct
Llama 3.3 70B Instruct is the December update of Llama 3.1 70B. The model improves upon Llama 3.1 70B (released July 2024) with advances in tool calling, multilingual text support, math, and coding. It achieves industry-leading results in reasoning, math, and instruction following, and delivers performance similar to Llama 3.1 405B with significant speed and cost improvements.
Input: $0.90 / Output: $0.90
text → text
Qwen QwQ 32B Preview
The Qwen QwQ model focuses on advancing AI reasoning and showcases the power of open models to match closed frontier model performance. QwQ-32B-Preview is an experimental release, comparable to o1 and surpassing GPT-4o and Claude 3.5 Sonnet on analytical and reasoning abilities across the GPQA, AIME, MATH-500, and LiveCodeBench benchmarks.
Input: $0.90 / Output: $0.90
text → text
Llama v3.1 405B Instruct
The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models with 405B
Input: $3.00 / Output: $3.00
text → text
Deepseek R1
DeepSeek-R1 is a state-of-the-art large language model optimized with reinforcement learning and cold-start data for exceptional reasoning, math, and code performance.
Input: $8.00 / Output: $8.00
text → text
Together.ai
Llama 3.2 3B Instruct Turbo
Llama 3.2 3B Instruct is a lightweight, multilingual model from Meta. The model is designed for efficiency and offers substantial latency and cost improvements compared to larger models.
Input: $0.06 / Output: $0.06
text → text
Meta Llama 3.1 8B Instruct Turbo
An 8 billion parameter instruct-tuned, decoder-only model that handles complex language tasks with high accuracy and efficiency.
Input: $0.18 / Output: $0.18
text → text
Qwen 2.5 Coder 32B Instruct
Qwen2.5-Coder is the latest series of code-specific Qwen large language models. It has excellent capability for its size.
Input: $0.80 / Output: $0.80
text → text
Meta Llama 3.1 70B Instruct Turbo
A 70 billion parameter instruct-tuned, decoder-only model that handles complex language tasks with high accuracy and efficiency.
Input: $0.88 / Output: $0.88
text → text
Llama 3.3 70B Instruct Turbo
More powerful than Llama 3.1 70B.
Input: $0.88 / Output: $0.88
text → text
Llama 3.1 Nemotron 70B Instruct
Llama 3.1 Nemotron 70B Instruct is a large language model fine-tuned by NVIDIA to improve the helpfulness of LLM-generated responses to user queries. It scores better than the original Llama 3.1 70B on benchmarks and in the LLM Arena.
Input: $0.90 / Output: $0.90
text → text
QwQ 32B Preview
The Qwen QwQ model focuses on advancing AI reasoning and showcases the power of open models to match closed frontier model performance. QwQ 32B Preview is an experimental release, comparable to o1 and surpassing GPT-4o and Claude 3.5 Sonnet on analytical and reasoning abilities across the GPQA, AIME, MATH-500, and LiveCodeBench benchmarks.
Input: $1.20 / Output: $1.20
text → text
Deepseek V3 (FP8)
A strong Mixture-of-Experts (MoE) language model from DeepSeek with 671B total parameters, of which 37B are activated for each token.
Input: $1.25 / Output: $1.25
text → text
Meta Llama 3.1 405B Instruct Turbo
A 405 billion parameter instruct-tuned, decoder-only model that handles complex language tasks with high accuracy and efficiency.
Input: $3.50 / Output: $3.50
text → text
Deepseek R1 (FP8)
DeepSeek-R1 is a state-of-the-art large language model optimized with reinforcement learning and cold-start data for exceptional reasoning, math, and code performance.
Input: $7.00 / Output: $7.00
text → text
Google
Text Embedding 004
The Gemini API generates state-of-the-art embeddings for words, phrases, and sentences.
Input: $0.02 / Output: $0.02
text → embeddings
Gemini 1.5 Flash - 8B
Suited to high-volume, lower-intelligence tasks.
Input: $0.04 / Output: $0.15
text → text
Gemini 1.5 Flash
Our fastest multimodal model, with great performance for diverse, repetitive tasks and a 1 million token context window. Now generally available for production.
Input: $0.08 / Output: $0.30
text → text
Gemini 1.5 Pro
Gemini 1.5 Pro is a mid-size multimodal model that is optimized for a wide range of reasoning tasks.
Input: $1.25 / Output: $5.00
text → text
Groq
Llama 3.2 1B
Llama 3.2 1B is an extremely small, fast, and cheap model from Meta that is very capable for its size.
Input: $0.04 / Output: $0.04
text → text
Llama 3.2 3B
Llama 3.2 3B Instruct is a lightweight, multilingual model from Meta. The model is designed for efficiency and offers substantial latency and cost improvements compared to larger models.
Input: $0.06 / Output: $0.06
text → text
Llama 3.1 8B
Llama 3.1 8B is a flexible model from Meta that is very capable for its size.
Input: $0.05 / Output: $0.08
text → text
Llama 3.2 11B Vision (Preview)
A powerful multimodal model capable of processing both text and image inputs that supports multilingual, multi-turn conversations, tool use, and JSON mode.
Input: $0.18 / Output: $0.18
image → text
Gemma 2 9B Instruct
Gemma 2 9B Instruct is a lightweight open model from Google, built from the same research and technology used to create the Gemini models.
Input: $0.20 / Output: $0.20
text → text
Llama 3.3 70B Speculative Decoding
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out).
Input: $0.59 / Output: $0.59
text → text
Llama 3.3 70B Versatile 128k
Llama 3.3 70B Instruct is the December update of Llama 3.1 70B. The model improves upon Llama 3.1 70B (released July 2024) with advances in tool calling, multilingual text support, math, and coding. It achieves industry-leading results in reasoning, math, and instruction following, and delivers performance similar to Llama 3.1 405B with significant speed and cost improvements.
Input: $0.59 / Output: $0.79
text → text
Llama 3.2 90B Vision (Preview)
A powerful multimodal model capable of processing both text and image inputs that supports multilingual, multi-turn conversations, tool use, and JSON mode.
Input: $0.90 / Output: $0.90
image → text
Lambda Labs
Hermes 3 8B
Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research
Input: $0.03 / Output: $0.03
text → text
Hermes 3 70B
Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research
Input: $0.20 / Output: $0.20
text → text
Hermes 3 405B
Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research
Input: $0.90 / Output: $0.90
text → text
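
To compare models on cost, the input and output prices above can be combined with an expected token volume. The sketch below is only an illustration: the page does not state the pricing unit, so it assumes prices are US dollars per 1 million tokens, and the names `Pricing`, `PRICES`, and `estimate_cost` are hypothetical helpers, not part of any Oxen API. The three sample prices are copied from the listing.

```python
from dataclasses import dataclass

# ASSUMPTION: listed prices are USD per 1 million tokens.
# The listing itself does not state the unit; adjust if it differs.
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class Pricing:
    """Input and output price for one model (hypothetical helper type)."""
    input_usd: float
    output_usd: float


# A few prices copied from the listing above.
PRICES = {
    "GPT-4o mini": Pricing(0.15, 0.60),
    "Mistral Large": Pricing(2.00, 6.00),
    "Hermes 3 8B": Pricing(0.03, 0.03),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of a workload under the unit assumption above."""
    p = PRICES[model]
    total = input_tokens * p.input_usd + output_tokens * p.output_usd
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500k output tokens per model.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

Because several models charge more for output than for input, the ratio of input to output tokens in a workload can change which model is cheapest, so it is worth estimating with realistic volumes for both.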