Run Models on Your Data
Choose the right model, get to the perfect prompt, kick off the data flywheel.
Oxen makes it easy to improve your use of state-of-the-art AI.
64 models on 8 inference providers. New models added every week.

Mistral Small 3.1
Mar 2025
text → text
Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.
Input: $0.10 / Output: $0.30
Gemma 3 27B
Mar 2025
text → text
Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning.

Input: $0.20 / Output: $0.40
Gemini 2.0 Pro
Mar 2025
text → text
Google's most powerful Gemini 2.0 model, released in February 2025.

Input: $1.25 / Output: $5.00
QwQ 32B
Mar 2025
text → text
Excels in complex reasoning tasks like math and coding. Uses reinforcement learning to match larger models' performance while remaining efficient.

Input: $0.90 / Output: $0.90
QwQ 32B
Mar 2025
text → text
Reasoning-focused LLM excelling in complex tasks like math and coding. Matches larger models' performance while running efficiently on consumer hardware.
Input: $1.20 / Output: $1.20
Claude 3.7 Sonnet
Feb 2025
text → text, image → text
Hybrid reasoning model combining rapid responses with deep analytical thinking, excelling in coding and complex problem-solving with a 200K token context.
Input: $3.00 / Output: $15.00
Deepseek R1 (FP8)
Jan 2025
text → text
Efficient reasoning model using dynamic expert routing and chain-of-thought processing for complex tasks like math and coding.
Input: $7.00 / Output: $7.00
Deepseek R1
Jan 2025
text → text
Specializes in logical inference and mathematical problem-solving with chain-of-thought reasoning, using a Mixture of Experts architecture for efficient computation.

Input: $3.00 / Output: $8.00
Deepseek V3 (FP8)
Dec 2024
text → text
Efficient Mixture-of-Experts model optimized for complex reasoning, code generation, and multilingual tasks with scalable 128K context processing.
Input: $1.25 / Output: $1.25
Deepseek V3
Dec 2024
text → text
Efficient Mixture-of-Experts model with 37B active parameters per token, excelling in coding, math, and reasoning tasks while maintaining 128K token context length.

Input: $0.75 / Output: $3.00
Hermes 3 70B
Dec 2024
text → text
Advanced agentic capabilities with strong roleplaying, reasoning, and structured output generation for technical tasks.
Input: $0.20 / Output: $0.20
Hermes 3 8B
Dec 2024
text → text
Advanced agentic capabilities, roleplaying, reasoning, multi-turn conversation, long context coherence, and code generation with structured outputs.
Input: $0.03 / Output: $0.03
Llama 3.3 70B Speculative Decoding
Dec 2024
text → text
Optimized for speed via speculative decoding, excels in reasoning, coding, and complex tasks while maintaining high efficiency.

Input: $0.59 / Output: $0.59
Llama 3.3 70B Instruct
Dec 2024
text → text
Optimized for dialogue with strong reasoning, multilingual support, and efficient performance approaching larger models.

Input: $0.90 / Output: $0.90
Llama 3.3 70B Instruct Turbo
Dec 2024
text → text
Excels in question answering, reasoning, and code generation, with use cases including synthetic data creation and evaluating smaller model outputs.
Input: $0.88 / Output: $0.88
QwQ 32B Preview
Nov 2024
text → text
This experimental model excels in math, coding, and scientific reasoning, developed by Alibaba's Qwen Team to advance AI analytical capabilities.
Input: $1.20 / Output: $1.20
QwQ 32B Preview
Nov 2024
text → text
Specialized in advanced reasoning and problem-solving, excelling in mathematics and programming with a 32B parameter transformer architecture.

Input: $0.90 / Output: $0.90
Qwen 2.5 Coder 32B Instruct
Nov 2024
text → text
Specializes in code generation, reasoning, and fixing with 128K token context, open-source licensing, and local deployment capabilities.

Input: $0.90 / Output: $0.90
Qwen 2.5 Coder 32B Instruct
Nov 2024
text → text
Specializes in code generation, reasoning, and fixing across 40+ languages, matching GPT-4o's coding capabilities with 128k token context.
Input: $0.80 / Output: $0.80
Llama 3.1 8B
Oct 2024
text → text
Multilingual dialogue model optimized for tool integration and safety, with 128K context length for extended interactions.

Input: $0.05 / Output: $0.08
Llama 3.2 1B
Oct 2024
text → text
Lightweight model optimized for edge/mobile devices, excels in multilingual retrieval and summarization tasks with real-time processing and enhanced privacy.

Input: $0.04 / Output: $0.04
Ministral 8B
Oct 2024
text → text
Efficient edge model with native function calling and interleaved sliding-window attention for fast, memory-efficient processing in resource-constrained environments.
Input: $0.10 / Output: $0.10
Llama 3.1 Nemotron 70B Instruct
Oct 2024
text → text
Customized by NVIDIA to enhance helpfulness, this model excels in instruction-following tasks through human preference alignment and improved response relevance.
Input: $0.90 / Output: $0.90
Llama 3.2 11B Vision
Sep 2024
image → text
Multimodal model processing text and images for visual reasoning, captioning, and document analysis with cross-attention architecture.

Input: $0.18 / Output: $0.18
Llama 3.2 3B
Sep 2024
text → text
Efficient for mobile/edge devices, excels in text summarization, classification, and translation. Ideal for AI writing assistants and customer service applications.

Input: $0.06 / Output: $0.06
Llama 3.2 3B Instruct Turbo
Sep 2024
text → text
Optimized for multilingual instruction-following tasks, balancing efficiency and performance in dialogue, summarization, and agentic applications with 3B parameters and scalable architecture.
Input: $0.06 / Output: $0.06
Llama 3.2 90B Vision (Preview)
Sep 2024
image → text
Multimodal model for visual reasoning and image analysis, excels in coding, math, and multilingual tasks with 128k token context.

Input: $0.90 / Output: $0.90
Gemini 1.5 Flash - 8B
Sep 2024
text → text
Optimized for high-volume, cost-effective tasks with multimodal input support, excelling in transcription and long-context processing.

Input: $0.038 / Output: $0.15
Qwen2.5 72B Instruct
Sep 2024
text → text
Instruction-tuned LLM excelling in long-context processing (131K tokens), multilingual support (29+ languages), and structured data handling.

Input: $0.90 / Output: $0.90
Qwen2 VL 72B Instruct
Sep 2024
image → text
Multimodal vision-language model excelling in dynamic image resolution handling, 20+ minute video processing, multilingual text understanding in images, and device operation via visual/text inputs.

Input: $0.90 / Output: $0.90
o1 mini
Sep 2024
text → text
Optimized for coding and math with Chain-of-Thought reasoning, offering fast, cost-efficient responses for complex problem-solving.
Input: $3.00 / Output: $12.00
o1 preview
Sep 2024
text → text
Reasoning-focused LLM for complex science, math, and coding tasks, generating detailed thought processes before responses.
Input: $15.00 / Output: $60.00
Pixtral 12B
Sep 2024
text → text
Multimodal model handling text and images at native resolution with 128K context window, excelling in visual reasoning tasks like document analysis and image captioning.
Input: $0.15 / Output: $0.15
Hermes 3 405B
Aug 2024
text → text
Advanced agentic capabilities with enhanced reasoning, roleplaying, and multi-turn conversation handling. Excels in structured output and long-context coherence.
Input: $0.90 / Output: $0.90
Llama 3.1 70B Instruct
Jul 2024
text → text
Multilingual LLM excelling in question answering, reasoning, code generation, and synthetic data generation.

Input: $0.90 / Output: $0.90
Llama 3.3 70B Versatile 128k
Jul 2024
text → text
Excels in multilingual tasks, tool use, coding, and reasoning with improved accuracy and efficient performance.

Input: $0.59 / Output: $0.79
Llama 3.1 8B Instruct Turbo
Jul 2024
text → text
Excels in multilingual dialogue and long-form text processing with strong reasoning for conversational agents and coding assistance.
Input: $0.18 / Output: $0.18
Llama 3.1 8B Instruct
Jul 2024
text → text
Optimized for multilingual dialogue with 128k context length, excels in chat, text generation, and language translation.

Input: $0.20 / Output: $0.20
Llama 3.1 70B Instruct Turbo
Jul 2024
text → text
Optimized for multilingual dialogue and long-context tasks, this model excels in production-scale applications with advanced inference capabilities and a 128k token context window.
Input: $0.88 / Output: $0.88
Llama 3.1 405B Instruct
Jul 2024
text → text
Optimized for multilingual dialogue with 128k context, instruction-tuned via SFT/RLHF, and enhanced with synthetic data for safety and performance.

Input: $3.00 / Output: $3.00
Llama 3.1 405B Instruct Turbo
Jul 2024
text → text
Instruction-tuned LLM excelling in multilingual dialogue, synthetic data generation, and model distillation with 131k token context for complex tasks.
Input: $3.50 / Output: $3.50
Mistral Nemo
Jul 2024
text → text
Handles long-form content with 128k token context, excels in multilingual tasks, coding, and function calling via natural language.
Input: $0.15 / Output: $0.15
GPT 4o mini
Jul 2024
text → text, image → text
Cost-efficient, fast model with 128K context window, supporting text/vision inputs and improved multilingual performance.
Input: $0.15 / Output: $0.60
Gemma 2 9B Instruct
Jun 2024
text → text
Efficient 9B parameter model trained on diverse web, code, and math data, excelling in coding and mathematical tasks.

Input: $0.20 / Output: $0.20
Codestral 2405
May 2024
text → text
Specializes in code generation with 32k token context, excelling in completion, debugging, and optimization across 80+ languages.
Input: $0.20 / Output: $0.60
Gemini 1.5 Pro
May 2024
text → text
Multimodal LLM with 2M token context, excels in complex reasoning, coding, and multimodal Q&A across text, images, audio, and video.

Input: $1.25 / Output: $5.00
Text Embedding 004
May 2024
text → embeddings
Generates vector representations capturing semantic meaning/context for tasks like semantic search, text classification, and clustering. Multilingual support with versatile applications.

Input: $0.02 / Output: $0.02
Gemini 1.5 Flash
May 2024
text → text
Optimized for speed and efficiency, handles high-volume tasks with multimodal processing (text, images, video, audio) for summarization, chat, and data extraction.

Input: $0.075 / Output: $0.30
GPT 4o
May 2024
text → text, image → text
Multimodal LLM for real-time text, audio, and visual processing with multilingual support, emotional audio responses, and image generation.
Input: $2.50 / Output: $10.00
Mixtral 8x22B
Apr 2024
text → text
Efficient Sparse MoE architecture with 39B active parameters, excels in multilingual tasks, math, coding, and handles 64K token contexts.
Input: $2.00 / Output: $6.00
Mistral Large 2
Feb 2024
text → text
Powerful LLM with 123B parameters, excelling in multilingual tasks, coding, and reasoning, optimized for single-node inference and long-context applications.
Input: $2.00 / Output: $6.00
Text Embedding 3 - Small
Jan 2024
text → embeddings
Generates compact, efficient embeddings for NLP tasks with multilingual support, balancing performance and low latency.
Input: $0.02 / Output: $0.02
Text Embedding 3 - Large
Jan 2024
text → embeddings
Generates high-quality embeddings for complex text analysis and multilingual applications with 8,191 token context.
Input: $0.13 / Output: $0.13
Mixtral 8x7B
Dec 2023
text → text
Efficient Mixture of Experts (8 experts) with 13B active parameters, optimized for multilingual tasks and cost-performance balance.
Input: $0.70 / Output: $0.70
Mistral 7B
Sep 2023
text → text
Balanced performance in natural language and code tasks, efficiently handling longer sequences with innovative attention mechanisms.
Input: $0.25 / Output: $0.25
o3 mini
text → text
Optimized for STEM reasoning and problem-solving, excelling in complex tasks like advanced math and coding with improved cost efficiency.
Input: $1.10 / Output: $4.40
o1
text → text
Specializes in complex reasoning through chain-of-thought processing, excelling in STEM tasks like coding, math, and scientific analysis.
Input: $15.00 / Output: $60.00
GPT 4.5
text → text
Excels in natural conversation and creative tasks with improved emotional intelligence and multilingual support, prioritizing intuitive interactions over structured reasoning.
Input: $75.00 / Output: $150.00
Gemini 2.0 Flash Lite
text → text, image → text
Cost-efficient multimodal LLM for real-time tasks with 1M token input context and enhanced performance.

Input: $0.075 / Output: $0.30
Deepseek R1 Distill Llama 70B
text → text
Specializes in complex problem-solving with step-by-step reasoning, excelling in math and coding tasks through chain-of-thought processing.

Input: $0.59 / Output: $0.79
Gemini 2.5 Pro Experimental
text → text, image → text

Input: $2.50 / Output: $5.00
Ministral 3B
text → text
Optimized for edge computing with function-calling capabilities, excelling in knowledge retrieval and commonsense reasoning with 128k token context.
Input: $0.04 / Output: $0.04
Gemini 2.0 Flash
text → text, image → text
Multimodal LLM for agentic applications, handling real-time data integration and multi-step tasks with enhanced reasoning via Thinking Mode, integrating Google tools and third-party functions.

Input: $0.10 / Output: $0.40
Codestral Latest
text → text
Specializes in coding tasks with multilingual support for 80+ languages, excelling in code generation, fill-in-the-middle, and test creation with a 256K token context.
Input: $0.30 / Output: $0.90
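
The Input / Output prices in the listing can be turned into a rough cost estimate for a job. The sketch below is illustrative only: the listing does not state a pricing unit, so it assumes each price is in US dollars per one million tokens. The function name `estimate_cost`, the `PRICES` table, and the token counts are hypothetical and not part of any Oxen API.

```python
# Assumption: listed prices are USD per 1,000,000 tokens (the listing gives no unit).
TOKENS_PER_PRICE_UNIT = 1_000_000

# (input price, output price) copied from the listing above for three models.
PRICES = {
    "Mistral Small 3.1": (0.10, 0.30),
    "GPT 4o mini": (0.15, 0.60),
    "Claude 3.7 Sonnet": (3.00, 15.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one job under the unit assumption above."""
    input_price, output_price = PRICES[model]
    input_cost = input_tokens / TOKENS_PER_PRICE_UNIT * input_price
    output_cost = output_tokens / TOKENS_PER_PRICE_UNIT * output_price
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

Because output tokens are often priced higher than input tokens, the same workload can differ in cost by more than an order of magnitude between models, so it is worth running this kind of estimate before choosing one.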