Run Models on Your Data

Choose the right model, refine your prompt, and kick off the data flywheel.
Oxen makes it easy to improve your use of state-of-the-art AI.

64 models across 8 inference providers. New models added every week.

Mistral AI: Mistral Small 3.1
Mar 2025
text → text
Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.
Inference by Mistral AI
Input: $0.10 / Output: $0.30
Google: Gemma 3 27B
Mar 2025
text → text
Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning.
Inference by Google
Input: $0.20 / Output: $0.40
Google: Gemini 2.0 Pro
Mar 2025
text → text
Google's most powerful Gemini 2.0 model, released in February 2025.
Inference by Google
Input: $1.25 / Output: $5.00
Qwen: QwQ 32B
Mar 2025
text → text
Excels in complex reasoning tasks like math and coding. Uses reinforcement learning to match larger models' performance while remaining efficient.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Qwen: QwQ 32B
Mar 2025
text → text
Reasoning-focused LLM excelling in complex tasks like math and coding. Matches larger models' performance while running efficiently on consumer hardware.
Inference by Together.ai
Input: $1.20 / Output: $1.20
Anthropic AI: Claude 3.7 Sonnet
Feb 2025
text → text, image → text
Hybrid reasoning model combining rapid responses with deep analytical thinking, excelling in coding and complex problem-solving with 200K token context.
Inference by Anthropic
Input: $3.00 / Output: $15.00
DeepSeek: DeepSeek R1 (FP8)
Jan 2025
text → text
Efficient reasoning model using dynamic expert routing and chain-of-thought processing for complex tasks like math and coding.
Inference by Together.ai
Input: $7.00 / Output: $7.00
DeepSeek: DeepSeek R1
Jan 2025
text → text
Specializes in logical inference and mathematical problem-solving with chain-of-thought reasoning, using a Mixture of Experts architecture for efficient computation.
Inference by Fireworks AI
Input: $3.00 / Output: $8.00
DeepSeek: DeepSeek V3 (FP8)
Dec 2024
text → text
Efficient Mixture-of-Experts model optimized for complex reasoning, code generation, and multilingual tasks with scalable 128K context processing.
Inference by Together.ai
Input: $1.25 / Output: $1.25
DeepSeek: DeepSeek V3
Dec 2024
text → text
Efficient Mixture-of-Experts model with 37B active parameters per token, excelling in coding, math, and reasoning tasks while maintaining 128K token context length.
Inference by Fireworks AI
Input: $0.75 / Output: $3.00
Nous Research: Hermes 3 70B
Dec 2024
text → text
Advanced agentic capabilities with strong roleplaying, reasoning, and structured output generation for technical tasks.
Inference by Lambda Labs
Input: $0.20 / Output: $0.20
Nous Research: Hermes 3 8B
Dec 2024
text → text
Advanced agentic capabilities, roleplaying, reasoning, multi-turn conversation, long context coherence, and code generation with structured outputs.
Inference by Lambda Labs
Input: $0.03 / Output: $0.03
Meta: Llama 3.3 70B Speculative Decoding
Dec 2024
text → text
Optimized for speed via speculative decoding, excels in reasoning, coding, and complex tasks while maintaining high efficiency.
Inference by Groq
Input: $0.59 / Output: $0.59
Meta: Llama 3.3 70B Instruct
Dec 2024
text → text
Optimized for dialogue with strong reasoning, multilingual support, and efficient performance approaching larger models.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Meta: Llama 3.3 70B Instruct Turbo
Dec 2024
text → text
Excels in question answering, reasoning, and code generation, with use cases including synthetic data creation and evaluating smaller model outputs.
Inference by Together.ai
Input: $0.88 / Output: $0.88
Qwen: QwQ 32B Preview
Nov 2024
text → text
This experimental model excels in math, coding, and scientific reasoning, developed by Alibaba's Qwen Team to advance AI analytical capabilities.
Inference by Together.ai
Input: $1.20 / Output: $1.20
Qwen: QwQ 32B Preview
Nov 2024
text → text
Specialized in advanced reasoning and problem-solving, excelling in mathematics and programming with a 32B parameter transformer architecture.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Qwen: Qwen 2.5 Coder 32B Instruct
Nov 2024
text → text
Specializes in code generation, reasoning, and fixing with 128K token context, open-source licensing, and local deployment capabilities.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Qwen: Qwen 2.5 Coder 32B Instruct
Nov 2024
text → text
Specializes in code generation, reasoning, and fixing across 40+ languages, matching GPT-4o's coding capabilities with 128k token context.
Inference by Together.ai
Input: $0.80 / Output: $0.80
Meta: Llama 3.1 8B
Oct 2024
text → text
Multilingual dialogue model optimized for tool integration and safety, with 128K context length for extended interactions.
Inference by Groq
Input: $0.05 / Output: $0.08
Meta: Llama 3.2 1B
Oct 2024
text → text
Lightweight model optimized for edge/mobile devices, excels in multilingual retrieval and summarization tasks with real-time processing and enhanced privacy.
Inference by Groq
Input: $0.04 / Output: $0.04
Mistral AI: Ministral 8B
Oct 2024
text → text
Efficient edge model with native function calling and interleaved sliding-window attention for fast, memory-efficient processing in resource-constrained environments.
Inference by Mistral AI
Input: $0.10 / Output: $0.10
Meta: Llama 3.1 Nemotron 70B Instruct
Oct 2024
text → text
Customized by NVIDIA to enhance helpfulness, this model excels in instruction-following tasks through human preference alignment and improved response relevance.
Inference by Together.ai
Input: $0.90 / Output: $0.90
Meta: Llama 3.2 11B Vision
Sep 2024
image → text
Multimodal model processing text and images for visual reasoning, captioning, and document analysis with cross-attention architecture.
Inference by Groq
Input: $0.18 / Output: $0.18
Meta: Llama 3.2 3B
Sep 2024
text → text
Efficient for mobile/edge devices, excels in text summarization, classification, and translation. Ideal for AI writing assistants and customer service applications.
Inference by Groq
Input: $0.06 / Output: $0.06
Meta: Llama 3.2 3B Instruct Turbo
Sep 2024
text → text
Optimized for multilingual instruction-following tasks, balancing efficiency and performance in dialogue, summarization, and agentic applications with 3B parameters and scalable architecture.
Inference by Together.ai
Input: $0.06 / Output: $0.06
Meta: Llama 3.2 90B Vision (Preview)
Sep 2024
image → text
Multimodal model for visual reasoning and image analysis, excels in coding, math, and multilingual tasks with 128k token context.
Inference by Groq
Input: $0.90 / Output: $0.90
Google: Gemini 1.5 Flash - 8B
Sep 2024
text → text
Optimized for high-volume, cost-effective tasks with multimodal input support, excelling in transcription and long-context processing.
Inference by Google
Input: $0.038 / Output: $0.15
Qwen: Qwen2.5 72B Instruct
Sep 2024
text → text
Instruction-tuned LLM excelling in long-context processing (131K tokens), multilingual support (29+ languages), and structured data handling.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Qwen: Qwen2 VL 72B Instruct
Sep 2024
image → text
Multimodal vision-language model excelling in dynamic image resolution handling, 20+ minute video processing, multilingual text understanding in images, and device operation via visual/text inputs.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
OpenAI: o1 mini
Sep 2024
text → text
Optimized for coding and math with Chain-of-Thought reasoning, offering fast, cost-efficient responses for complex problem-solving.
Inference by OpenAI
Input: $3.00 / Output: $12.00
OpenAI: o1 preview
Sep 2024
text → text
Reasoning-focused LLM for complex science, math, and coding tasks, generating detailed thought processes before responses.
Inference by OpenAI
Input: $15.00 / Output: $60.00
Mistral AI: Pixtral 12B
Sep 2024
text → text
Multimodal model handling text and images at native resolution with 128K context window, excelling in visual reasoning tasks like document analysis and image captioning.
Inference by Mistral AI
Input: $0.15 / Output: $0.15
Nous Research: Hermes 3 405B
Aug 2024
text → text
Advanced agentic capabilities with enhanced reasoning, roleplaying, and multi-turn conversation handling. Excels in structured output and long-context coherence.
Inference by Lambda Labs
Input: $0.90 / Output: $0.90
Meta: Llama 3.1 70B Instruct
Jul 2024
text → text
Multilingual LLM excelling in question answering, reasoning, code generation, and synthetic data generation.
Inference by Fireworks AI
Input: $0.90 / Output: $0.90
Meta: Llama 3.3 70B Versatile 128k
Jul 2024
text → text
Excels in multilingual tasks, tool use, coding, and reasoning with improved accuracy and efficient performance.
Inference by Groq
Input: $0.59 / Output: $0.79
Meta: Llama 3.1 8B Instruct Turbo
Jul 2024
text → text
Excels in multilingual dialogue and long-form text processing with strong reasoning for conversational agents and coding assistance.
Inference by Together.ai
Input: $0.18 / Output: $0.18
Meta: Llama 3.1 8B Instruct
Jul 2024
text → text
Optimized for multilingual dialogue with 128k context length, excels in chat, text generation, and language translation.
Inference by Fireworks AI
Input: $0.20 / Output: $0.20
Meta: Llama 3.1 70B Instruct Turbo
Jul 2024
text → text
Optimized for multilingual dialogue and long-context tasks, this model excels in production-scale applications with advanced inference capabilities and a 128k token context window.
Inference by Together.ai
Input: $0.88 / Output: $0.88
Meta: Llama 3.1 405B Instruct
Jul 2024
text → text
Optimized for multilingual dialogue with 128k context, instruction-tuned via SFT/RLHF, and enhanced with synthetic data for safety and performance.
Inference by Fireworks AI
Input: $3.00 / Output: $3.00
Meta: Llama 3.1 405B Instruct Turbo
Jul 2024
text → text
Instruction-tuned LLM excelling in multilingual dialogue, synthetic data generation, and model distillation with 131k token context for complex tasks.
Inference by Together.ai
Input: $3.50 / Output: $3.50
Mistral AI: Mistral Nemo
Jul 2024
text → text
Handles long-form content with 128k token context, excels in multilingual tasks, coding, and function calling via natural language.
Inference by Mistral AI
Input: $0.15 / Output: $0.15
OpenAI: GPT 4o mini
Jul 2024
text → text, image → text
Cost-efficient, fast model with 128K context window, supporting text/vision inputs and improved multilingual performance.
Inference by OpenAI
Input: $0.15 / Output: $0.60
Google: Gemma 2 9B Instruct
Jun 2024
text → text
Efficient 9B parameter model trained on diverse web, code, and math data, excelling in coding and mathematical tasks.
Inference by Groq
Input: $0.20 / Output: $0.20
Mistral AI: Codestral 2405
May 2024
text → text
Specializes in code generation with 32k token context, excelling in completion, debugging, and optimization across 80+ languages.
Inference by Mistral AI
Input: $0.20 / Output: $0.60
Google: Gemini 1.5 Pro
May 2024
text → text
Multimodal LLM with 2M token context, excels in complex reasoning, coding, and multimodal Q&A across text, images, audio, and video.
Inference by Google
Input: $1.25 / Output: $5.00
Google: Text Embedding 004
May 2024
text → embeddings
Generates vector representations capturing semantic meaning/context for tasks like semantic search, text classification, and clustering. Multilingual support with versatile applications.
Inference by Google
Input: $0.02 / Output: $0.02
Google: Gemini 1.5 Flash
May 2024
text → text
Optimized for speed and efficiency, handles high-volume tasks with multimodal processing (text, images, video, audio) for summarization, chat, and data extraction.
Inference by Google
Input: $0.075 / Output: $0.30
OpenAI: GPT 4o
May 2024
text → text, image → text
Multimodal LLM for real-time text, audio, and visual processing with multilingual support, emotional audio responses, and image generation.
Inference by OpenAI
Input: $2.50 / Output: $10.00
Mistral AI: Mixtral 8x22B
Apr 2024
text → text
Efficient Sparse MoE architecture with 39B active parameters, excels in multilingual tasks, math, coding, and handles 64K token contexts.
Inference by Mistral AI
Input: $2.00 / Output: $6.00
Mistral AI: Mistral Large 2
Feb 2024
text → text
Powerful LLM with 123B parameters, excelling in multilingual tasks, coding, and reasoning, optimized for single-node inference and long-context applications.
Inference by Mistral AI
Input: $2.00 / Output: $6.00
OpenAI: Text Embedding 3 - Small
Jan 2024
text → embeddings
Generates compact, efficient embeddings for NLP tasks with multilingual support, balancing performance and low latency.
Inference by OpenAI
Input: $0.02 / Output: $0.02
OpenAI: Text Embedding 3 - Large
Jan 2024
text → embeddings
Generates high-quality embeddings for complex text analysis and multilingual applications with 8,191 token context.
Inference by OpenAI
Input: $0.13 / Output: $0.13
Mistral AI: Mixtral 8x7B
Dec 2023
text → text
Efficient Mixture of Experts (8 experts) with 13B active parameters, optimized for multilingual tasks and cost-performance balance.
Inference by Mistral AI
Input: $0.70 / Output: $0.70
Mistral AI: Mistral 7B
Sep 2023
text → text
Balanced performance in natural language and code tasks, efficiently handling longer sequences with innovative attention mechanisms.
Inference by Mistral AI
Input: $0.25 / Output: $0.25
OpenAI: o3 mini
text → text
Optimized for STEM reasoning and problem-solving, excelling in complex tasks like advanced math and coding with improved cost efficiency.
Inference by OpenAI
Input: $1.10 / Output: $4.40
OpenAI: o1
text → text
Specializes in complex reasoning through chain-of-thought processing, excelling in STEM tasks like coding, math, and scientific analysis.
Inference by OpenAI
Input: $15.00 / Output: $60.00
OpenAI: GPT 4.5
text → text
Excels in natural conversation and creative tasks with improved emotional intelligence and multilingual support, prioritizing intuitive interactions over structured reasoning.
Inference by OpenAI
Input: $75.00 / Output: $150.00
Google: Gemini 2.0 Flash Lite
text → text, image → text
Cost-efficient multimodal LLM for real-time tasks with 1M token input context and enhanced performance.
Inference by Google
Input: $0.075 / Output: $0.30
DeepSeek: DeepSeek R1 Distill Llama 70B
text → text
Specializes in complex problem-solving with step-by-step reasoning, excelling in math and coding tasks through chain-of-thought processing.
Inference by Groq
Input: $0.59 / Output: $0.79
Google: Gemini 2.5 Pro Experimental
text → text, image → text
Inference by Google
Input: $2.50 / Output: $5.00
Mistral AI: Ministral 3B
text → text
Optimized for edge computing with function-calling capabilities, excelling in knowledge retrieval and commonsense reasoning with 128k token context.
Inference by Mistral AI
Input: $0.04 / Output: $0.04
Google: Gemini 2.0 Flash
text → text, image → text
Multimodal LLM for agentic applications, handling real-time data integration and multi-step tasks with enhanced reasoning via Thinking Mode, integrating Google tools and third-party functions.
Inference by Google
Input: $0.10 / Output: $0.40
Mistral AI: Codestral Latest
text → text
Specializes in coding tasks with multilingual support for 80+ languages, excelling in code generation, fill-in-the-middle, and test creation with a 256K token context.
Inference by Mistral AI
Input: $0.30 / Output: $0.90
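
The Input/Output prices in this listing can be turned into a rough cost comparison for a given workload. The sketch below is illustrative only: the page does not state the billing unit, so the code assumes prices are in USD per one million tokens, and the function name `estimate_cost` and the example token counts are hypothetical.

```python
from decimal import Decimal

# ASSUMPTION: listed prices are USD per 1,000,000 tokens.
# The listing itself does not state the unit.
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Estimate the USD cost of a workload (hypothetical helper).

    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price) / TOKENS_PER_PRICE_UNIT
    output_cost = Decimal(output_tokens) * Decimal(output_price) / TOKENS_PER_PRICE_UNIT
    return input_cost + output_cost


# (input price, output price) copied from three entries in the listing.
PRICES = {
    "Mistral Small 3.1": ("0.10", "0.30"),
    "GPT 4o mini": ("0.15", "0.60"),
    "Claude 3.7 Sonnet": ("3.00", "15.00"),
}

if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens (made-up numbers).
    for name, (input_price, output_price) in PRICES.items():
        cost = estimate_cost(2_000_000, 500_000, input_price, output_price)
        print(f"{name}: ${cost:.2f}")
```

Under the per-million-token assumption, input and output tokens are billed separately, so models with a large gap between the two prices are most sensitive to how much text the workload generates.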