Available models | Oxen.ai

Run Models on Your Data

Choose the right model, get to the perfect prompt, kick off the data flywheel.
Oxen makes it easy to improve your use of state-of-the-art AI.

88 models on 11 inference providers. New models added every week.

Anthropic AI

DeepSeek

Google

Meta

Mistral AI

Nous Research

OpenAI

Perplexity

Qwen

Claude Opus 4

Excels at deep reasoning, complex coding, and autonomous agent workflows with sustained performance, extended thinking, tool use, and memory across tasks.

Inference by Anthropic

Price (per 1M tokens)

Input: $15.00 / Output: $75.00
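
Because every price in this listing is quoted per 1M tokens, the cost of a single request follows from simple arithmetic. The sketch below is a minimal illustration, not part of Oxen's tooling: the function name `request_cost` and the token counts are hypothetical, and only the Claude Opus 4 prices come from the listing above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Prices from the Claude Opus 4 listing: $15.00 input, $75.00 output per 1M tokens.
# The token counts (2,000 in, 500 out) are made-up values for illustration.
cost = request_cost(2_000, 500, 15.00, 75.00)
print(f"Estimated cost: ${cost:.4f}")
```

Output tokens are priced higher than input tokens for most models listed here, so long responses tend to dominate the bill.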

Claude Sonnet 4

Balances intelligence with efficiency for coding, research, and automation tasks; excels in reasoning, content generation, and nuanced instruction following.

Inference by Anthropic

Price (per 1M tokens)

Input: $3.00 / Output: $15.00

Gemini 2.5 Pro Preview

Modalities: text to text, image to text

Excels at building interactive web apps, advanced code editing and agentic workflows, with native multimodality and strong video-to-code capabilities.

Inference by Google

Price (per 1M tokens)

Input: $1.25 / Output: $10.00

Qwen 3 235B-A22B

MoE model with 22B active parameters featuring dual thinking modes for complex reasoning and efficient conversation across 100+ languages.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.22 / Output: $0.88

Qwen 3 30B-A3B

MoE architecture with 3.3B active parameters, balancing efficiency with strong reasoning, multilingual capabilities, and specialized thinking mode.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

Gemini 2.5 Flash Preview

Modalities: text to text, image to text

A thinking model offering enhanced reasoning with controllable "thinking" capabilities, balancing speed, cost, and performance for developers.

Inference by Google

Price (per 1M tokens)

Input: $0.15 / Output: $3.50

o4 mini

Modalities: text to text, image to text

Optimized for fast, affordable reasoning with strong coding and visual skills, large 200k-token context, and efficient handling of complex tasks.

Inference by OpenAI

Price (per 1M tokens)

Input: $1.10 / Output: $4.40

o3

Modalities: text to text, image to text

Excels at advanced reasoning, coding, math, and visual tasks with simulated reasoning, tool use, web browsing, and image understanding integration.

Inference by OpenAI

Price (per 1M tokens)

Input: $2.00 / Output: $8.00

GPT 4.1 nano

Modalities: text to text, image to text

OpenAI's fastest, cost-effective model with full 1 million token context, optimized for classification, autocompletion, and real-time AI agent tasks.

Inference by OpenAI

Price (per 1M tokens)

Input: $0.10 / Output: $0.40

GPT 4.1 mini

Modalities: text to text, image to text

Powerful mid-sized model with GPT-4o-level performance at lower cost and latency, featuring a 1 million token context window for complex tasks.

Inference by OpenAI

Price (per 1M tokens)

Input: $0.40 / Output: $1.60

GPT 4.1

Modalities: text to text, image to text

Excels in coding and instruction following with million-token context window, enabling superior performance on complex, multi-step tasks.

Inference by OpenAI

Price (per 1M tokens)

Input: $2.00 / Output: $8.00

Llama 4 Maverick

Modalities: text to text, image to text

Multimodal model with 17B active parameters, excelling at text, image, code, and multilingual tasks. Supports 1M-token context for advanced enterprise use.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.22 / Output: $0.88

Llama 4 Scout

Modalities: text to text, image to text

Efficient multimodal MoE model with 10M-token context, excelling at multi-document analysis, codebase reasoning, and image understanding across 12 languages.

Inference by Together.ai

Price (per 1M tokens)

Input: $0.18 / Output: $0.59

Llama 4 Scout

Modalities: text to text, image to text

Multimodal model with 10M-token context, efficient MoE design, and strong performance in text, image, code, and multilingual reasoning tasks.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.15 / Output: $0.60

Llama 4 Maverick

Modalities: text to text, image to text

Multimodal model with 400B parameters, 128 experts, and 1M context; excels at multilingual text/image reasoning, coding, and enterprise-scale applications.

Inference by Together.ai

Price (per 1M tokens)

Input: $0.27 / Output: $0.85

Deepseek V3

Delivers advanced reasoning, code generation, and mathematical skills, processes long inputs efficiently, and accelerates results with innovative Mixture-of-Experts design.

Inference by DeepSeek

Price (per 1M tokens)

Input: $0.27 / Output: $1.10

Mistral Small 3.1

A lightweight, versatile 24B multimodal model handling text and images with extensive multilingual support and 128k token context window.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.10 / Output: $0.30

Gemma 3 27B

Gemma 3 has a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning.

Inference by Google

Price (per 1M tokens)

Input: $0.20 / Output: $0.40

Gemini 2.0 Pro

Google's most powerful Gemini 2.0 model, released in February 2025.

Inference by Google

Price (per 1M tokens)

Input: $1.25 / Output: $5.00

Perplexity Sonar Deep Research

Performs exhaustive, multi-step research by autonomously searching and synthesizing hundreds of sources into detailed, expert-level reports across domains.

Inference by Perplexity

Price (per 1M tokens)

Input: $5.00 / Output: $15.00

Perplexity Sonar Reasoning Pro

Premium reasoning model for complex, multi-step analysis. Delivers detailed explanations, real-time web search, and double citations for thorough answers.

Inference by Perplexity

Price (per 1M tokens)

Input: $3.00 / Output: $10.00

QwQ 32B

Reasoning-focused LLM excelling in complex tasks like math and coding. Matches larger models' performance while running efficiently on consumer hardware.

Inference by Together.ai

Price (per 1M tokens)

Input: $1.20 / Output: $1.20

Claude 3.7 Sonnet

Modalities: text to text, image to text

A hybrid reasoning model with standard and extended thinking modes, delivering twice the speed and exceptional performance in coding and problem-solving tasks.

Inference by Anthropic

Price (per 1M tokens)

Input: $3.00 / Output: $15.00

Perplexity Sonar Reasoning

Fast reasoning model with real-time web search, chain-of-thought capabilities, and citation support. Excels at complex queries with quick, accurate responses.

Inference by Perplexity

Price (per 1M tokens)

Input: $2.00 / Output: $6.00

Perplexity Sonar

Optimized for search-augmented tasks, delivering fast, accurate answers with real-time web data and detailed citations. Excels in research and fact-checking.

Inference by Perplexity

Price (per 1M tokens)

Input: $2.00 / Output: $2.00

Perplexity Sonar Pro

Excels at complex, multi-step queries with real-time web search, detailed answers, extensive citations, and customizable information retrieval.

Inference by Perplexity

Price (per 1M tokens)

Input: $5.00 / Output: $20.00

Deepseek R1

Employs a massive Mixture-of-Experts architecture and Multi-head Latent Attention to deliver advanced, polished reasoning and problem-solving across math, code, and more.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $3.00 / Output: $8.00

Deepseek R1 (FP8)

Excels at step-by-step reasoning and code generation, delivering transparent, structured answers through reinforcement learning and Mixture of Experts.

Inference by Together.ai

Price (per 1M tokens)

Input: $7.00 / Output: $7.00

Deepseek R1

An open-source reasoning model using Mixture-of-Experts architecture, delivering powerful math and code capabilities comparable to OpenAI's o1.

Inference by DeepSeek

Price (per 1M tokens)

Input: $0.55 / Output: $2.19
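
Several models in this catalog, Deepseek R1 among them, are served by more than one inference provider at different prices. The sketch below ranks the three Deepseek R1 listings above by cost for one workload. The prices are taken from those listings; the `Listing` type, the function names, and the workload size (1M input tokens, 250K output tokens) are hypothetical choices made for illustration.

```python
from typing import NamedTuple


class Listing(NamedTuple):
    provider: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


# Prices copied from the three Deepseek R1 listings above.
LISTINGS = [
    Listing("Fireworks AI", 3.00, 8.00),
    Listing("Together.ai (FP8)", 7.00, 7.00),
    Listing("DeepSeek", 0.55, 2.19),
]


def workload_cost(listing: Listing, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload on one provider."""
    total = input_tokens * listing.input_price + output_tokens * listing.output_price
    return total / 1_000_000


def rank_by_cost(listings, input_tokens: int, output_tokens: int):
    """Return the listings sorted from cheapest to most expensive."""
    return sorted(listings, key=lambda l: workload_cost(l, input_tokens, output_tokens))


# Hypothetical workload: 1M input tokens and 250K output tokens.
for listing in rank_by_cost(LISTINGS, 1_000_000, 250_000):
    cost = workload_cost(listing, 1_000_000, 250_000)
    print(f"{listing.provider}: ${cost:.2f}")
```

Price is only one axis of comparison. The listings do not state latency, throughput, or rate limits, and one of them is an FP8 variant, so a ranking like this is a starting point rather than a recommendation.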

Deepseek V3

Powers advanced reasoning, code generation, and multilingual tasks with efficient MoE architecture and enhanced multi-token prediction for faster, optimized results.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.75 / Output: $3.00

Deepseek V3 (FP8)

Excels at efficient reasoning and code generation, leveraging large-scale mixture-of-experts architecture with advanced multi-token prediction and training innovations.

Inference by Together.ai

Price (per 1M tokens)

Input: $1.25 / Output: $1.25

Hermes 3 70B

Advanced agentic capabilities with strong roleplaying, reasoning, and structured output generation for technical tasks.

Inference by Lambda Labs

Price (per 1M tokens)

Input: $0.20 / Output: $0.20

Hermes 3 8B

Advanced agentic capabilities, roleplaying, reasoning, multi-turn conversation, long context coherence, and code generation with structured outputs.

Inference by Lambda Labs

Price (per 1M tokens)

Input: $0.03 / Output: $0.03

Llama 3.3 70B Speculative Decoding

Optimized for speed via speculative decoding, excels in reasoning, coding, and complex tasks while maintaining high efficiency.

Inference by Groq

Price (per 1M tokens)

Input: $0.59 / Output: $0.59

Llama 3.3 70B Instruct

Optimized for dialogue with strong reasoning, multilingual support, and efficient performance approaching larger models.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

Llama 3.3 70B Instruct Turbo

Excels in question answering, reasoning, and code generation, with use cases including synthetic data creation and evaluating smaller model outputs.

Inference by Together.ai

Price (per 1M tokens)

Input: $0.88 / Output: $0.88

QwQ 32B Preview

This experimental model excels in math, coding, and scientific reasoning, developed by Alibaba's Qwen Team to advance AI analytical capabilities.

Inference by Together.ai

Price (per 1M tokens)

Input: $1.20 / Output: $1.20

QwQ 32B Preview

Specialized in advanced reasoning and problem-solving, excelling in mathematics and programming with a 32B parameter transformer architecture.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

Qwen 2.5 Coder 32B Instruct

Specializes in code generation, reasoning, and fixing with 128K token context, open-source licensing, and local deployment capabilities.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

Qwen 2.5 Coder 32B Instruct

Specializes in code generation, reasoning, and fixing across 40+ languages, matching GPT-4o's coding capabilities with 128k token context.

Inference by Together.ai

Price (per 1M tokens)

Input: $0.80 / Output: $0.80

Llama 3.1 8B

Multilingual dialogue model optimized for tool integration and safety, with 128K context length for extended interactions.

Inference by Groq

Price (per 1M tokens)

Input: $0.05 / Output: $0.08

Llama 3.2 1B

Lightweight model optimized for edge/mobile devices, excels in multilingual retrieval and summarization tasks with real-time processing and enhanced privacy.

Inference by Groq

Price (per 1M tokens)

Input: $0.04 / Output: $0.04

Claude 3.5 Sonnet

Powerful AI with exceptional coding abilities, twice the speed of previous versions, and advanced reasoning for complex software development tasks.

Inference by Anthropic

Price (per 1M tokens)

Input: $3.00 / Output: $15.00

Claude 3.5 Haiku

Anthropic's fastest model offering advanced coding, tool use, and reasoning capabilities with rapid response times for real-time applications and personalized experiences.

Inference by Anthropic

Price (per 1M tokens)

Input: $0.80 / Output: $4.00

Ministral 8B

Efficient edge model with native function calling and interleaved sliding-window attention for fast, memory-efficient processing in resource-constrained environments.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.10 / Output: $0.10

Llama 3.1 Nemotron 70B Instruct

Customized by NVIDIA to enhance helpfulness, this model excels in instruction-following tasks through human preference alignment and improved response relevance.

Inference by Together.ai

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

Qwen2.5 1.5B Instruct

Qwen2.5 is the latest series of Qwen large language models, spanning base and instruction-tuned models from 0.5 to 72 billion parameters.

Inference by Bytez

Price (per second)

Llama 3.2 11B Vision

Multimodal model processing text and images for visual reasoning, captioning, and document analysis with cross-attention architecture.

Inference by Groq

Price (per 1M tokens)

Input: $0.18 / Output: $0.18

Llama 3.2 3B

Efficient for mobile/edge devices, excels in text summarization, classification, and translation. Ideal for AI writing assistants and customer service applications.

Inference by Groq

Price (per 1M tokens)

Input: $0.06 / Output: $0.06

Llama 3.2 3B Instruct Turbo

Optimized for multilingual instruction-following tasks, balancing efficiency and performance in dialogue, summarization, and agentic applications with 3B parameters and scalable architecture.

Inference by Together.ai

Price (per 1M tokens)

Input: $0.06 / Output: $0.06

Llama 3.2 90B Vision (Preview)

Multimodal model for visual reasoning and image analysis, excels in coding, math, and multilingual tasks with 128k token context.

Inference by Groq

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

Gemini 1.5 Flash - 8B

Optimized for high-volume, cost-effective tasks with multimodal input support, excelling in transcription and long-context processing.

Inference by Google

Price (per 1M tokens)

Input: $0.038 / Output: $0.15

Qwen2.5 72B Instruct

Instruction-tuned LLM excelling in long-context processing (131K tokens), multilingual support (29+ languages), and structured data handling.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

o1 mini

Optimized for coding and math with Chain-of-Thought reasoning, offering fast, cost-efficient responses for complex problem-solving.

Inference by OpenAI

Price (per 1M tokens)

Input: $3.00 / Output: $12.00

o1 preview

Reasoning-focused LLM for complex science, math, and coding tasks, generating detailed thought processes before responses.

Inference by OpenAI

Price (per 1M tokens)

Input: $15.00 / Output: $60.00

Pixtral 12B

Multimodal model handling text and images at native resolution with 128K context window, excelling in visual reasoning tasks like document analysis and image captioning.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.15 / Output: $0.15

Hermes 3 405B

Advanced agentic capabilities with enhanced reasoning, roleplaying, and multi-turn conversation handling. Excels in structured output and long-context coherence.

Inference by Lambda Labs

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

Llama 3.1 70B Instruct

Multilingual LLM excelling in question answering, reasoning, code generation, and synthetic data generation.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.90 / Output: $0.90

Llama 3.3 70B Versatile 128k

Excels in multilingual tasks, tool use, coding, and reasoning with improved accuracy and efficient performance.

Inference by Groq

Price (per 1M tokens)

Input: $0.59 / Output: $0.79

Llama 3.1 8B Instruct Turbo

Excels in multilingual dialogue and long-form text processing with strong reasoning for conversational agents and coding assistance.

Inference by Together.ai

Price (per 1M tokens)

Input: $0.18 / Output: $0.18

Llama 3.1 8B Instruct

Optimized for multilingual dialogue with 128k context length, excels in chat, text generation, and language translation.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $0.20 / Output: $0.20

Llama 3.1 70B Instruct Turbo

Optimized for multilingual dialogue and long-context tasks, this model excels in production-scale applications with advanced inference capabilities and a 128k token context window.

Inference by Together.ai

Price (per 1M tokens)

Input: $0.88 / Output: $0.88

Llama 3.1 405B Instruct

Optimized for multilingual dialogue with 128k context, instruction-tuned via SFT/RLHF, and enhanced with synthetic data for safety and performance.

Inference by Fireworks AI

Price (per 1M tokens)

Input: $3.00 / Output: $3.00

Llama 3.1 405B Instruct Turbo

Instruction-tuned LLM excelling in multilingual dialogue, synthetic data generation, and model distillation with 131k token context for complex tasks.

Inference by Together.ai

Price (per 1M tokens)

Input: $3.50 / Output: $3.50

Mistral Nemo

Handles long-form content with 128k token context, excels in multilingual tasks, coding, and function calling via natural language.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.15 / Output: $0.15

GPT 4o mini

Modalities: text to text, image to text

Cost-efficient, fast model with 128K context window, supporting text/vision inputs and improved multilingual performance.

Inference by OpenAI

Price (per 1M tokens)

Input: $0.15 / Output: $0.60

Gemma 2 9B Instruct

Efficient 9B parameter model trained on diverse web, code, and math data, excelling in coding and mathematical tasks.

Inference by Groq

Price (per 1M tokens)

Input: $0.20 / Output: $0.20

Codestral 2405

Specializes in code generation with 32k token context, excelling in completion, debugging, and optimization across 80+ languages.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.20 / Output: $0.60

Gemini 1.5 Pro

Multimodal LLM with 2M token context, excels in complex reasoning, coding, and multimodal Q&A across text, images, audio, and video.

Inference by Google

Price (per 1M tokens)

Input: $1.25 / Output: $5.00

Text Embedding 004

Generates vector representations capturing semantic meaning/context for tasks like semantic search, text classification, and clustering. Multilingual support with versatile applications.

Inference by Google

Price (per 1M tokens)

Input: $0.02 / Output: $0.02

Gemini 1.5 Flash

Optimized for speed and efficiency, handles high-volume tasks with multimodal processing (text, images, video, audio) for summarization, chat, and data extraction.

Inference by Google

Price (per 1M tokens)

Input: $0.075 / Output: $0.30

GPT 4o

Modalities: text to text, image to text

Multimodal LLM for real-time text, audio, and visual processing with multilingual support, emotional audio responses, and image generation.

Inference by OpenAI

Price (per 1M tokens)

Input: $2.50 / Output: $10.00

Mixtral 8x22B

Efficient Sparse MoE architecture with 39B active parameters, excels in multilingual tasks, math, coding, and handles 64K token contexts.

Inference by Mistral AI

Price (per 1M tokens)

Input: $2.00 / Output: $6.00

Mistral Large 2

Powerful LLM with 123B parameters, excelling in multilingual tasks, coding, and reasoning, optimized for single-node inference and long-context applications.

Inference by Mistral AI

Price (per 1M tokens)

Input: $2.00 / Output: $6.00

Text Embedding 3 - Small

Generates compact, efficient embeddings for NLP tasks with multilingual support, balancing performance and low latency.

Inference by OpenAI

Price (per 1M tokens)

Input: $0.02 / Output: $0.02

Text Embedding 3 - Large

Generates high-quality embeddings for complex text analysis and multilingual applications with 8,191 token context.

Inference by OpenAI

Price (per 1M tokens)

Input: $0.13 / Output: $0.13

Mixtral 8x7B

Efficient Mixture of Experts (8 experts) with 13B active parameters, optimized for multilingual tasks and cost-performance balance.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.70 / Output: $0.70

DALL-E 3

Translates nuanced text prompts into detailed, accurate images with automatic prompt rewriting, multiple aspect ratios, and ChatGPT integration for creative workflows.

Inference by OpenAI

Price (per second)

Mistral 7B

Balanced performance in natural language and code tasks, efficiently handling longer sequences with innovative attention mechanisms.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.25 / Output: $0.25

o3 mini

Optimized for STEM reasoning and problem-solving, excelling in complex tasks like advanced math and coding with improved cost efficiency.

Inference by OpenAI

Price (per 1M tokens)

Input: $1.10 / Output: $4.40

Deepseek R1 Distill Llama 70B

Delivers strong mathematical and coding abilities, matching the performance of larger models while using efficient distillation and multilingual support.

Inference by Groq

Price (per 1M tokens)

Input: $0.59 / Output: $0.79

o1

Specializes in complex reasoning through chain-of-thought processing, excelling in STEM tasks like coding, math, and scientific analysis.

Inference by OpenAI

Price (per 1M tokens)

Input: $15.00 / Output: $60.00

GPT 4.5

Excels in natural conversation and creative tasks with improved emotional intelligence and multilingual support, prioritizing intuitive interactions over structured reasoning.

Inference by OpenAI

Price (per 1M tokens)

Input: $75.00 / Output: $150.00

Gemini 2.0 Flash Lite

Modalities: text to text, image to text

Cost-efficient multimodal LLM for real-time tasks with 1M token input context.

Inference by Google

Price (per 1M tokens)

Input: $0.075 / Output: $0.30

Ministral 3B

Optimized for edge computing with function-calling capabilities, excelling in knowledge retrieval and commonsense reasoning with 128k token context.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.04 / Output: $0.04

Gemini 2.0 Flash

Modalities: text to text, image to text

Multimodal LLM for agentic applications, handling real-time data integration and multi-step tasks with enhanced reasoning via Thinking Mode, integrating Google tools and third-party functions.

Inference by Google

Price (per 1M tokens)

Input: $0.10 / Output: $0.40

Codestral Latest

Specializes in coding tasks with multilingual support for 80+ languages, excelling in code generation, fill-in-the-middle, and test creation with a 256K token context.

Inference by Mistral AI

Price (per 1M tokens)

Input: $0.30 / Output: $0.90

Gemini 2.5 Pro Experimental

Modalities: text to text, image to text

Handles complex reasoning and coding tasks, generates and interprets multimodal content, and supports interactive visualizations with an extensive 1M token context.

Inference by Google

Price (per 1M tokens)

Input: $2.50 / Output: $5.00