Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or evaluating performance metrics, Oxen Evaluations simplifies the process, letting you quickly run prompts through an entire dataset.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
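As a rough illustration of what running a prompt through a dataset involves, the sketch below fills `{column}` placeholders (such as `{title}` in the example prompts on this page) with the values of each row, sends the rendered prompt to a model, and stores the reply in a new column. It is not the Oxen implementation: `render_prompt`, `run_evaluation`, `to_csv`, `fake_model`, the `prediction` column name, and the sample rows are all hypothetical names chosen for this example, and the model call is replaced by a stand-in function.

```python
import csv
import io
from typing import Callable, Dict, List


def render_prompt(template: str, row: Dict[str, str]) -> str:
    """Fill each {column} placeholder with the matching value from one row."""
    return template.format(**row)


def run_evaluation(
    template: str,
    rows: List[Dict[str, str]],
    model: Callable[[str], str],
    output_column: str = "prediction",  # hypothetical column name
) -> List[Dict[str, str]]:
    """Render the prompt for every row and store the model's reply in a new column."""
    return [{**row, output_column: model(render_prompt(template, row))} for row in rows]


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize the resulting dataset as CSV text, ready to write to a new file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; it only reports the prompt length."""
    return f"{len(prompt)} characters received"


# Hypothetical two-row dataset with a single "title" column.
rows = [{"title": "Plumber"}, {"title": "General contractor"}]
results = run_evaluation('How many "a" are there in "{title}"', rows, fake_model)
print(to_csv(results))
```

In this sketch a row that lacks a column named in the template raises a `KeyError`, which surfaces a mismatch between the prompt and the dataset before any model call is made.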
Evaluation 6bfdb7d2-70da-4840-b33f-dc2bd0325d43
Model: OpenAI/DALL-E 3 (text → image)
Run by: Mathias (mathi), 3 days ago
Prompt: Generate a similar image: {path}
Status: completed, 5 row sample, 0 tokens, $0.2000, 1 iteration

Evaluation 7db6c1d7-2174-460e-b7cd-f3b769421f8c
Model: OpenAI/GPT 4o mini (text → text)
Run by: Mathias (mathi), 5 days ago
Prompt: I am to pass a job desc. give me a 1 if its a plumber, a 2 if contractor, {title}
Status: completed, 5 row sample, 579 tokens, $0.0003, 1 iteration

Evaluation 69fa58a8-2a9c-48e0-97ff-bbc3a147bfc6
Model: OpenAI/GPT 4o mini (image → text)
Run by: Mathias (mathi), 5 days ago
Prompt: what is this : {path}
Status: completed, 5 row sample, 42811 tokens, $0.0065, 1 iteration

Evaluation cbe00a1b-3e9b-4523-add6-0da9512fda87
Model: OpenAI/GPT 4o mini (text → text)
Run by: Mathias (mathi), 5 days ago
Prompt: How many "a" in the title: {title}
Status: completed, 5 row sample, 197 tokens, $0.0001, 1 iteration

Evaluation defc7aa6-7ed9-4b87-82e5-fa4b4b3d72bd
Model: OpenAI/GPT 4o mini (text → text)
Run by: Mathias (mathi), 5 days ago
Prompt: How many "a" are there in "{title}"
Status: completed, 5 row sample, 199 tokens, $0.0001, 1 iteration

Evaluation 26be7b57-59cf-45cd-af9c-31a483d43dfc
Model: OpenAI/GPT 4o mini (text → text)
Run by: Mathias (mathi), 5 days ago
Prompt: How many "a" are in the question: {title}
Status: completed, 5 row sample, 204 tokens, $0.0001, 1 iteration

Evaluation 60493aa5-0872-4930-92f5-8f712dfe1b01
Model: OpenAI/DALL-E 3 (text → image)
Run by: Mathias (mathi), 1 week ago
Prompt: Generate an image of an ox
Status: completed, 5 row sample, 0 tokens, $0.2000, 1 iteration
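
Each run listed above reports its cost for a 5 row sample, which gives a rough way to budget a run over a whole dataset before starting it. The sketch below scales the sample cost by row count. It assumes cost grows linearly with the number of rows, which only approximates real usage because rows differ in token count; the function name `estimate_full_run_cost` and the dataset sizes of 10,000 and 100 rows are hypothetical values chosen for this example.

```python
def estimate_full_run_cost(sample_cost: float, sample_rows: int, total_rows: int) -> float:
    """Extrapolate the cost of a sample run to a full dataset.

    Assumes every row costs about the same as the average sampled row.
    """
    if sample_rows <= 0:
        raise ValueError("sample_rows must be positive")
    return sample_cost / sample_rows * total_rows


# Figures taken from the listing: $0.0003 and $0.2000 for 5 row samples.
# The total row counts are hypothetical.
text_estimate = estimate_full_run_cost(0.0003, 5, 10_000)
image_estimate = estimate_full_run_cost(0.2000, 5, 100)
print(f"text run: ${text_estimate:.4f}, image run: ${image_estimate:.4f}")
```

Treat the result as an order-of-magnitude figure: a dataset whose rows are much longer than the sampled ones will cost more than the estimate suggests.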