Evaluations
Run models against your data
Introducing Evaluations, a feature that lets you test and compare AI models against your own datasets. Whether you're fine-tuning models or benchmarking performance, Oxen Evaluations make it quick and easy to run a prompt over every row of a dataset. Once you're happy with the results, you can write the output dataset to a new file, to another branch, or directly as a new commit.
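Under the hood, an evaluation run fills a prompt template with values from each row of your dataset, sends the prompt to the selected model, and collects the responses as a new column. The sketch below is a minimal, illustrative version of that loop using the OpenAI Python client; it is not the Evaluations implementation, and the dataset path, column names, and output file are hypothetical.

```python
# Minimal sketch of what an evaluation run does: fill a prompt
# template from each dataset row, query a model, collect outputs.
# Assumes a CSV with a "description" column (hypothetical) and an
# OPENAI_API_KEY in the environment.
import pandas as pd
from openai import OpenAI

client = OpenAI()
PROMPT_TEMPLATE = "Translate the following description into spanish:\n\n{description}"

df = pd.read_csv("products.csv")  # hypothetical input dataset

outputs = []
for _, row in df.iterrows():
    # {description} is replaced by the row's "description" value,
    # mirroring the {column} placeholders in the example runs below.
    prompt = PROMPT_TEMPLATE.format(description=row["description"])
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    outputs.append(response.choices[0].message.content)

# Append the model output as a new column and save the result,
# ready to be written to a new file, branch, or commit.
df["translation"] = outputs
df.to_csv("products_translated.csv", index=False)
```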
Here are a few example evaluation runs:

| Model | Type | Prompt | Status | Rows | Tokens | Cost |
|-------|------|--------|--------|------|--------|------|
| OpenAI/GPT 4o mini | text → text | Translate the following description into spanish: {description} | completed in 00:00:03 | 1 (sample) | 244 | $0.0001 |
| OpenAI/GPT 4o mini | image → text | Write a description for the following image {path} | completed in 00:00:17 | 5 | 2,698 | $0.0006 |
| OpenAI/GPT 4o mini | text → text | Apply the following prompt and provide the response in Spanish: {prompt} | running (33.5%) | 653 / 1,949 | 298,098 | $0.1269 |
| Google/Gemini 2.0 Flash | image → text | Describe the image defined in the following path: {path} | completed in 00:00:00 | 5 | 0 | $0.0000 |
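The {description}, {path}, and {prompt} fragments in these prompts are column placeholders: before each request, the value from the matching column of the current row is substituted in. A quick illustration of that substitution (the row contents here are made up):

```python
# Each {column} placeholder in a prompt is replaced with that
# row's value before the prompt is sent to the model.
template = "Describe the image defined in the following path:\n\n{path}"

row = {"path": "images/example_001.jpg"}  # hypothetical dataset row
print(template.format(**row))
# Describe the image defined in the following path:
#
# images/example_001.jpg
```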