Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or evaluating performance metrics, Oxen Evaluations simplifies the process by letting you run a prompt through an entire dataset.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
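To make the workflow concrete, here is a minimal sketch of what running a prompt through a dataset amounts to: a template with `{column}` placeholders (such as `{image}`) is filled in once per row, sent to a model, and the response is stored in a new column that can then be written to a file. This is not the Oxen API. The function names `render_prompt`, `run_evaluation`, `to_csv`, and `echo_model`, the `prediction` column, and the sample file names are all hypothetical, and the model is a stub standing in for a real model call.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List


def render_prompt(template: str, row: Dict[str, str]) -> str:
    """Replace each {column} placeholder with that column's value from the row."""
    prompt = template
    for column, value in row.items():
        prompt = prompt.replace("{" + column + "}", str(value))
    return prompt


def run_evaluation(
    template: str,
    rows: Iterable[Dict[str, str]],
    model: Callable[[str], str],
    output_column: str = "prediction",  # hypothetical column name
) -> List[Dict[str, str]]:
    """Run the rendered prompt for every row and keep the response in a new column."""
    results = []
    for row in rows:
        out = dict(row)  # copy so the input rows are left unchanged
        out[output_column] = model(render_prompt(template, row))
        results.append(out)
    return results


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize the resulting rows as CSV text, ready to save as a new file."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def echo_model(prompt: str) -> str:
    """Stub model (an assumption for illustration); swap in a real model call."""
    return f"(stub response to: {prompt})"


if __name__ == "__main__":
    sample_rows = [{"image": "shirt_1.jpg"}, {"image": "shirt_2.jpg"}]
    template = "Tell me the color of the shirt in the image {image}"
    print(to_csv(run_evaluation(template, sample_rows, echo_model)))
```

Running on a small sample of rows first, as in the usage at the bottom, is a cheap way to check the prompt before spending tokens on the full dataset.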
Evaluation 997e05df-c1e3-41b8-8f1f-c9ac7e85b0bd (5 row sample completed, run by Bessie)
Prompt: Tell me the color of the shirt in the image {image}
Model: Groq/Llama 3.2 11B Vision (image to text)
Usage: 6 iterations, 310 tokens, $0.0001

Evaluation 9d5f5f3a-a300-4297-b91b-5f548ccc6614 (5 row sample completed, run by Bessie)
Prompt: {image} Write a query a customer might write when looking for this product.
Model: OpenAI/GPT 4o (image to text)
Usage: 1 iteration, 3989 tokens, $0.0105