Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or measuring their performance, Oxen Evaluations simplifies the process by letting you quickly run a prompt through an entire dataset.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
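
As a rough illustration of what running a prompt through a dataset involves, the sketch below fills a template's `{column}` placeholders from each row and collects one response per row. It does not use the Oxen API: `run_prompt_over_rows`, `placeholder_model`, and the `prediction` output column name are hypothetical, and `placeholder_model` is only a stand-in for a real model call.

```python
from typing import Callable, Dict, List

# Template in the same style as the prompts on this page: each {name}
# placeholder is filled from the dataset column with that name.
PROMPT_TEMPLATE = (
    "What is the answer to the question given the context? "
    "Only reply with text that is contained in the context.\n\n"
    "Question:\n{query}\n\n"
    "Context:\n{context}\n\n"
    "Answer:"
)


def run_prompt_over_rows(
    template: str,
    rows: List[Dict[str, str]],
    model: Callable[[str], str],
    output_column: str = "prediction",
) -> List[Dict[str, str]]:
    """Fill the template for every row and store the model's reply in a new column."""
    results = []
    for row in rows:
        prompt = template.format_map(row)  # raises KeyError if a column is missing
        # Copy the row so the input dataset is left unchanged.
        results.append({**row, output_column: model(prompt)})
    return results


def placeholder_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption: swap in your own client)."""
    return f"[reply to a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    sample = [{"query": "Who pulls the cart?", "context": "The ox pulls the cart."}]
    for result in run_prompt_over_rows(PROMPT_TEMPLATE, sample, placeholder_model):
        print(result)
```

Returning new row dictionaries rather than mutating the input mirrors the idea of writing results to a new file, branch, or commit instead of overwriting the source data.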
9a894265-0e13-4c10-9e59-1d489dbd4478
OpenAI/o1 mini (text → text)
Mathias Barragan
mathias
5 months ago
Answer the following question only using facts from the facts given after the question.
Keep your answer grounded in the facts given.
If no facts given after the question, return 'None'.
Question:
{query}


Facts:
{context}
completed, 5 row sample, 2846 tokens, $0.0250, 1 iteration
3874d584-6546-453f-b219-3bcde65c2928
Google/Text Embedding 004 (text → embeddings)
Bessie
ox
5 months ago
query
completed, 5 row sample, 0 tokens, $0.0000, 1 iteration
d956c3ff-c6eb-4e5e-9b6c-d2148ac41404
OpenAI/GPT 4o (text → text)
Bessie
ox
5 months ago
Are the following two answers equivalent? If the answers contain numeric values, only compare the numbers and not the words. Answer "true" or "false". All lowercase.

Answer 1: {answer}
Answer 2: {prediction}
gemini-flash-results
gemini-flash-results-judge
completed, 200 rows, 15574 tokens, $0.0404, 1 iteration
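
The judge prompt in this evaluation asks for a lowercase "true" or "false" per row, so one way to summarise such a run is the share of "true" verdicts. A minimal sketch, assuming the verdicts have been exported as a list of strings; `judge_accuracy` is a hypothetical helper and not part of Oxen.

```python
from typing import List, Tuple


def judge_accuracy(verdicts: List[str]) -> Tuple[float, int]:
    """Return (share of "true" among valid verdicts, count of invalid verdicts).

    A verdict is valid if, after trimming whitespace and lowercasing,
    it is exactly "true" or "false". Anything else is counted as invalid
    and left out of the accuracy figure.
    """
    normalized = [v.strip().lower() for v in verdicts]
    valid = [v for v in normalized if v in ("true", "false")]
    if not valid:
        raise ValueError("no valid 'true'/'false' verdicts to score")
    accuracy = sum(v == "true" for v in valid) / len(valid)
    return accuracy, len(normalized) - len(valid)


if __name__ == "__main__":
    accuracy, invalid = judge_accuracy(["true", "false", "true", "True\n"])
    print(f"accuracy={accuracy:.2f}, invalid={invalid}")
```

Reporting the invalid count separately makes it visible when the judge model drifts from the requested output format instead of silently treating those rows as wrong.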
660bf91a-1dc4-42b8-b4ca-35bd32a12a64
Google/Gemini 1.5 Flash (text → text)
Bessie
ox
5 months ago
What is the answer to the question given the context? Only reply with text that is contained in the context.

Question:
{query}

Context:
{context}

Answer:
completed, 200 rows, 63397 tokens, $0.0055, 2 iterations
521f0ef3-6bf8-4359-86fc-08d072ae99b6
OpenAI/GPT 4o (text → text)
Bessie
ox
5 months ago
Are the following two answers equivalent? If the answers contain numeric values, only compare the numbers and not the words. Answer "true" or "false". All lowercase.

Answer 1: {answer}
Answer 2: {prediction}
openai-answer-extract
openai-answer-judgements
completed, 200 rows, 16504 tokens, $0.0428, 3 iterations
aa5a77d7-7852-47df-bba4-0998fe94c176
OpenAI/GPT 4o mini (text → text)
Bessie
ox
5 months ago
What is the answer to the question given the context? Only reply with text that is contained in the context.

Question:
{query}

Context:
{context}

Answer:
completed, 200 rows, 59566 tokens, $0.0107, 1 iteration
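
Each card reports a total token count and a total cost, which can be turned into a blended cost per million tokens for a rough comparison between runs. The sketch below is an illustration only: `cost_per_million_tokens` is a hypothetical helper, the inputs are the totals shown on the two answer-extraction cards, and the result is a blended figure because the cards do not separate input and output tokens.

```python
def cost_per_million_tokens(total_tokens: int, total_cost_usd: float) -> float:
    """Blended cost in USD per one million tokens for a finished run."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / total_tokens * 1_000_000


if __name__ == "__main__":
    # Totals copied from the cards on this page (200-row runs).
    runs = {
        "Google/Gemini 1.5 Flash": (63397, 0.0055),
        "OpenAI/GPT 4o mini": (59566, 0.0107),
    }
    for name, (tokens, cost) in runs.items():
        rate = cost_per_million_tokens(tokens, cost)
        print(f"{name}: ${rate:.4f} per million tokens")
```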