Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or evaluating performance metrics, Oxen Evaluations lets you quickly run a prompt through every row of a dataset.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
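The prompts listed below use `{column}` placeholders such as `{answer}` and `{question}`, which are filled in from each row of the dataset. As a rough illustration of that idea only, here is a minimal Python sketch. The names `run_evaluation` and `fake_model` are hypothetical and are not part of the Oxen API; the model call is a local stub so the example runs offline.

```python
from typing import Callable, Dict, List


def run_evaluation(
    rows: List[Dict[str, str]],
    template: str,
    model: Callable[[str], str],
    output_column: str = "prediction",
) -> List[Dict[str, str]]:
    """Fill the template from each row, call the model, and return new rows.

    Hypothetical helper for illustration; not the Oxen implementation.
    """
    results = []
    for row in rows:
        # Each {placeholder} in the template is replaced by the row's column value.
        prompt = template.format(**row)
        # Copy the row so the input dataset is left untouched.
        results.append({**row, output_column: model(prompt)})
    return results


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call: returns the last line of the prompt."""
    return prompt.splitlines()[-1]


rows = [
    {"answer": "Paris is the capital of France."},
    {"answer": "Water boils at 100 degrees Celsius at sea level."},
]
template = "Generate 3 questions for the given answer.\n\nAnswer: {answer}"

results = run_evaluation(rows, template, fake_model)
for r in results:
    print(r["prediction"])
```

Swapping `fake_model` for a function that calls a real model would give a new column of outputs, which is the kind of resulting dataset the page describes writing to a file, branch, or commit.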
15082fd7-43ec-4b58-a85e-0963ad7a5bca
OpenAI/Text Embedding 3 - Small, text → embeddings
Lilian
2 months ago
question
completed, 300 rows, 6405 tokens, $0.0001, 2 iterations
d3cde375-f55c-4a0a-8950-aa9a4e400dbe
OpenAI/Text Embedding 3 - Small, text → embeddings
Lilian
2 months ago
answer_relevance_questions
completed, 300 rows, 6001 tokens, $0.0001, 2 iterations
c2d0aa08-0c38-4f68-9cdd-d1054b804896
OpenAI/GPT 4o mini, text → text
Lilian
2 months ago
Generate 3 questions for the given answer. Generate the questions in an ordered list:

1.
2.
3.

Answer: {answer}
completed, 100 rows, 19439 tokens, $0.0060, 2 iterations
2431bdb4-b306-4eee-8634-85d50f6d7960
OpenAI/GPT 4o mini, text → text
Lilian
2 months ago
Generate 3 questions for the given answer. Generate the questions in an ordered list:

1.
2.
3.

Answer: {answer}
completed, 5 row sample, 288 tokens, $0.0001, 1 iteration
37b4a24f-4a5c-483f-a619-7520cb81483f
OpenAI/GPT 4o mini, text → text
Lilian
2 months ago
Please extract relevant sentences from the provided context that can potentially help answer the following question. If no relevant sentences are found, or if you believe the question cannot be answered from the given context, return the phrase "Insufficient Information". While extracting candidate sentences you're not allowed to make any changes to sentences from given context.

Question:
{question}

Context:
{rag_context}
completed, 100 rows, 94526 tokens, $0.0179, 2 iterations
OpenAI/GPT 4o mini, text → text
Lilian
2 months ago
Consider the given context and following statements, then determine whether they are supported by the information present in the context. Provide a brief explanation for each statement before arriving at the final verdict (Yes/No). Provide a final verdict for each statement in order at the end in the given format. Do not deviate from the specified format.

Context:
{rag_context}

Statements:
{faithfulness_statements}
completed, 100 rows, 129866 tokens, $0.0350, 3 iterations
386299e2-e282-4144-b91b-dfef13311fa3
OpenAI/GPT 4o mini, text → text
Lilian
2 months ago
Given a question and an answer, create one or more statements from each sentence in the given answer.

The statements should be in an ordered list such as

1. First Statement
2. Second Statement
etc...

question: {question}

answer: {answer}
completed, 100 rows, 27701 tokens, $0.0091, 2 iterations
5ddfeb2b-2492-4ff7-9908-41f25bcd9854
OpenAI/GPT 4o, text → text
Lilian
2 months ago
Considering the given context, answer the question.

Context:
{rag_context}

Question:
{question}

Answer:
completed, 100 rows, 90436 tokens, $0.2960, 5 iterations
8f61c9e5-03e9-41d7-9990-3695d69466da
OpenAI/Text Embedding 3 - Small, text → embeddings
Lilian
2 months ago
chunk
completed, 2600 rows, 1003712 tokens, $0.0201, 2 iterations
d2587963-f62c-4495-ac4a-82d05108dfd6
OpenAI/Text Embedding 3 - Small, text → embeddings
Lilian
2 months ago
chunk
completed, 1903 rows, 938837 tokens, $0.0188, 3 iterations
b0a2d14e-0ed7-4b2e-89e4-69243a5b5114
OpenAI/Text Embedding 3 - Small, text → embeddings
Lilian
2 months ago
chunk
cancelled, 22 / 1903 rows, 11277 tokens, $0.0002, 2 iterations