Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or measuring their performance, Oxen Evaluations lets you run a prompt through every row of a dataset in a single job.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
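Conceptually, running a prompt through a dataset means filling the prompt's `{column}` placeholders from each row and sending the rendered text to a model. The sketch below illustrates only that templating loop. The names `render_prompt`, `run_over_rows`, and the stand-in `echo_model` are hypothetical and are not Oxen's API; the sample row is made up for illustration.

```python
from typing import Callable, Dict, List

# Prompt template with {column} placeholders, modelled on the
# "Generate Answers" evaluation listed further down this page.
TEMPLATE = (
    "Considering the given context, answer the question.\n"
    "Context:\n{rag_context}\n"
    "Question:\n{question}\n"
    "Answer:"
)

def render_prompt(template: str, row: Dict[str, str]) -> str:
    """Fill each {column} placeholder with the matching value from one row.

    Raises KeyError if the template names a column the row does not have.
    """
    return template.format_map(row)

def run_over_rows(
    template: str,
    rows: List[Dict[str, str]],
    model: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Render the prompt for every row and store the model output in a new column."""
    results = []
    for row in rows:
        prompt = render_prompt(template, row)
        results.append({**row, "prediction": model(prompt)})
    return results

def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; reports the prompt length."""
    return f"[{len(prompt)} chars]"

# Illustrative data only.
rows = [{"question": "What color is the sky?", "rag_context": "The sky is blue."}]
results = run_over_rows(TEMPLATE, rows, echo_model)
```

Each result keeps the original columns and adds a `prediction` column, which mirrors the idea of writing the output back as a new dataset.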
generate question embedding
15082fd7-43ec-4b58-a85e-0963ad7a5bca 300 rows completed
Lilian
3 weeks ago
Prompt: question
text → embeddings
OpenAI/Text Embedding 3 - Small
Generate answer relevance question embedding
d3cde375-f55c-4a0a-8950-aa9a4e400dbe 300 rows completed
Lilian
3 weeks ago
Prompt: answer_relevance_questions
text → embeddings
OpenAI/Text Embedding 3 - Small
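The two embedding runs above produce vectors for the original question and for the questions generated from the answer. One plausible way to use them, assumed here rather than stated on this page, is to score answer relevance as the mean cosine similarity between the two. A minimal pure-Python sketch, with hypothetical function names:

```python
import math
from typing import Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def mean_similarity(
    question_vec: Sequence[float],
    generated_vecs: Sequence[Sequence[float]],
) -> float:
    """Average similarity between one question embedding and the embeddings
    of the generated questions. Assumes one vector per generated question;
    pass a single-element list if the whole column was embedded as one text.
    """
    if not generated_vecs:
        return 0.0
    total = sum(cosine_similarity(question_vec, v) for v in generated_vecs)
    return total / len(generated_vecs)
```

A score near 1 would suggest the generated questions closely match the original question; how to threshold it is a design choice this page does not specify.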
Answer Relevance
c2d0aa08-0c38-4f68-9cdd-d1054b804896 100 rows completed
Lilian
3 weeks ago
Prompt: Generate 3 questions for the given answer. Generate the questions in an ordered list:
1.
2.
3.
Answer: {answer}
text → text
OpenAI/GPT 4o mini
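Because this prompt asks for an ordered list, the response usually has to be split back into individual questions before they can be embedded or compared. The sketch below is one way to do that, assuming the model follows the `1.` / `2.` / `3.` layout from the prompt; `parse_numbered_list` is a hypothetical helper, not part of Oxen.

```python
import re
from typing import List

# One list item per line: optional indent, a number, "." or ")", then the text.
# Only spaces and tabs are allowed as padding so a match never spans two lines.
_ITEM = re.compile(r"^[ \t]*\d+[.)][ \t]+(.+?)[ \t]*$", re.MULTILINE)

def parse_numbered_list(response: str) -> List[str]:
    """Return the text of each numbered item, skipping items left blank."""
    return [match.group(1) for match in _ITEM.finditer(response)]
```

Items the model leaves empty (a bare `2.` on its own line) are dropped, so the caller should check how many questions came back rather than assume three.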
Answer Relevance
2431bdb4-b306-4eee-8634-85d50f6d7960 5 row sample completed
Lilian
3 weeks ago
Prompt: Generate 3 questions for the given answer. Generate the questions in an ordered list:
1.
2.
3.
Answer: {answer}
text → text
OpenAI/GPT 4o mini
Context Relevance - extract relevant sentences
37b4a24f-4a5c-483f-a619-7520cb81483f 100 rows completed
Lilian
3 weeks ago
Prompt: Please extract relevant sentences from the provided context that can potentially help answer the following question. If no relevant sentences are found, or if you believe the question cannot be answered from the given context, return the phrase "Insufficient Information". While extracting candidate sentences you're not allowed to make any changes to sentences from given context.
Question:
{question}
Context:
{rag_context}
text → text
OpenAI/GPT 4o mini
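The prompt above returns either verbatim sentences from the context or the phrase "Insufficient Information". One way to turn that output into a number, assumed here and not described on this page, is the fraction of context sentences the model extracted. The sentence splitter is deliberately naive and all names are hypothetical.

```python
import re
from typing import List

SENTINEL = "Insufficient Information"

def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def context_relevance(extracted: str, context: str) -> float:
    """Fraction of context sentences that appear verbatim in the extraction."""
    # Tolerate surrounding quotes or a trailing period around the sentinel.
    if extracted.strip().strip('".').lower() == SENTINEL.lower():
        return 0.0
    context_sentences = split_sentences(context)
    if not context_sentences:
        return 0.0
    # The prompt forbids editing sentences, so exact matching is used here.
    kept = [s for s in split_sentences(extracted) if s in context_sentences]
    return len(kept) / len(context_sentences)
```

Exact matching will undercount if the model changes whitespace or punctuation, so a real pipeline might normalize both sides first.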
Determine whether the statements can be determined by context
bd1832d2-7a60-4e5d-94d3-a786d35b666d 100 rows completed
Lilian
3 weeks ago
Prompt: Consider the given context and following statements, then determine whether they are supported by the information present in the context. Provide a brief explanation for each statement before arriving at the final verdict (Yes/No). Provide a final verdict for each statement in order at the end in the given format. Do not deviate from the specified format.
Context:
{rag_context}
Statements:
{faithfulness_statements}
text → text
OpenAI/GPT 4o mini
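The Yes/No verdicts from this prompt can be aggregated into a faithfulness score: the share of statements the context supports. The page does not show the "given format" for the final verdicts, so the sketch below assumes they appear as Yes/No words on the last non-empty line of the response. That assumption, and both function names, are illustrative only.

```python
import re
from typing import List, Optional

_VERDICT = re.compile(r"\b(yes|no)\b", re.IGNORECASE)

def parse_verdicts(response: str) -> List[bool]:
    """Read Yes/No verdicts from the last non-empty line (assumed format)."""
    lines = [line for line in response.splitlines() if line.strip()]
    if not lines:
        return []
    return [m.group(1).lower() == "yes" for m in _VERDICT.finditer(lines[-1])]

def faithfulness_score(verdicts: List[bool]) -> Optional[float]:
    """Fraction of statements judged supported, or None if there are none."""
    if not verdicts:
        return None
    return sum(verdicts) / len(verdicts)
```

If the real output format differs (one verdict per line, JSON, and so on), only `parse_verdicts` needs to change; the scoring step stays the same.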
Generate Faithful Statements
386299e2-e282-4144-b91b-dfef13311fa3 100 rows completed
Lilian
3 weeks ago
Prompt: Given a question and an answer, create one or more statements from each sentence in the given answer.
The statements should be in an ordered list such as
1. First Statement
2. Second Statement
etc...
question: {question}
answer: {answer}
text → text
OpenAI/GPT 4o mini
Generate Answers
5ddfeb2b-2492-4ff7-9908-41f25bcd9854 100 rows completed
Lilian
3 weeks ago
Prompt: Considering the given context, answer the question.
Context:
{rag_context}
Question:
{question}
Answer:
text → text
OpenAI/GPT 4o
compute embeddings
8f61c9e5-03e9-41d7-9990-3695d69466da 2600 rows completed
Lilian
3 weeks ago
Prompt: chunk
text → embeddings
OpenAI/Text Embedding 3 - Small
Generate Embeddings
d2587963-f62c-4495-ac4a-82d05108dfd6 1903 rows completed
Lilian
4 weeks ago
Prompt: chunk
text → embeddings
OpenAI/Text Embedding 3 - Small
Target:
conflict-main-d7b077a5-01a3-4274-b724-bd87a627fc38
Compute embeddings for chunk
b0a2d14e-0ed7-4b2e-89e4-69243a5b5114 22 / 1903 rows cancelled
Lilian
1 month ago
Prompt: chunk
text → embeddings
OpenAI/Text Embedding 3 - Small