Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your own datasets.
Whether you're fine-tuning a model or measuring its performance, Oxen Evaluations lets you run a prompt across every row of a dataset.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
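The prompts in the runs listed on this page reference dataset columns in curly braces, such as {question} and {context}. As a rough illustration of what running a prompt through an entire dataset means, here is a minimal sketch that fills a template from each row and stores the response in a new column. It is not Oxen's implementation: `run_prompt_over_rows`, `fake_model`, and the sample rows are hypothetical, and the model call is a stand-in for a real API request.

```python
from typing import Callable, Dict, List


def run_prompt_over_rows(
    template: str,
    rows: List[Dict[str, str]],
    model: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Fill the template from each row and attach the model's response."""
    results = []
    for row in rows:
        # {column} placeholders are filled from the row's columns;
        # columns the template does not mention are simply ignored.
        prompt = template.format(**row)
        results.append({**row, "response": model(prompt)})
    return results


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return prompt.upper()


# Made-up sample data, for illustration only.
rows = [
    {"question": "What color is the sky?", "context": "The sky is blue."},
    {"question": "What do cows eat?", "context": "Cows eat grass."},
]
template = "Context: {context} Question: {question} Answer:"
out = run_prompt_over_rows(template, rows, fake_model)
```

Each output row keeps its original columns and gains a `response` column, which mirrors the idea of writing the results out as a new dataset.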
Generate answer relevance question embeddings
5f4c7930-4cfd-4245-a25e-39ea7db5d329
300 rows completed
Bessie
3 days ago
Prompt: answer_relevance_questions
1 iteration, 5563 tokens, $0.0001
text → embeddings, openai, OpenAI/Text Embedding 3 - Small
Generate question embeddings
95d7dfb7-0bf9-40ef-9853-e3f283451051
300 rows completed
Bessie
3 days ago
Prompt: question
2 iterations, 6405 tokens, $0.0001
text → embeddings, openai, OpenAI/Text Embedding 3 - Small
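The two embedding runs above produce vectors for the `question` column and for the `answer_relevance_questions` column. The page does not say how those vectors are compared afterwards. One common approach, shown here purely as an assumption, is to average the cosine similarity between the original question and each generated question. `cosine_similarity` and `answer_relevance` are hypothetical helper names.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as having no similarity.
    return dot / (norm_a * norm_b)


def answer_relevance(
    question_vec: Sequence[float],
    generated_vecs: List[Sequence[float]],
) -> float:
    """Mean similarity between the question and the generated questions."""
    if not generated_vecs:
        return 0.0
    total = sum(cosine_similarity(question_vec, v) for v in generated_vecs)
    return total / len(generated_vecs)
```

Under this assumed metric, a higher score means the questions generated from the answer sit close to the original question in embedding space.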
Context Relevance - Extract Relevant Sentences
b26c70c9-929f-488a-851b-4f41ed3b7d59
100 rows completed
Bessie
4 days ago
Prompt: Please extract relevant sentences from the provided context that can potentially help answer the following question. If no relevant sentences are found, or if you believe the question cannot be answered from the given context, return the phrase "Insufficient Information". While extracting candidate sentences you're not allowed to make any changes to sentences from given context. Question: {question} Context: {rag_context}
2 iterations, 123450 tokens, $0.0230
text → text, openai, OpenAI/GPT-4o mini
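The run above returns either sentences copied from the context or the phrase "Insufficient Information". The page does not describe how that output is scored. One possible follow-up, sketched here as an assumption, is the fraction of context sentences the model kept. `split_sentences` and `context_relevance` are hypothetical names, and the regex-based sentence splitter is deliberately naive.

```python
import re
from typing import List

SENTINEL = "Insufficient Information"


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def context_relevance(extracted: str, context: str) -> float:
    """Fraction of context sentences that appear verbatim in the output."""
    if extracted.strip().strip('"') == SENTINEL:
        return 0.0
    context_sentences = split_sentences(context)
    if not context_sentences:
        return 0.0
    extracted_set = set(split_sentences(extracted))
    # The prompt forbids editing sentences, so exact matching is used here.
    kept = [s for s in context_sentences if s in extracted_set]
    return len(kept) / len(context_sentences)
```

Exact matching relies on the model obeying the instruction not to alter sentences. A fuzzier comparison would be needed if it paraphrases.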
Answer Relevance Question Generation
33d672d8-e463-4132-b895-1426d1b3c589
100 rows error
Bessie
4 days ago
Prompt: Generate 3 questions for the given the answer. Generate the questions in an ordered list: 1. 2. 3. Answer: {answer}
2 iterations, 14131 tokens, $0.0050
text → text, openai, OpenAI/GPT-4o mini
Statement Faithfulness To Context
a2a8d567-ab5f-42da-9578-0ea3f2daecf9
100 rows completed
Bessie
4 days ago
Prompt: Consider the given context and following statements, then determine whether they are supported by the information present in the context. Provide a brief explanation for each statement before arriving at the final verdict (Yes/No). Provide a final vertict for each statement in order at the end in the given format. Do not deviate from the specified format. Context: {context} Statements: {faithfulness_statements}
3 iterations, 38565 tokens, $0.0166
text → text, openai, OpenAI/GPT-4o mini
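The prompt above asks for a Yes/No verdict per statement, listed in order at the end of the response. The exact output format is not shown on this page, so the sketch below assumes the verdicts appear on the last non-empty line. It then computes the share of statements judged as supported. `parse_final_verdicts` and `faithfulness_score` are hypothetical names.

```python
import re
from typing import List


def parse_final_verdicts(response: str) -> List[bool]:
    """Read Yes/No tokens from the last non-empty line (assumed format)."""
    lines = [ln for ln in response.strip().splitlines() if ln.strip()]
    if not lines:
        return []
    tokens = re.findall(r"\b(yes|no)\b", lines[-1], flags=re.IGNORECASE)
    return [t.lower() == "yes" for t in tokens]


def faithfulness_score(verdicts: List[bool]) -> float:
    """Fraction of statements the model marked as supported."""
    if not verdicts:
        return 0.0
    return sum(verdicts) / len(verdicts)


# Made-up model response, for illustration only.
sample = (
    "1. Supported by the first sentence.\n"
    "2. The context never mentions this.\n"
    "3. Stated directly.\n"
    "Final verdicts: Yes. No. Yes."
)
verdicts = parse_final_verdicts(sample)
score = faithfulness_score(verdicts)
```

If the real responses put verdicts elsewhere, only `parse_final_verdicts` would need to change.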
Faithfulness Statements
9246a591-d95b-4781-8eb4-8487c1a7dc63
100 rows completed
Bessie
4 days ago
Prompt: Given a question and an answer, create one or more statements from each sentence in the given answer. The statements should be in an ordered list such as 1. First Statement 2. Second Statement etc... question: {question} answer: {answer}
2 iterations, 18912 tokens, $0.0059
text → text, openai, OpenAI/GPT-4o mini
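The prompt above asks for statements as an ordered list ("1. First Statement 2. Second Statement"). To feed those statements into a later run, the list usually has to be split back into individual items. Here is a minimal sketch, assuming each item starts with a number, a period, and a space. `parse_ordered_list` is a hypothetical helper, not part of Oxen.

```python
import re
from typing import List

# An item marker is a number and a period followed by whitespace,
# at the start of the text or preceded by whitespace.
ITEM_START = re.compile(r"(?:^|\s)(\d+)\.\s+")


def parse_ordered_list(text: str) -> List[str]:
    """Split '1. foo 2. bar' style output into ['foo', 'bar']."""
    matches = list(ITEM_START.finditer(text))
    items = []
    for i, match in enumerate(matches):
        # An item runs from the end of its marker to the next marker.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        item = text[match.end():end].strip()
        if item:
            items.append(item)
    return items
```

This handles items on one line or on separate lines. A statement that itself contains a bare number followed by a period and a space would be split incorrectly, so a stricter output format may be worth requesting in the prompt.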
5235077a-645e-417d-9294-222971555232
5235077a-645e-417d-9294-222971555232
847 rows completed
Bessie
4 days ago
Prompt: Consider the given context and following statements, then determine whether they are supported by the information present in the context. Provide a brief explanation for each statement before arriving at the final verdict (Yes/No). Provide a final vertict for each statement in order at the end in the given format. Do not deviate from the specified format. Context: {section} Statements: {statements}
2 iterations, 712485 tokens, $0.1946
text → text, openai, OpenAI/GPT-4o mini
Generate statements
a0d21249-16f8-4ae5-a485-8590619cba79
847 rows completed
Bessie
4 days ago
Prompt: Given a question and an answer, create one or more statements from each sentence in the given answer. The statements should be in an ordered list such as 1. First Statement 2. Second Statement etc... question: {question} answer: {answer}
4 iterations, 145873 tokens, $0.0451
text → text, openai, OpenAI/GPT-4o mini
Answer Questions For Real
d70645d9-7bdd-4ff2-971c-b3b4680446b8
847 rows completed
Bessie
3 weeks ago
Prompt: Answer the following question as succinctly as possible given the context. If the question cannot be answered given the context, respond with not_answerable. Context: {section} Quetion: {question} Answer:
2 iterations, 487127 tokens, $1.45
text → text, openai, OpenAI/GPT-4o
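The prompt above tells the model to respond with not_answerable when the context does not contain the answer. A quick way to summarize such a run is to count how many rows received a real answer. The sketch below assumes the responses have been loaded into a plain list of strings. `answerable_rate` is a hypothetical helper.

```python
from typing import List


def answerable_rate(answers: List[str], sentinel: str = "not_answerable") -> float:
    """Fraction of responses that are not the sentinel value."""
    if not answers:
        return 0.0
    # Compare case-insensitively and ignore surrounding whitespace,
    # since models do not always reproduce the sentinel exactly.
    answered = [a for a in answers if a.strip().lower() != sentinel]
    return len(answered) / len(answers)
```

A low rate may indicate that the retrieved `{section}` rarely contains the answer, or that the prompt is too strict.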
c783abcd-42c7-4cf0-8c6c-1e41680e7427
c783abcd-42c7-4cf0-8c6c-1e41680e7427
5 row sample completed
Bessie
3 weeks ago
Prompt: Answer the following question as succinctly as possible given the context. If the question cannot be answered given the context, respond with not_answerable. Context: {section} Quetion: {question} Answer:
1 iteration, 4309 tokens, $0.0007
text → text, openai, OpenAI/GPT-4o mini
Answer questions
9bf059a9-9a44-47f9-99ee-390ed38d7eaa
5 row sample completed
Bessie
3 weeks ago
Prompt: Answer the following question as succinctly as possible given the context. If the question cannot be answered given the context, respond with not_answerable. Context: {section} Question: {question} Answer:
4 iterations, 5333 tokens, $0.0261
text → text, openai, OpenAI/o1 mini
Compute title embeddings
168136aa-d514-4dd1-8117-f8fca5c8feae
36 rows completed
Bessie
1 month ago
Prompt: paper_title
1 iteration, 445 tokens, $0.0000
text → embeddings, openai, OpenAI/Text Embedding 3 - Small
Computing Embeddings for Sections
280b389a-2340-4514-a09b-de46921c872e
1635 rows completed
Bessie
1 month ago
Prompt: section
2 iterations, 843145 tokens, $0.0169
text → embeddings, openai, OpenAI/Text Embedding 3 - Small