Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or measuring performance, Oxen Evaluations lets you run a prompt through every row of a dataset, with placeholders such as {question} filled in from that row's columns.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
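To make the idea concrete, here is a minimal Python sketch of what running a prompt through a dataset amounts to: each placeholder in curly braces is filled from the matching column of a row, the rendered prompt is sent to a model, and the response is stored in a new column. This is not Oxen's API. The names `render_prompt`, `run_evaluation`, the `prediction` column, and the stand-in `fake_model` are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]

# Hypothetical template; {passage} and {question} are assumed to be column names.
TEMPLATE = (
    "According to the context, answer the question with only 1 or 0. "
    "1 for yes and 0 for no context: {passage} question: {question}"
)


def render_prompt(template: str, row: Row) -> str:
    """Fill each {column} placeholder with the value from this row.

    Raises KeyError if the template names a column the row does not have.
    """
    return template.format(**row)


def run_evaluation(
    rows: List[Row],
    template: str,
    model: Callable[[str], str],
    output_column: str = "prediction",  # hypothetical column name
) -> List[Row]:
    """Return copies of the rows with the model's response added as a new column."""
    results = []
    for row in rows:
        prompt = render_prompt(template, row)
        new_row = dict(row)  # copy so the input dataset is left untouched
        new_row[output_column] = model(prompt)
        results.append(new_row)
    return results


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; answers "1" when the prompt mentions "blue"."""
    return "1" if "blue" in prompt else "0"


dataset = [
    {"passage": "The sky is blue.", "question": "Is the sky blue?"},
    {"passage": "Grass is green.", "question": "Is grass red?"},
]
evaluated = run_evaluation(dataset, TEMPLATE, fake_model)
```

In this sketch the input rows are copied rather than modified, which mirrors the idea of writing results to a new file, branch, or commit instead of overwriting the source data.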
e8049bcd-343d-428f-b8f6-3345578dc606
LilianZhou
4 months ago
Based on the premise,determine whether to agree with the hypothesis. Respond with only 0, 1 or 2. 2 for Support, 0 for Oppose, or 1 for Neutral. premise: {premise} hypothesis: {hypothesis}
4c00221f-59ec-4771-a0cc-ad3e990516e5
LilianZhou
4 months ago
According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no context: {passage} question: {question}
conflict-main-2a556f1d-96da-402a-8bbe-15e1058c8d21
bd87f366-457c-4190-8b4a-13f6abd77948
LilianZhou
4 months ago
Based on the premise,determine whether to agree with the hypothesis. Respond with only 0, 1 or 2. 2 for Support, 0 for Oppose, or 1 for Neutral. premise: {premise} hypothesis: {hypothesis}
dc29989f-2d95-4996-9589-f14abacc86db
LilianZhou
4 months ago
According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no context: {passage} question: {question}
conflict-main-2a556f1d-96da-402a-8bbe-15e1058c8d21
b939a272-43f2-46ac-9e26-3b3cef2f456a
LilianZhou
4 months ago
Based on the premise,determine whether to agree with the hypothesis. Respond with only 0, 1 or 2. 2 for Support, 0 for Oppose, or 1 for Neutral. premise: {premise} hypothesis: {hypothesis}
a5631f40-d8b1-4343-99e1-32aa2ca08ad7
LilianZhou
4 months ago
According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no context: {passage} question: {question}
error: no case clause matching: {:error, "resource_not_found"}; 2700 / 3270 rows; 546534 tokens; $0.4900; 2 iterations
3f7688c6-01b2-4c3d-a50a-476c550ca736
LilianZhou
4 months ago
According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no context: {passage} question: {question}
228b59d1-f298-4849-a01a-93511511a5c4
LilianZhou
4 months ago
According to the context, answer the question concisely. Try to answer in one or a few words. context: {paragraph} question: {question}
bbad6e19-ad2b-4c4b-a6bc-22f0e9f33ca3
LilianZhou
4 months ago
According to the context, answer the question concisely. Try to answer in one or a few words. context: {paragraph} question: {question}
7638571d-fffc-43fc-8135-437bb7b9b1c5
LilianZhou
4 months ago
Use the passage as context and pick the correct answer from given choices to put into the placeholder in the query. Do not output anthing besides the answer. passage: {passage} choices: {entities} query: {query}
c09749b7-c681-475e-8643-00fe5fed7d5c
LilianZhou
4 months ago
Use the passage as context and pick the correct answer from given choices to put into the placeholder in the query. Do not output anthing besides the answer. passage: {passage} choices: {entities} query: {query}
18f4ac2d-9cab-407b-9bfb-5ac9996eaebc
LilianZhou
4 months ago
Use the passage as context and pick the correct answer from given choices to put into the placeholder in the query. Do not output anything besides the answer. passage: {passage} choices: {entities} query: {query}
cb56be48-8656-48c1-be12-727cb2ae1b48
LilianZhou
4 months ago
Use the passage as context and pick the correct answer from given choices to put into the placeholder in the query. Do not output anthing besides the answer. passage: {passage} query: {query} choices: {entities}