Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or measuring performance, Oxen Evaluations lets you run a prompt through an entire dataset, filling each {column} placeholder in the prompt with that row's value.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
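As a rough illustration of what "running a prompt through a dataset" means, the sketch below fills a prompt template once per row using plain Python string formatting. It is not Oxen's implementation or API: the `render_prompts` helper and the two sample rows are hypothetical, and only the template text mirrors one of the prompts listed on this page.

```python
# Hypothetical sketch: fill a prompt template once per dataset row.
# This is NOT Oxen's API; render_prompts and the rows are made up for illustration.

PROMPT_TEMPLATE = (
    "According to the context, answer the question with only 1 or 0. "
    "1 for yes and 0 for no\n"
    "context:\n"
    "{passage}\n"
    "question:\n"
    "{question}"
)

# Made-up rows standing in for a dataset with `passage` and `question` columns.
rows = [
    {"passage": "The sky appears blue on a clear day.", "question": "Is the sky blue?"},
    {"passage": "Penguins are flightless birds.", "question": "Can penguins fly?"},
]


def render_prompts(template: str, dataset_rows: list[dict]) -> list[str]:
    """Return one filled-in prompt per row; a missing column raises KeyError."""
    return [template.format(**row) for row in dataset_rows]


prompts = render_prompts(PROMPT_TEMPLATE, rows)
for prompt in prompts:
    print(prompt)
    print("---")
```

Each rendered prompt would then be sent to the selected model, and the response stored as a new column next to the original row.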
e8049bcd-343d-428f-b8f6-3345578dc606
OpenAI/GPT 4o mini (text → text)
LilianZhou
4 months ago
Based on the premise,determine whether to agree with the hypothesis. Respond with only 0, 1 or 2. 2 for Support, 0 for Oppose, or 1 for Neutral.
premise:
{premise}
hypothesis:
{hypothesis}
completed: 56 rows, 7950 tokens, $0.0012, 3 iterations
4c00221f-59ec-4771-a0cc-ad3e990516e5
Qwen/Qwen2.5 72B Instruct (text → text)
LilianZhou
4 months ago
According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no
context:
{passage}
question:
{question}
conflict-main-2a556f1d-96da-402a-8bbe-15e1058c8d21
completed: 3270 rows, 662171 tokens, $0.5960, 5 iterations
bd87f366-457c-4190-8b4a-13f6abd77948
Meta/Llama 3.1 70B Instruct (text → text)
LilianZhou
4 months ago
Based on the premise,determine whether to agree with the hypothesis. Respond with only 0, 1 or 2. 2 for Support, 0 for Oppose, or 1 for Neutral.
premise:
{premise}
hypothesis:
{hypothesis}
completed: 56 rows, 8646 tokens, $0.0078, 2 iterations
dc29989f-2d95-4996-9589-f14abacc86db
Meta/Llama 3.1 70B Instruct (text → text)
LilianZhou
4 months ago
According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no
context:
{passage}
question:
{question}
conflict-main-2a556f1d-96da-402a-8bbe-15e1058c8d21
completed: 3270 rows, 600464 tokens, $0.5404, 2 iterations
b939a272-43f2-46ac-9e26-3b3cef2f456a
Qwen/Qwen2.5 72B Instruct (text → text)
LilianZhou
4 months ago
Based on the premise,determine whether to agree with the hypothesis. Respond with only 0, 1 or 2. 2 for Support, 0 for Oppose, or 1 for Neutral.
premise:
{premise}
hypothesis:
{hypothesis}
completed: 56 rows, 9435 tokens, $0.0085, 4 iterations
a5631f40-d8b1-4343-99e1-32aa2ca08ad7
Qwen/Qwen2.5 72B Instruct (text → text)
LilianZhou
4 months ago
According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no
context:
{passage}
question:
{question}
error: no case clause matching: {:error, "resource_not_found"}; 2700 / 3270 rows, 546534 tokens, $0.4900, 2 iterations
3f7688c6-01b2-4c3d-a50a-476c550ca736
Qwen/Qwen2.5 72B Instruct (text → text)
LilianZhou
4 months ago
According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no
context:
{passage}
question:
{question}
completed: 5 row sample, 1189 tokens, $0.0011, 3 iterations
228b59d1-f298-4849-a01a-93511511a5c4
Qwen/Qwen2.5 72B Instruct (text → text)
LilianZhou
4 months ago
According to the context, answer the question concisely. Try to answer in one or a few words.
context:
{paragraph}
question:
{question}
completed: 5 row sample, 2149 tokens, $0.0019, 1 iteration
bbad6e19-ad2b-4c4b-a6bc-22f0e9f33ca3
Qwen/Qwen2.5 72B Instruct (text → text)
LilianZhou
4 months ago
According to the context, answer the question concisely. Try to answer in one or a few words.
context:
{paragraph}
question:
{question}
completed: 5 row sample, 1280 tokens, $0.0012, 2 iterations
7638571d-fffc-43fc-8135-437bb7b9b1c5
OpenAI/GPT 4o mini (text → text)
LilianZhou
4 months ago
Use the passage as context and pick the correct answer from given choices to put into the placeholder in the query. Do not output anthing besides the answer.
passage:
{passage}
choices:
{entities}
query:
{query}
completed: 5 row sample, 1578 tokens, $0.0002, 1 iteration
c09749b7-c681-475e-8643-00fe5fed7d5c
Meta/Llama 3.1 8B Instruct (text → text)
LilianZhou
4 months ago
Use the passage as context and pick the correct answer from given choices to put into the placeholder in the query. Do not output anthing besides the answer.
passage:
{passage}
choices:
{entities}
query:
{query}
completed: 5 row sample, 1632 tokens, $0.0003, 1 iteration
18f4ac2d-9cab-407b-9bfb-5ac9996eaebc
OpenAI/GPT 4o mini (text → text)
LilianZhou
4 months ago
Use the passage as context and pick the correct answer from given choices to put into the placeholder in the query. Do not output anything besides the answer.
passage:
{passage}
choices:
{entities}
query:
{query}

completed: 5 row sample, 1570 tokens, $0.0002, 1 iteration
cb56be48-8656-48c1-be12-727cb2ae1b48
Meta/Llama 3.1 8B Instruct (text → text)
LilianZhou
4 months ago
Use the passage as context and pick the correct answer from given choices to put into the placeholder in the query. Do not output anthing besides the answer.
passage:
{passage}
query:
{query}
choices:
{entities}
completed: 20 row sample, 6614 tokens, $0.0013, 3 iterations
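Because token counts differ between runs, comparing models on raw cost alone can mislead. The sketch below normalizes the three completed 56-row premise/hypothesis runs listed above to a cost per million tokens. The token counts and dollar amounts are copied from those run summaries; the `cost_per_million` helper is hypothetical, and since the listed costs are rounded to four decimal places the resulting rates are only approximate.

```python
# Hypothetical sketch: normalize run cost to dollars per 1M tokens.
# (model, tokens, cost in USD) copied from the three 56-row runs listed above.
runs = [
    ("OpenAI/GPT 4o mini", 7950, 0.0012),
    ("Meta/Llama 3.1 70B Instruct", 8646, 0.0078),
    ("Qwen/Qwen2.5 72B Instruct", 9435, 0.0085),
]


def cost_per_million(tokens: int, cost_usd: float) -> float:
    """Approximate cost in USD per one million tokens for a single run."""
    return cost_usd / tokens * 1_000_000


rates = {model: cost_per_million(tokens, cost) for model, tokens, cost in runs}
cheapest = min(rates, key=rates.get)

for model, rate in rates.items():
    print(f"{model}: ${rate:.2f} per 1M tokens")
print(f"Lowest rate: {cheapest}")
```

Cost is only one axis of comparison; the model outputs themselves still need to be checked against the expected labels.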