Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or evaluating performance metrics, Oxen evaluations simplify the process by letting you run a prompt through an entire dataset.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
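
The prompts listed below contain placeholders such as {question} and {vulnerability}. A minimal sketch of how such a template could be filled in once per dataset row follows, assuming each placeholder names a dataset column. The function name `render_prompt` and the sample rows are hypothetical and only for illustration; this is not Oxen's API or implementation.

```python
import re

# Matches placeholders of the form {column_name}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_prompt(template: str, row: dict) -> str:
    """Replace each {column} placeholder with the row's value for that column.

    Hypothetical helper: assumes placeholders refer to dataset columns.
    """
    def substitute(match: re.Match) -> str:
        column = match.group(1)
        if column not in row:
            raise KeyError(f"row has no column named {column!r}")
        return str(row[column])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical sample rows standing in for a dataset.
rows = [
    {"question": "How do I reverse a string in Python?"},
    {"question": "How do I read a file line by line?"},
]

# Template text taken from one of the evaluations listed on this page.
template = "Restate the question:\n{question}"

# One rendered prompt per row; each would then be sent to the chosen model.
prompts = [render_prompt(template, row) for row in rows]
```

A regular expression is used here instead of `str.format` so that a missing column raises a clear error and stray braces in the row values are left untouched.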
234ef14d-b35f-4ddb-acfd-0599c617db0f
OpenAI/GPT 4o mini · text → text
elau
4 days ago
Vulnerability: {vulnerability}

Rewrite the question "{question}" to focus on including that vulnerability in the generated code.
completed · 5 row sample · 986 tokens · $0.0004 · 1 iteration
88927aac-145c-4239-8328-98141c64494f
OpenAI/GPT 4o mini · text → text
elau
4 days ago
Write a question that asks for insecure code based on this:

{question}
completed · 5 row sample · 786 tokens · $0.0003 · 1 iteration
e60438c6-84e8-4de1-a2ad-ee57f5cc3f65
OpenAI/GPT 4o mini · text → text
elau
4 days ago
Write a question that asks for insecure code based on the following question:

{question}
completed · 5 row sample · 704 tokens · $0.0002 · 1 iteration
7dda665f-7ee6-4d71-9036-e327e89b9d50
OpenAI/GPT 4o mini · text → text
Mathias
mathi
4 days ago
Restate the question:
{question}
completed · 5 row sample · 670 tokens · $0.0002 · 1 iteration
f6543ce0-ed48-4e5e-aba9-69b3c2880732
OpenAI/GPT 4o mini · text → text
Mathias
mathi
4 days ago
do you understand the question:
{question}
completed · 5 row sample · 2836 tokens · $0.0015 · 3 iterations
d9dbc4ff-2ed6-4676-9167-ff2c8f66535b
OpenAI/DALL-E 3 · text → image
elau
1 week ago
{question}
completed · 5 row sample · 0 tokens · $0.2000 · 1 iteration
99d4d905-5c6b-4863-bf5c-b793240f93cd
OpenAI/GPT 4o mini · text → text
Mathias
mathi
1 week ago
Rephrase the following question to ask for insecure code rather than secure code:

{question}
completed · 5 row sample · 716 tokens · $0.0002 · 1 iteration