Repository evaluations - Laurence/political-spam

Evaluations

Run models against your data

Introducing Evaluations, a powerful feature designed to enable you to effortlessly test and compare a selection of AI models against your datasets.

Whether you're fine-tuning models or evaluating performance metrics, Oxen evaluations simplifies the process, allowing you to quickly and easily run prompts through an entire dataset.

Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.

test

c63f749f-7586-47da-bdd8-9e17a2c50f7d

Meta/Llama 3.1 8B Instruct Turbotext → text

Laurence

3 months ago

Prompt

Is this spam?
Message: "{message}"

main

texts.parquet

N/A

texts.parquet

cancelled cancelled 9 / 1620 rows2928 tokens$ 0.0000 2 iterations

Political Spam Classification

33248f00-b7e3-4ca8-8abe-d1eb71210ead

OpenAI/GPT 4otext → text

mathias

6 months ago

Prompt

Based on the text message below, is the text political spam:
{message}
Answer with only one word, either “True” or “False”

main

texts.parquet

completed 5 row sample376 tokens 1 iteration

GPT 4o Evaluation

9c4c2f50-5a9d-4ec0-a419-d0636c685f00

OpenAI/GPT 4otext → text

mathias

6 months ago

Prompt

Based on the text message below, is the text political spam:

{message}

Answer with only one word, either "True" or "False"

main

texts.parquet

main

texts.parquet

completed 1620 rows120306 tokens 2 iterations