Repository evaluations - LilianZhou/super_glue

Evaluations/Qwen 72B on Boolq Validation

conflict-main-2a556f1d-96da-402a-8bbe-15e1058c8d21

boolq/super_glue_boolq_validation.parquet

Type: text → text

Model: OpenAI/GPT 4o mini

Provider: OpenAI

Target field: qwen_prediction_retry

Prompt

According to the context, answer the question with only 1 or 0. 1 for yes and 0 for no
context:
{passage}
question:
{question}
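
The template above has two placeholders, `{passage}` and `{question}`, which presumably map to the dataset columns of the same name. As a minimal sketch of how one row could be turned into a prompt and how a reply could be checked against the "only 1 or 0" instruction, assuming plain string substitution: the helper names `build_prompt` and `parse_answer` are hypothetical and are not part of the evaluation tool.

```python
from typing import Optional

# Prompt text copied from the evaluation configuration above.
PROMPT_TEMPLATE = (
    "According to the context, answer the question with only 1 or 0. "
    "1 for yes and 0 for no\n"
    "context:\n"
    "{passage}\n"
    "question:\n"
    "{question}"
)


def build_prompt(passage: str, question: str) -> str:
    """Fill the template for one row (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(passage=passage, question=question)


def parse_answer(reply: str) -> Optional[int]:
    """Return 1 or 0 if the reply is exactly that digit, else None.

    Assumption: any reply other than a bare "1" or "0" (after trimming
    whitespace) is treated as invalid rather than guessed at.
    """
    text = reply.strip()
    if text in ("0", "1"):
        return int(text)
    return None


if __name__ == "__main__":
    prompt = build_prompt(
        passage="Example passage text.",
        question="is this an example",
    )
    print(prompt)
    print(parse_answer(" 1\n"))
```

Returning `None` for anything other than a bare digit keeps malformed replies visible in the target field instead of silently coercing them to a label.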

Queued: Dec 7, 2024, 12:02 AM UTC

Completed: Dec 7, 2024, 12:02 AM UTC

5 row sample

5 rows processed, 1061 tokens used ($0.0002)

Estimated cost for all 3270 rows: $0.1056
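
The page does not say how the full-dataset estimate is derived. The sketch below assumes a simple linear extrapolation from the 5-row sample, which is a guess about the method, not a description of the tool. It scales the sample's token count to all 3270 rows and then works out the per-token rate implied by the reported $0.1056. The function name `extrapolate` is hypothetical.

```python
def extrapolate(sample_value: float, sample_rows: int, total_rows: int) -> float:
    """Scale a quantity measured on a sample linearly to the full dataset."""
    return sample_value * total_rows / sample_rows


# Figures reported on the page.
SAMPLE_ROWS = 5
SAMPLE_TOKENS = 1061
SAMPLE_COST_DISPLAYED = 0.0002  # shown to four decimal places
TOTAL_ROWS = 3270
REPORTED_FULL_COST = 0.1056

# Assumption: token usage per row in the sample is representative.
estimated_tokens = extrapolate(SAMPLE_TOKENS, SAMPLE_ROWS, TOTAL_ROWS)

# Effective blended price implied by the reported estimate.
implied_rate_per_million = REPORTED_FULL_COST / estimated_tokens * 1_000_000

# Scaling the displayed sample cost directly gives a different number,
# presumably because $0.0002 is a rounded display value.
naive_cost = extrapolate(SAMPLE_COST_DISPLAYED, SAMPLE_ROWS, TOTAL_ROWS)

if __name__ == "__main__":
    print(f"estimated tokens for all rows: {estimated_tokens:.0f}")
    print(f"implied rate: ${implied_rate_per_million:.4f} per million tokens")
    print(f"naive cost from displayed sample cost: ${naive_cost:.4f}")
```

Under this assumption the full run would use roughly 694 thousand tokens. Scaling the displayed $0.0002 directly overshoots the reported estimate, which suggests the tool extrapolates from unrounded costs.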

Sample Results completed

8 columns, 1-5 of 3270 rows