Evaluations/Judge Answers w/ GPT-4o
openai-answer-extract
rag_instruct_test.jsonl
OpenAI GPT-4o
is_correct
Are the following two answers equivalent? If the answers contain numeric values, only compare the numbers and not the words. Answer "true" or "false". All lowercase.

Answer 1: {answer}
Answer 2: {prediction}
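
The judge prompt above is a template with two placeholders, `{answer}` and `{prediction}`. A minimal sketch of how such a template might be filled for each row and how the model's reply might be mapped to the `is_correct` column follows. The function names `build_judge_prompt` and `parse_is_correct` are hypothetical, and the actual call to the model is left out; this is an illustration under those assumptions, not the pipeline that produced this run.

```python
from typing import Optional

# Judge prompt copied from the evaluation configuration above.
JUDGE_TEMPLATE = (
    "Are the following two answers equivalent? If the answers contain numeric "
    "values, only compare the numbers and not the words. "
    'Answer "true" or "false". All lowercase.\n'
    "\n"
    "Answer 1: {answer}\n"
    "Answer 2: {prediction}"
)


def build_judge_prompt(answer: str, prediction: str) -> str:
    """Fill the template with one row's reference answer and model prediction."""
    return JUDGE_TEMPLATE.format(answer=answer, prediction=prediction)


def parse_is_correct(reply: str) -> Optional[bool]:
    """Map the judge's reply to a boolean; return None if it is neither true nor false."""
    # Tolerate stray whitespace, quotes, a trailing period, and capitalization.
    cleaned = reply.strip().strip(".\"'").lower()
    if cleaned == "true":
        return True
    if cleaned == "false":
        return False
    return None


prompt = build_judge_prompt("42 apples", "42")
print(prompt)
print(parse_is_correct("true"))
```

Returning `None` for an unexpected reply keeps malformed judge outputs visible instead of silently counting them as incorrect.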
Nov 8, 2024, 12:40 AM UTC
Nov 8, 2024, 12:42 AM UTC
200 rows
200 rows processed, 16504 tokens used ($0.0428)
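
As a rough check on the totals above, the per-row averages can be derived directly from the reported figures. The variable names are illustrative, and the result is only an average over the run; it says nothing about how tokens were split between prompt and completion.

```python
# Figures reported for this run.
rows = 200
total_tokens = 16504
total_cost_usd = 0.0428

# Simple averages over all processed rows.
tokens_per_row = total_tokens / rows
cost_per_row_usd = total_cost_usd / rows

print(f"{tokens_per_row:.2f} tokens per row")   # 82.52 tokens per row
print(f"${cost_per_row_usd:.6f} per row")       # $0.000214 per row
```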
completed