main
ultrachat_200k_test_sft.parquet
text → text
prediction
You are an expert in NLP and conversational analysis. Your task is to evaluate the given conversation based on specific categories and return structured JSON data with predefined options for easier post-processing. ### **Input Format** You will receive a conversation in the following format: ```json [ {"content": "User message", "role": "user"}, {"content": "Assistant response", "role": "assistant"}, ... ] Evaluation Categories Analyze the conversation and categorize it using the predefined values for each dimension. 1️⃣ Primary Topic Choose the most relevant topic from the following 20 options: ["Healthcare", "Finance", "Education", "Technology", "Science", "Politics", "Environment", "Ethics", "Entertainment", "History", "Philosophy", "Psychology", "Sports", "Legal", "Business", "Travel", "Food", "Art", "Literature", "Personal Development"] If the conversation fits multiple topics, choose the most dominant one. 2️⃣ Language Style of the Prompt "Formal" "Informal" "Mixed" 3️⃣ Grammar & Slang in User Input "Perfect" (No mistakes, professional style) "Minor Errors" (Small grammar/spelling mistakes, but understandable) "Major Errors" (Frequent grammar mistakes, difficult to read) "Contains Slang" (Uses informal slang expressions) 4️⃣ Context Awareness "Excellent" (Understands multi-turn context well) "Good" (Mostly keeps context, with minor slips) "Average" (Some loss of context, but overall understandable) "Weak" (Frequently forgets context or contradicts previous responses) "None" (Does not retain context at all) 5️⃣ Logical Progression of Conversation "Strong" (Ideas build logically and naturally) "Moderate" (Mostly logical but with some jumps) "Weak" (Frequent topic shifts or unnatural flow) 6️⃣ Topic Shifts "None" (Stays on the same topic) "Minor" (Small, relevant diversions) "Major" (Significant change in topic mid-conversation) 7️⃣ Type of Instruction Given to Assistant "Content Generation" (User asks for speech, story, or creative writing) "Factual Inquiry" (User requests objective facts, statistics, or comparisons) "Opinion-Seeking" (User asks for subjective opinions or recommendations) "Task-Oriented" (User asks for a structured task, e.g., "Summarize this") "Conversational Engagement" (Casual conversation, open-ended questions) Output Format Return a JSON object with the following structure: { "topic": "Healthcare", "language_style": "Formal", "grammar_slang": "Perfect", "context_awareness": "Excellent", "logical_progression": "Strong", "topic_shifts": "Minor", "instruction_type": "Factual Inquiry" } Instructions Choose only from the predefined options for each category. Ensure consistency in responses for structured post-processing. Do not add additional explanations—only return JSON. Now, evaluate the following conversation: {{messages}}
Mar 16, 2025, 11:23 AM UTC
Mar 16, 2025, 11:24 AM UTC
10 row sample
7092 tokens$ 0.0014
10 rows processed, 7092 tokens used ($0.0014)
Estimated cost for all 23110 rows: $3.18Sample Results completed
4 columns, 1-10 of 23110 rows