datasets
datasets's Repositories
The SMS Spam Collection is a set of tagged SMS messages collected for SMS spam research. It contains 5,574 English SMS messages, each tagged as either ham (legitimate) or spam. The original data can be found here: https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection
A starter repository that highlights some key features to help you get started.
This repository is an example of how to generate synthetic fine-tuning data with random personas. The final output is a set of "prompt", "response" pairs for customer support tickets.
This repository contains 1 million images collected from different sources for running chain-of-thought reasoning.
An Advanced Diagnostic Suite for Entangled Language Hallucination & Visual Illusion in Large Vision-Language Models