Featured Datasets
based on https://huggingface.co/datasets/Norod78/simpsons-blip-captions
240.3 MB
43.1K2
Public
0
4.1 GB
63
0
687 MB
51
Test of a very large graph from an energy-based social network simulation from https://synthasaizer.com/
618.3 MB
11
Public
0
4.5 GB
2210K
Featured Collections
Some of the Oxen team's favorite collections.
Visual LLMs
Datasets for image understanding with large language models.
a collection by datasets
LLM-Feedback
Datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO.
a collection by ox
Multimodal
Datasets that span modalities: combinations of text, image, audio, video, etc.
a collection by ox