Visual LLMs

This collection gathers datasets for image understanding with large language models.

A Dataset for VQA on Document Images.

9.7 GB
117K1
Updated: 2 weeks ago

Evaluating Large Multimodal Models for Integrated Capabilities

70.5 MB
11218
Updated: 2 weeks ago

AI2 Diagrams (AI2D) is a dataset of over 5,000 grade school science diagrams with over 150,000 rich annotations, their ground truth syntactic parses, and more than 15,000 corresponding multiple choice questions.

503.3 MB
3.1K2
Updated: 2 weeks ago

An Advanced Diagnostic Suite for Entangled Language Hallucination & Visual Illusion in Large Vision-Language Models

163.9 MB
115242
Updated: 2 weeks ago

A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

3.4 GB
12K2
Updated: 4 weeks ago

MathVista is a consolidated benchmark for mathematical reasoning in visual contexts.

1.2 GB
6.1K2
Updated: 2 weeks ago

A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

976 MB
21K72
Updated: 2 weeks ago

A Benchmark for Visual Question Answering using World Knowledge.

1.3 GB
225K
Updated: 2 weeks ago

A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

20 GB
8100K
Updated: 2 weeks ago