Qwen / Qwen2 VL 72B Instruct
Released: 9/19/2024
Modalities: image, text
Input: $0.90 / Output: $0.90 (per 1M tokens)
Qwen2 VL 72B Instruct is a multimodal LLM that understands images and videos and generates text from visual inputs. It is particularly adept at complex visual tasks, including long-video comprehension, high-resolution image analysis, and operating devices from visual cues and text instructions.
Other noteworthy features include multilingual text understanding and the ability to process videos up to 20 minutes long.
Metric | Value |
---|---|
Parameter Count | 72 billion |
Mixture of Experts | No |
Context Length | Unknown |
Multilingual | Yes |
Quantized* | No |
*Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers.
Qwen models available on Oxen.ai
| Model | Inference provider | Input modality | Output modality | Input price (per 1M tokens) | Output price (per 1M tokens) |
|---|---|---|---|---|---|
| | Fireworks AI | text | text | $0.90 | $0.90 |
| | Together.ai | text | text | $0.80 | $0.80 |
| | Fireworks AI | image | text | $0.90 | $0.90 |
| | Fireworks AI | text | text | $0.90 | $0.90 |
| | Together.ai | text | text | $1.20 | $1.20 |
| | Fireworks AI | text | text | $0.90 | $0.90 |
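
Because prices are quoted per 1M tokens, the cost of a single request follows from simple arithmetic on its token counts. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of any Oxen.ai or provider API, and it assumes billing is linear in tokens with the $0.90 / $0.90 rates listed for Qwen2 VL 72B Instruct as defaults. How image inputs are converted to billable tokens depends on the provider and is not modeled here.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: str = "0.90",
                  output_price_per_m: str = "0.90") -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Prices are given in USD per 1M tokens, as strings so that Decimal
    parses them exactly. Assumes billing is linear in token count.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_m) / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_m) / TOKENS_PER_MILLION
    return input_cost + output_cost


# Example: a request with 12,000 input tokens and 800 output tokens.
print(estimate_cost(12_000, 800))
```

Passing a different provider's rates (for example `"0.80"` for both arguments) gives a quick way to compare rows of the table for the same workload.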