Llama-4-Maverick-17B-128E-Instruct-FP8

Llama-4-Maverick-17B-128E-Instruct-FP8 is a llama model from llama for assistants, generation and automation tasks

Published

scheduleReleased：April 5, 2025

Llama-4-Maverick-17B-128E-Instruct-FP8 is a large language model from llama, with an approximate context window of 128,000 tokens. It can be evaluated for assistants, knowledge Q&A, content generation, structured extraction and business automation. Pricing and availability may vary by upstream provider or relay service.

starsCapabilities

visibilityVision understandingcodeFunction callingstreamStreaming output

paymentsContext and pricing

Context limit128,000

Max output4,096

Knowledge cutoff2024-08

Input price$0/ 1M tokens

Output price$0/ 1M tokens

descriptionOverview

Overview

Llama-4-Maverick-17B-128E-Instruct-FP8 is a model from llama with model ID llama-4-maverick-17b-128e-instruct-fp8. This wiki entry helps compare its positioning, pricing signals, common use cases and relay availability.

Best for

Use Llama-4-Maverick-17B-128E-Instruct-FP8 as a candidate when comparing context length, input/output pricing, tool support, multimodal capability and real-world latency.

lightbulbUse cases

Assistants and customer support
Content generation and rewriting
Knowledge Q&A and summarization
Structured extraction

thumb_upStrengths

Useful for common language tasks
Easy to compare with related models
Can be evaluated with pricing and relay availability
Good candidate for model selection

infoLimitations

Exact capability depends on provider implementation
Pricing and availability can change
Complex tasks still need evaluation
Auto-generated content should be checked against official sources

linkReferences

open_in_newhttps://www.llama.com/models/