GPT-4o mini

Compact OpenAI multimodal model for high-volume, cost-sensitive workloads

Published

scheduleReleased：July 18, 2024

GPT-4o mini is a lightweight OpenAI model designed for lower-cost, high-throughput applications while keeping useful multimodal and tool-assisted capabilities.

starsCapabilities

visibilityVision understandingcodeFunction callingdata_objectStructured output

paymentsContext and pricing

Context limit128,000

Max output16,384

Knowledge cutoff2023-09

Input price$0.15/ 1M tokens

Output price$0.6/ 1M tokens

Cached input price$0.075/ 1M tokens

descriptionOverview

Overview

GPT-4o mini is a practical choice for products that need many model calls at predictable cost.

Best for

Use it for classification, extraction, lightweight assistants, batch enrichment and latency-sensitive workflows.

lightbulbUse cases

Classification and extraction
High-volume chat
Lightweight assistants
Batch content enrichment

thumb_upStrengths

Lower cost
Good latency profile
Useful multimodal support
OpenAI ecosystem compatibility

infoLimitations

Less capable than flagship models
Not ideal for deep reasoning
May need escalation to stronger models

compare_arrowsAlternative models

Gemini 2.5 FlashFast Gemini model for low-latency multimodal and high-throughput tasks

Qwen TurboQwen Turbo model profile for capabilities, pricing and use cases

Doubao Lite 32KDoubao Lite 32K model profile for capabilities, pricing and use cases

linkReferences

open_in_newhttps://platform.openai.com/docs/models/gpt-4o-mini