OpenAI is making intelligence broadly accessible. Today, they are announcing GPT-4o mini, their most cost-efficient small model. They expect GPT-4o mini will significantly expand AI applications by making intelligence more affordable. GPT-4o mini scores 82% on MMLU and currently outperforms GPT-4 on chat preferences in the LMSYS leaderboard. It costs 15 cents per million input tokens and 60 cents per million output tokens, much cheaper than previous models and over 60% cheaper than GPT-3.5 Turbo.

GPT-4o mini supports a wide range of tasks with its low cost and latency, such as chaining or parallelizing multiple model calls, passing large volumes of context to the model, or providing fast, real-time text responses for customer support chatbots.

Currently, GPT-4o mini supports text and vision in the API, with future support for text, image, video, and audio inputs and outputs. It has a context window of 128K tokens, supports up to 16K output tokens per request, and has knowledge up to October 2023. The improved tokenizer makes handling non-English text more cost-effective.

GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across textual intelligence and multimodal reasoning. It supports the same range of languages as GPT-4o and demonstrates strong performance in function calling and long-context tasks compared to GPT-3.5 Turbo.

GPT-4o mini has been evaluated across several key benchmarks:

Reasoning tasks: GPT-4o mini scores 82.0% on MMLU, better than Gemini Flash (77.9%) and Claude Haiku (73.8%).
Math and coding proficiency: GPT-4o mini scores 87.0% on MGSM and 87.2% on HumanEval, outperforming Gemini Flash and Claude Haiku.
Multimodal reasoning: GPT-4o mini scores 59.4% on MMMU, compared to Gemini Flash (56.1%) and Claude Haiku (50.2%).

They have worked with partners like Ramp and Superhuman, who found GPT-4o mini to perform significantly better than GPT-3.5 Turbo for tasks like extracting structured data from receipt files or generating high-quality email responses.

Safety is built into their models from the beginning. In pre-training, they filter out undesirable information. In post-training, they align the model’s behavior to their policies using techniques like reinforcement learning with human feedback (RLHF). GPT-4o mini has the same safety mitigations as GPT-4o, assessed using their Preparedness Framework and voluntary commitments. Over 70 external experts tested GPT-4o for potential risks, which they have addressed and will share in the forthcoming GPT-4o system card and Preparedness scorecard.

Building on these learnings, they improved GPT-4o mini’s safety using new techniques. It is the first model to apply their instruction hierarchy method, improving its ability to resist jailbreaks and system prompt extractions, making it safer for large-scale applications.

They will continue to monitor GPT-4o mini and improve its safety as they identify new risks.

GPT-4o mini is now available as a text and vision model in the Assistants API, Chat Completions API, and Batch API. Developers pay 15 cents per 1M input tokens and 60 cents per 1M output tokens. Fine-tuning for GPT-4o mini will be available soon.

Our Sponsors

In ChatGPT, Free, Plus, and Team users can access GPT-4o mini starting today, replacing GPT-3.5. Enterprise users will have access next week, in line with their mission to make AI accessible to all.

The cost per token of GPT-4o mini has dropped by 99% since text-davinci-003, introduced in 2022. They are committed to continuing this trajectory of reducing costs while enhancing model capabilities.

They envision a future where models are seamlessly integrated into every app and website. GPT-4o mini is paving the way for developers to build and scale AI applications more efficiently and affordably. The future of AI is becoming more accessible, reliable, and embedded in our daily digital experiences, and they are excited to lead the way.