At IBM’s annual TechXchange event today, the company unveiled Granite 3.0, its most advanced AI model family yet. These third-generation Granite flagship language models can match or exceed the performance of similarly sized models from leading providers on many academic and industry benchmarks, and are built with an emphasis on performance, transparency, and safety.
In line with the company’s dedication to open-source AI, the Granite models are available under the permissive Apache 2.0 license. This release sets them apart by delivering a distinctive blend of performance, flexibility, and autonomy for both enterprise clients and the broader community.
IBM’s Granite 3.0 family includes:
• General Purpose/Language: Granite 3.0 8B Instruct, Granite 3.0 2B Instruct, Granite 3.0 8B Base, Granite 3.0 2B Base
• Guardrails & Safety: Granite Guardian 3.0 8B, Granite Guardian 3.0 2B
• Mixture-of-Experts: Granite 3.0 3B-A800M Instruct, Granite 3.0 1B-A400M Instruct, Granite 3.0 3B-A800M Base, Granite 3.0 1B-A400M Base
The new Granite 3.0 8B and 2B language models serve as practical solutions for enterprise AI, providing robust capabilities for tasks like Retrieval Augmented Generation (RAG), classification, summarization, entity extraction, and tool utilization. These compact and versatile models are tailored for fine-tuning with enterprise data, allowing for smooth integration across various business environments and workflows.
Although many large language models (LLMs) are trained on publicly available data, most enterprise data remains untapped. By combining a smaller Granite model with enterprise data, especially through the InstructLab alignment technique introduced by IBM and Red Hat in May, businesses can obtain task-specific performance comparable to larger models at significantly lower cost (shown to be 3x-23x less expensive than leading models in various initial proofs-of-concept).
The Granite 3.0 release reaffirms IBM’s dedication to fostering transparency, safety, and trust in AI products. The technical report and responsible use guide for Granite 3.0 outline the datasets employed in training these models, explain the filtering, cleansing, and curation processes undertaken, and present detailed results of model performance on key academic and enterprise benchmarks.
Critically, IBM provides an IP indemnity for all Granite models on watsonx.ai so enterprise clients can be more confident in merging their data with the models.
Raising the bar: Granite 3.0 benchmarks
The Granite 3.0 language models also demonstrate promising results on raw performance.
According to the standard academic benchmarks used by Hugging Face’s Open LLM Leaderboard, the Granite 3.0 8B Instruct model consistently outperforms similar-sized open-source models from Meta and Mistral. Additionally, it excels in all safety dimensions measured by IBM’s AttaQ safety benchmark, surpassing models from both Meta and Mistral.
Across the core enterprise tasks of RAG, tool use, and tasks in the cybersecurity domain, the Granite 3.0 8B Instruct model shows leading performance on average compared to similar-sized open-source models from Mistral and Meta. [3]
The Granite 3.0 models underwent training on more than 12 trillion tokens sourced from 12 distinct natural languages and 116 programming languages. This was accomplished using an innovative two-stage training approach, incorporating findings from thousands of experiments aimed at enhancing data quality, selection, and training parameters. By year’s end, the 3.0 8B and 2B language models are anticipated to support an expanded 128K context window and capabilities for multi-modal document understanding.
IBM’s Granite Mixture of Experts (MoE) architecture models, Granite 3.0 1B-A400M and Granite 3.0 3B-A800M, offer a strong balance of performance and inference cost. They are smaller, lightweight models that can be deployed for low-latency applications as well as CPU-based deployments.
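As a small illustration of how the MoE model names above can be read, the sketch below assumes that a size tag such as 3B-A800M gives the total parameter count followed by the approximate number of parameters active per token. That reading is an assumption not spelled out in this announcement, and `parse_moe_name` is a hypothetical helper, not part of any IBM library.

```python
import re

# Multipliers for the size suffixes used in the model names.
_UNITS = {"M": 1_000_000, "B": 1_000_000_000}

# Assumed pattern: <total><unit>-A<active><unit>, e.g. "3B-A800M".
_PATTERN = re.compile(r"(\d+(?:\.\d+)?)([MB])-A(\d+(?:\.\d+)?)([MB])")


def parse_moe_name(size_tag: str) -> dict:
    """Hypothetical helper: read total and active parameter counts from a size tag."""
    match = _PATTERN.fullmatch(size_tag)
    if match is None:
        raise ValueError(f"unrecognized size tag: {size_tag!r}")
    total = float(match.group(1)) * _UNITS[match.group(2)]
    active = float(match.group(3)) * _UNITS[match.group(4)]
    # Under the stated assumption, only this fraction of the weights is used
    # for each token, which is why inference cost tracks the active count.
    return {"total": total, "active": active, "active_fraction": active / total}


if __name__ == "__main__":
    for tag in ("1B-A400M", "3B-A800M"):
        info = parse_moe_name(tag)
        print(f"{tag}: {info['active_fraction']:.0%} of parameters active per token")
```

Under that assumption, per-token compute scales with the active parameter count rather than the total, which would fit the low-latency and CPU-based use cases mentioned above.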
IBM is announcing an upgraded version of its pre-trained Granite Time Series models, originally released earlier this year. These new models are trained on three times the amount of data and excel in performance across all major time series benchmarks, surpassing models from Google, Alibaba, and others that are ten times larger. Additionally, the updated models offer enhanced modeling flexibility, including support for external variables and rolling forecasts.
Introducing Granite Guardian 3.0: ushering in the next era of responsible AI
As part of this release, IBM is also introducing a new family of Granite Guardian models that permit application developers to implement safety guardrails by checking user prompts and LLM responses for a variety of risks. The Granite Guardian 3.0 8B and 2B models provide the most comprehensive set of risk and harm detection capabilities available in the market today.
In addition to harm dimensions such as social bias, hate, toxicity, profanity, violence, jailbreaking and more, these models also provide a range of unique RAG-specific checks such as groundedness, context relevance, and answer relevance. In extensive testing across 19 safety and RAG benchmarks, the Granite Guardian 3.0 8B model has higher overall accuracy on harm detection on average than all three generations of Llama Guard models from Meta. On average, its overall performance in hallucination detection was also on par with the specialized hallucination detection models WeCheck and MiniCheck. [5]
While the Granite Guardian models are derived from the corresponding Granite language models, they can be used to implement guardrails alongside any open or proprietary AI models.
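The guardrail pattern described above, checking both the user prompt and the model response regardless of which model produces it, can be sketched roughly as follows. This is a minimal illustration rather than IBM’s implementation: `guarded_generate`, `toy_detector`, and `toy_model` are hypothetical names, and the keyword check merely stands in for a call to a guardian model.

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that request."


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    is_risky: Callable[[str], bool],
) -> str:
    """Screen the prompt, call the model, then screen the response."""
    if is_risky(prompt):
        return REFUSAL  # block before the language model is ever called
    response = generate(prompt)
    if is_risky(response):
        return REFUSAL  # block a risky answer before it reaches the user
    return response


# Toy stand-ins (hypothetical): a real deployment would call a guardian
# model and a language model here instead.
def toy_detector(text: str) -> bool:
    return "forbidden" in text.lower()


def toy_model(prompt: str) -> str:
    return f"Echo: {prompt}"


if __name__ == "__main__":
    print(guarded_generate("Hello there", toy_model, toy_detector))
    print(guarded_generate("Tell me the forbidden thing", toy_model, toy_detector))
```

Because the detector and the generator are passed in as plain callables, the same wrapper works with any open or proprietary model behind `generate`.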
Availability of Granite 3.0 models
The entire suite of Granite 3.0 models and the updated time series models are available for download on Hugging Face under the permissive Apache 2.0 license. The instruct variants of the new Granite 3.0 8B and 2B language models and the Granite Guardian 3.0 8B and 2B models are available today for commercial use on IBM’s watsonx platform. A selection of the Granite 3.0 models will also be available as NVIDIA NIM microservices and through Google Cloud’s Vertex AI Model Garden integrations with Hugging Face.
To give developers choice and ease of use, and to support local and edge deployments, a curated set of the Granite 3.0 models is also available on Ollama and Replicate.
The latest generation of Granite models expands IBM’s robust open-source catalog of powerful LLMs. IBM has collaborated with ecosystem partners like AWS, Docker, Domo, Qualcomm Technologies, Inc. via its Qualcomm® AI Hub, Salesforce, SAP, and others to integrate a variety of Granite models into these partners’ offerings or make Granite models available on their platforms, offering greater choice to enterprises across the world.
Assistants to Agents: realizing the future for enterprise AI
IBM is advancing enterprise AI through a spectrum of technologies – from models and assistants, to the tools needed to tune and deploy AI specifically for companies’ unique data and use-cases. IBM is also paving the way for future AI agents that can self-direct, reflect, and perform complex tasks in dynamic business environments.
IBM continues to evolve its portfolio of AI assistant technologies – from watsonx Orchestrate to help companies build their own assistants via low-code tooling and automation, to a wide set of pre-built assistants for specific tasks and domains such as customer service, human resources, sales, and marketing. Organizations around the world have used watsonx Assistant to help them build AI assistants for tasks like answering routine questions from customers or employees, modernizing their mainframes and legacy IT applications, helping students explore potential career paths, or providing digital mortgage support for home buyers.
Today IBM also unveiled the upcoming release of the next generation of watsonx Code Assistant, powered by Granite code models, to offer general-purpose coding assistance across languages like C, C++, Go, Java, and Python, with advanced application modernization capabilities for Enterprise Java Applications. [6] Granite’s code capabilities are also now accessible through a Visual Studio Code extension, IBM Granite.Code.
IBM also plans to release new tools to help developers build, customize and deploy AI more efficiently via watsonx.ai – including agentic frameworks, integrations with existing environments and low-code automations for common use-cases like RAG and agents. [7]
IBM is focused on developing AI agent technologies that are capable of greater autonomy, sophisticated reasoning and multi-step problem solving. The initial release of the Granite 3.0 8B model features support for key agentic capabilities, such as advanced reasoning and a highly structured chat template and prompting style for implementing tool use workflows. IBM also plans to introduce a new AI agent chat feature to IBM watsonx Orchestrate, which uses agentic capabilities to orchestrate AI Assistants, skills, and automations that help users increase productivity across their teams. [8] IBM plans to continue building agent capabilities across its portfolio in 2025, including pre-built agents for specific domains and use-cases.
Expanded AI-powered delivery platform to supercharge IBM consultants with AI
IBM is also announcing a major expansion of its AI-powered delivery platform, IBM Consulting Advantage. The multi-model platform contains AI agents, applications, and methods like repeatable frameworks that can empower 160,000 IBM consultants to deliver better and faster client value at a lower cost.
As part of the expansion, Granite 3.0 language models will become the default model in Consulting Advantage. Leveraging Granite’s performance and efficiency, IBM Consulting will be able to help maximize the return-on-investment for the generative AI projects of IBM clients.
Another key part of the expansion is the introduction of IBM Consulting Advantage for Cloud Transformation and Management and IBM Consulting Advantage for Business Operations. Each includes domain-specific AI agents, applications, and methods infused with IBM’s best practices, so IBM consultants can help accelerate client cloud and AI transformations in tasks like code modernization and quality engineering, or transform and execute operations across domains like finance, HR, and procurement.