Kenya, 7 April 2026 - Tech giant Google has introduced Gemma 4, the company’s latest family of artificial intelligence models.
According to the company, the family, which it described as its “most intelligent open models to date,” comes in four sizes: Effective 2B, Effective 4B, 26B Mixture of Experts, and 31B Dense.
Google says the models can independently handle complex logic and agentic workflows by interacting with tools and software.
The models were engineered using the same research used to build Gemini 3 and can run on various hardware, including mobile phones, laptops, developer workstations, and accelerators.
The models were launched under the Apache-2.0 license, which allows developers to freely use, modify, and deploy the models across different environments, a move aimed at boosting adoption.
“This breakthrough builds on incredible community momentum: since the launch of our first generation, developers have downloaded Gemma over 400 million times, building a vibrant Gemmaverse of more than 100,000 variants,” Google said.
“We listened closely to what innovators need next to push the boundaries of AI, and Gemma 4 is our answer: breakthrough capabilities made widely accessible under an Apache 2.0 license,” it added.
A key highlight of Gemma 4 is that the E2B and E4B models can run directly on devices such as smartphones, laptops, and Internet of Things (IoT) hardware without requiring constant internet access.
According to the tech company, these models can support text, images, video, and audio, with minimal delay and reduced battery usage.
By contrast, the 26B Mixture of Experts and 31B Dense models in Gemma 4 are primarily built for higher-end computing devices rather than mobile hardware.
Google says that these models are engineered to run on powerful systems such as developer workstations, high-performance desktop PCs, and servers equipped with advanced GPUs like the NVIDIA H100.
“At the edge, our E2B and E4B models redefine on-device utility, prioritizing multimodal capabilities, low-latency processing, and seamless ecosystem integration over raw parameter count,” Google said.
The launch comes as tech companies race to deliver large language models that are more efficient and can handle complex tasks.

