Gemma 4
Developed by Google DeepMind
Key Capabilities
- Agentic workflows with native function calling and app navigation
- Multimodal reasoning across text, audio, and visual inputs
- Support for 140 languages — the broadest multilingual coverage among open models
- Edge deployment on mobile, IoT, Raspberry Pi, Jetson Nano (E2B/E4B)
- Strong benchmark results: Arena AI 1452, AIME 2026 89.2%, LiveCodeBench 80.0%, GPQA-Diamond 84.3%
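The agentic function-calling flow above can be sketched as a simple tool-call round trip. This is a hypothetical illustration using the common OpenAI-style `tools` schema that many Gemma-serving stacks accept; the tool name (`get_weather`), its arguments, and the exact field names are assumptions, not an official Gemma 4 API.

```python
import json

# Tool definition the model would see (illustrative schema, not Gemma-specific).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Local implementations the agent loop dispatches to.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub; a real agent would call a weather API

REGISTRY = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Run the function the model requested and return its result as text."""
    fn = REGISTRY[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return fn(**args)

# Simulated model output requesting a tool call.
call = {"name": "get_weather", "arguments": json.dumps({"city": "Austin"})}
print(dispatch(call))  # Sunny in Austin
```

In a real agent loop, the dispatcher's return value would be appended to the conversation as a tool message so the model can compose its final answer.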
VRAM Requirements by Quantization
Choose the right GPU based on your performance and quality needs.
| Model / Quantization | Approx. VRAM Required |
|---|---|
| E2B FP16 | 4GB |
| E4B FP16 | 8GB |
| 26B FP16 | 52GB |
| 26B Q4 | 16GB |
| 31B FP16 | 62GB |
| 31B Q4 | 18GB |
Use Cases
Gemma 4 (E2B, E4B, 26B, 31B) can be deployed for enterprise AI applications including document processing, code generation, data analysis, and conversational AI. License: Gemma Terms of Use (permissive, commercial use allowed).
Run Gemma 4 with Petronella
PTG deploys Gemma 4 for organizations needing frontier multimodal AI with agentic capabilities. The 31B flagship delivers exceptional intelligence per parameter, while the E2B/E4B variants enable real-time edge AI on mobile and IoT devices — ideal for air-gapped CMMC environments.
Recommended Hardware
| Model Size | Recommended GPU |
|---|---|
| E2B | Any GPU with 4GB+ VRAM or CPU-only |
| E4B | Any GPU with 8GB+ VRAM |
| 26B | RTX 5090 (32GB) or RTX PRO 5000 (48GB) |
| 31B | RTX 5090 (32GB) or RTX PRO 6000 (96GB) |
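The selection logic behind the table can be sketched as picking the smallest listed GPU whose VRAM covers the model's footprint. The GPU list and VRAM figures come from the tables above; this is an illustrative helper, not definitive sizing advice.

```python
# Listed GPUs sorted ascending by VRAM (GB), taken from the hardware table.
GPUS = [("RTX 5090", 32), ("RTX PRO 5000", 48), ("RTX PRO 6000", 96)]

def pick_gpu(required_gb: int) -> str:
    """Return the smallest listed GPU that fits the required VRAM."""
    for name, vram in GPUS:
        if vram >= required_gb:
            return name
    raise ValueError("no single listed GPU fits; consider quantizing or multi-GPU")

print(pick_gpu(16))  # RTX 5090 -> 26B at Q4
print(pick_gpu(62))  # RTX PRO 6000 -> 31B at FP16
```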
Deploy Gemma 4 On-Premises
Our team builds GPU-accelerated systems configured and optimized for Gemma 4. Private, secure, and fully under your control.