DeepSeek V3
Developed by DeepSeek
Key Capabilities
- GPT-4 class performance from an open-weight model
- Efficient Mixture-of-Experts (MoE) design: only 37B of 671B total parameters active per token
- 128K context window
- Strong math, coding, and Chinese-language capabilities
- Multi-head Latent Attention (MLA) reduces KV cache size by roughly 93%
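The MoE efficiency claim above can be illustrated with a toy top-k router. The expert counts used here (256 routed experts, 8 active per token) are assumptions based on public descriptions of DeepSeek V3, and the random scoring stands in for a learned gating network; this is a sketch of the selection step, not DeepSeek's implementation.

```python
import random

def route_token(num_experts=256, top_k=8):
    # Toy router: score every expert, keep only the top-k.
    # A real MoE layer uses a learned gating network; random
    # scores stand in for it here purely to show the selection.
    scores = [random.random() for _ in range(num_experts)]
    return sorted(range(num_experts), key=scores.__getitem__, reverse=True)[:top_k]

active = route_token()
print(len(active), "of 256 experts active")  # prints "8 of 256 experts active"
```

Because only the chosen experts' weights participate in each token's forward pass, per-token compute scales with the 37B active parameters rather than the full 671B.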
VRAM Requirements by Quantization
Choose a GPU configuration based on the quality/VRAM trade-off of each quantization level.
| Model / Quantization | VRAM Required |
|---|---|
| Full FP16 (unquantized) | 1.3 TB+ |
| Q4 | 350 GB |
| Q2 | 170 GB |
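As a rough sanity check on the table, the weights-only footprint can be estimated as parameter count times bits per weight. The bits-per-weight values below are simplifications (real quantization formats mix precisions), and actual deployments need extra headroom for KV cache, activations, and framework overhead.

```python
def weights_vram_gb(params_billion, bits_per_weight):
    # Weights-only footprint in GB: params * bits / 8 bytes.
    # Ignores KV cache, activations, and runtime overhead.
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for name, bits in [("FP16", 16), ("Q4", 4.0), ("Q2", 2.0)]:
    print(f"{name}: ~{weights_vram_gb(671, bits):.0f} GB")
# FP16: ~1342 GB   Q4: ~336 GB   Q2: ~168 GB
```

The estimates land close to the table's figures, which is why halving the bit width roughly halves the required VRAM.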
Use Cases
DeepSeek V3 (671B total parameters, 37B active per token via MoE) can be deployed for enterprise AI applications including document processing, code generation, data analysis, and conversational AI. It is released under the MIT License.
Run DeepSeek V3 with Petronella
PTG builds DeepSeek V3 inference clusters that deliver GPT-4 class performance with zero API costs. The MIT license and MoE efficiency make it the most cost-effective frontier model to self-host.
Recommended Hardware
| Quantization | Recommended GPU |
|---|---|
| Full FP16 | DGX B300 (2.3 TB HBM3e) or multi-node cluster |
| Q4 | DGX Station GB300 (384 GB) or 4x RTX PRO 6000 (384 GB) |
| Q2 | DGX Spark (128 GB) or 2x RTX PRO 6000 (192 GB) |
Deploy DeepSeek V3 On-Premises
Our team builds GPU-accelerated systems configured and optimized for DeepSeek V3. Private, secure, and fully under your control.