Open-Source AI Model

DeepSeek V3

Developed by DeepSeek

Local AI Deployment Experts · 24+ Years IT Infrastructure · GPU Hardware In Stock

Key Capabilities

  • GPT-4-class performance from an open-weight model
  • Highly efficient Mixture-of-Experts (MoE) design: only 37B of 671B parameters active per token
  • 128K context window
  • Strong math, coding, and Chinese language capabilities
  • Multi-head Latent Attention reduces KV cache by 93%

VRAM Requirements by Quantization

Choose the right GPU based on your performance and quality needs.

Model / Quantization    VRAM Required
Full FP16               1.3 TB+
Q4                      ~350 GB
Q2                      ~170 GB
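These figures line up with a back-of-envelope estimate: weight memory ≈ parameter count × bits per weight ÷ 8, plus a small margin for runtime buffers. A minimal sketch (the 5% overhead factor is an illustrative assumption, not a measured value, and it covers weights only, not the KV cache or activations):

```python
def estimate_weight_vram_gb(total_params_billions: float,
                            bits_per_weight: float,
                            overhead: float = 1.05) -> float:
    """Rough VRAM for model weights alone (no KV cache or activations).

    overhead=1.05 is an assumed ~5% margin for runtime buffers.
    """
    bytes_per_param = bits_per_weight / 8
    return total_params_billions * bytes_per_param * overhead

# DeepSeek V3: 671B total parameters
for label, bits in [("FP16", 16), ("Q4", 4), ("Q2", 2)]:
    print(f"{label}: ~{estimate_weight_vram_gb(671, bits):,.0f} GB")
# FP16 ≈ 1.4 TB, Q4 ≈ 350 GB, Q2 ≈ 175 GB -- consistent with the table above
```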

Use Cases

DeepSeek V3 (671B total parameters, 37B active per token via MoE) can be deployed for enterprise AI applications including document processing, code generation, data analysis, and conversational AI. It is released under the MIT License.
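In practice, these use cases are usually driven through an OpenAI-compatible HTTP API, which common self-hosting servers such as vLLM and llama.cpp's server expose at /v1/chat/completions. A minimal sketch of building such a request; the URL, port, and registered model name are assumptions that depend on your server configuration:

```python
import json

# Hypothetical local endpoint -- vLLM and llama.cpp's server both expose an
# OpenAI-compatible /v1/chat/completions route (host/port are assumptions).
BASE_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "deepseek-v3",  # model name as registered with your server (assumption)
    "messages": [
        {"role": "system", "content": "You are a document-processing assistant."},
        {"role": "user", "content": "Summarize this contract in three bullets."},
    ],
    "temperature": 0.2,
    "max_tokens": 512,
}

body = json.dumps(payload).encode()
# Send with any HTTP client once the server is running, e.g.:
# req = urllib.request.Request(BASE_URL, data=body,
#                              headers={"Content-Type": "application/json"})
# resp = json.load(urllib.request.urlopen(req))
```

Because the API is OpenAI-compatible, existing client libraries can be pointed at the local endpoint without code changes beyond the base URL.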

Run DeepSeek V3 with Petronella

PTG builds DeepSeek V3 inference clusters that deliver GPT-4 class performance with zero API costs. The MIT license and MoE efficiency make it the most cost-effective frontier model to self-host.

Recommended Hardware

Model Size      Recommended GPU
Full FP16       DGX B300 (2.3TB HBM3e) or multi-node cluster
Q4 quantized    DGX Station GB300 (384GB) or 4x RTX PRO 6000 (384GB)
Q2 quantized    DGX Spark (128GB) or 2x RTX PRO 6000 (192GB)
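A quick way to sanity-check these pairings is to compare the quantized weight footprint against pooled VRAM while reserving some headroom for the KV cache and runtime buffers. A sketch, assuming the VRAM figures from the tables above (the 5% headroom figure is an assumption; real headroom depends on context length and batch size):

```python
def fits(model_gb: float, num_gpus: int, vram_per_gpu_gb: float,
         headroom: float = 0.05) -> bool:
    """True if the model weights fit in pooled VRAM with `headroom` reserved."""
    usable = num_gpus * vram_per_gpu_gb * (1 - headroom)
    return model_gb <= usable

print(fits(350, 4, 96))    # Q4 on 4x RTX PRO 6000 (96 GB each) -> True
print(fits(170, 2, 96))    # Q2 on 2x RTX PRO 6000 -> True
print(fits(350, 2, 96))    # Q4 on only 2 GPUs -> False
```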

Deploy DeepSeek V3 On-Premises

Our team builds GPU-accelerated systems configured and optimized for DeepSeek V3. Private, secure, and fully under your control.