Cloud Repatriation: Why We Moved Back On-Premise (Case Study)
Posted: March 11, 2026 to Technology.
Cloud repatriation is the process of migrating workloads, data, and applications from public cloud infrastructure (AWS, Azure, GCP) back to on-premise or colocation facilities. Once considered a fringe decision, cloud repatriation has become a strategic choice for organizations that have measured their actual cloud costs and found them 2 to 5 times higher than equivalent on-premise infrastructure. Gartner's 2025 Infrastructure Report found that 42% of organizations with significant cloud spending have repatriated at least one workload, up from 27% in 2023.
Key Takeaways
- Our cloud-to-on-premise migration reduced infrastructure costs by 62% annually, from $186,000 per year in AWS/Azure to $71,000 (including hardware amortization, power, and colocation)
- Performance improved substantially for AI inference and database workloads (2.3x inference throughput, 22x lower P99 query latency) due to eliminated network latency and dedicated hardware resources
- Compliance benefits were immediate: HIPAA and CMMC auditors prefer infrastructure you physically control over cloud configurations that require extensive documentation of shared responsibility
- The migration took 8 weeks from planning to full cutover for 47 production workloads
- Cloud repatriation is not all-or-nothing; the optimal strategy for most businesses is a hybrid model where predictable, high-compute workloads run on-premise while burst capacity stays in the cloud
The Cloud Promise vs The Cloud Reality
Cloud computing delivered on many of its promises. Elastic scaling, global availability, managed services, and zero upfront capital expenditure transformed how businesses deploy technology. For startups and unpredictable workloads, the cloud remains the right choice.
But for established businesses with predictable workloads, the math has changed.
When we first moved to the cloud in 2018, our monthly AWS and Azure bill was $4,200 for a small set of virtual machines, databases, and storage. By 2024, that number had grown to $15,500 per month, driven by:
- Data transfer costs that were difficult to predict and impossible to negotiate
- Reserved instance pricing that still exceeded on-premise equivalents
- Storage costs that grew linearly with data retention (HIPAA requires compliance documentation to be retained for 6 years)
- Managed service premiums for databases, monitoring, and backups that we could run ourselves
- GPU instance costs for AI workloads that made private AI development prohibitively expensive
The total: $186,000 per year, for infrastructure that we could replicate on-premise for approximately $71,000 annually, including hardware depreciation.
Our Environment Before Repatriation
| Workload Category | Cloud Service | Monthly Cost | Instance Type |
|---|---|---|---|
| Web servers (6) | AWS EC2 | $2,400 | t3.xlarge |
| Application servers (4) | AWS EC2 | $1,800 | m6i.xlarge |
| Databases (3) | AWS RDS | $3,200 | db.r6g.xlarge |
| AI inference (2) | AWS EC2 | $4,800 | g5.4xlarge (A10G GPU) |
| Storage (40 TB) | AWS S3 + EBS | $1,600 | gp3, S3 Standard |
| Monitoring/logging | AWS CloudWatch + ELK | $800 | Various |
| Backup/DR | AWS Backup + S3 | $600 | S3 IA |
| Data transfer | Various | $300 | Egress fees |
| Total | | $15,500/month ($186,000/year) | |
In total, the environment comprised 47 production workloads, 19 development/staging workloads, and 4 database clusters, with 40 TB of stored data growing at 500 GB per month.
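As a quick sanity check on the figures above, the following minimal sketch sums the monthly line items from the table and projects storage growth. The function names are illustrative, and the projection assumes simple linear growth with 1 TB = 1,000 GB, which is an assumption rather than something measured.

```python
# Monthly costs copied from the table above (USD per month).
MONTHLY_COSTS = {
    "Web servers": 2400,
    "Application servers": 1800,
    "Databases": 3200,
    "AI inference": 4800,
    "Storage": 1600,
    "Monitoring/logging": 800,
    "Backup/DR": 600,
    "Data transfer": 300,
}


def monthly_total(costs):
    """Sum all monthly line items."""
    return sum(costs.values())


def projected_storage_tb(start_tb, growth_gb_per_month, months):
    """Linear projection; assumes 1 TB = 1,000 GB and constant growth."""
    return start_tb + growth_gb_per_month * months / 1000


total = monthly_total(MONTHLY_COSTS)
print(f"Monthly: ${total:,}  Annual: ${total * 12:,}")
print(f"Storage after 5 years: {projected_storage_tb(40, 500, 60)} TB")
```

At the stated growth rate, the 40 TB footprint reaches roughly 70 TB after five years, which is why storage pricing matters more each year.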
The Repatriation Decision
Cost Analysis
We built a detailed 5-year total cost of ownership (TCO) comparison:
| Cost Category | Cloud (5-Year) | On-Premise (5-Year) | Savings |
|---|---|---|---|
| Compute (servers) | $540,000 | $120,000 (hardware, 3-year depreciation) | $420,000 |
| Storage | $96,000 | $25,000 (drives, expansion) | $71,000 |
| Networking | $18,000 (egress) | $5,000 (switches, cables) | $13,000 |
| GPU/AI | $288,000 | $30,000 (RTX 5090 x 4) | $258,000 |
| Colocation/power | $0 | $108,000 ($1,800/month) | -$108,000 |
| Management labor | $50,000 (cloud admin) | $75,000 (on-premise admin) | -$25,000 |
| Managed services | $48,000 (RDS, managed Kubernetes) | $0 (self-managed) | $48,000 |
| 5-Year Total | $1,040,000 | $363,000 | $677,000 (65%) |
The five-year savings of $677,000 justified the migration effort. Even in year one, after paying the $95,000 upfront hardware cost, we still came out roughly $20,000 ahead ($166,040 versus $186,000 in the cloud).
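The totals in the TCO table can be reproduced with a few lines of arithmetic. This is a minimal sketch using the figures from the table; the dictionary layout and function name are illustrative choices, not part of our actual tooling.

```python
# (cloud, on_premise) five-year costs in USD, copied from the TCO table.
TCO = {
    "Compute": (540_000, 120_000),
    "Storage": (96_000, 25_000),
    "Networking": (18_000, 5_000),
    "GPU/AI": (288_000, 30_000),
    "Colocation/power": (0, 108_000),
    "Management labor": (50_000, 75_000),
    "Managed services": (48_000, 0),
}


def tco_summary(rows):
    """Return (cloud_total, onprem_total, savings, savings_percent)."""
    cloud = sum(c for c, _ in rows.values())
    onprem = sum(o for _, o in rows.values())
    savings = cloud - onprem
    return cloud, onprem, savings, round(100 * savings / cloud)


cloud, onprem, savings, pct = tco_summary(TCO)
print(f"Cloud ${cloud:,} vs on-premise ${onprem:,}: saves ${savings:,} ({pct}%)")
```

Keeping each category as a separate row makes it easy to swap in your own numbers and see which line items drive the result. In our case compute and GPU dominate the savings, while colocation and labor work against them.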
Performance Analysis
Cloud instances share physical hardware with other tenants, and "noisy neighbor" effects can cause unpredictable performance variability. Our benchmarks showed:
| Metric | AWS RDS (r6g.xlarge) | On-Premise (Epyc 9004 + NVMe) | Improvement |
|---|---|---|---|
| Read IOPS | 16,000 | 450,000 | 28x |
| Write IOPS | 8,000 | 200,000 | 25x |
| P99 query latency | 45ms | 2ms | 22x |
| AI inference (70B model) | 18 tok/s (A10G) | 42 tok/s (RTX 5090) | 2.3x |
| Network throughput (internal) | 5 Gbps | 25 Gbps | 5x |
These are not marginal improvements. For database-heavy workloads, on-premise hardware delivered more than an order of magnitude better performance, and AI inference throughput more than doubled.
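The improvement column is a simple ratio, but the direction depends on the metric: for throughput-style metrics higher is better, while for latency lower is better. A minimal sketch of that calculation, with an illustrative helper name:

```python
def improvement(cloud, onprem, lower_is_better=False):
    """Ratio by which on-premise beats cloud for a single metric."""
    return cloud / onprem if lower_is_better else onprem / cloud


# Values copied from the benchmark table above.
benchmarks = [
    ("Read IOPS", 16_000, 450_000, False),
    ("Write IOPS", 8_000, 200_000, False),
    ("P99 query latency (ms)", 45, 2, True),
    ("AI inference (tok/s)", 18, 42, False),
    ("Network throughput (Gbps)", 5, 25, False),
]

for name, cloud, onprem, lower in benchmarks:
    print(f"{name}: {improvement(cloud, onprem, lower):.1f}x")
```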
Compliance Analysis
For HIPAA and CMMC compliance, cloud infrastructure creates a shared responsibility model where you must document and verify the cloud provider's controls alongside your own. This substantially enlarges the compliance surface area.
On-premise simplifies compliance:
- You control physical access (HIPAA Physical Safeguards)
- You control encryption keys (not a cloud KMS)
- You control data residency (no questions about which region, which data center)
- You control network segmentation (physical VLANs, not cloud security groups)
- Auditors can physically inspect your infrastructure
For defense contractors subject to CMMC, on-premise CUI processing environments are significantly easier to scope and assess than cloud deployments that require FedRAMP Moderate authorized services. See our CMMC compliance guide for details.
The Migration Process
Week 1-2: Planning and Procurement
Hardware procurement:
- 3x Dell PowerEdge R760 (dual Xeon Gold 6458Q, 512 GB RAM, 8x 3.84 TB NVMe)
- 1x AI server (AMD Epyc 9554 + 4x RTX 5090, 256 GB RAM, 4 TB NVMe)
- 2x Dell PowerSwitch S5248F-ON (25GbE, 100GbE uplinks)
- 1x pfSense firewall appliance
- UPS, PDUs, cabling
Colocation setup:
- 2 racks at a local colocation facility (Raleigh, NC)
- Dual power feeds (A+B redundancy)
- 1 Gbps dedicated internet, 25 Gbps internal cross-connects
- Physical access controls (badge, biometric, camera)
Total hardware cost: $95,000. This replaces $186,000 per year in cloud costs.
Software stack:
- Proxmox VE for virtualization (free, replacing VMware-on-cloud)
- Ceph for software-defined storage (free, replacing AWS EBS/S3)
- Proxmox Backup Server for backups (free, replacing AWS Backup + Veeam)
- PostgreSQL for databases (free, replacing AWS RDS)
- Ollama + vLLM for AI inference (free, replacing AWS GPU instances)
- Prometheus + Grafana for monitoring (free, replacing CloudWatch)
Week 3-4: Infrastructure Build
- Rack and cable hardware at the colocation facility
- Install Proxmox VE on all compute nodes
- Configure Ceph distributed storage across 3 nodes
- Set up network segmentation (VLANs for production, management, backup, AI)
- Configure pfSense firewall with IDS/IPS
- Deploy Proxmox Backup Server
- Set up monitoring stack (Prometheus, Grafana, Alertmanager)
Week 5-6: Workload Migration
We migrated workloads in priority order, running each in parallel on both cloud and on-premise for 48-72 hours before cutting over:
Batch 1 (Week 5): Non-critical workloads
- Development/staging environments (19 VMs): migrated first as low-risk practice
- Internal tools (wiki, project management, monitoring): low business impact if issues arise
Batch 2 (Week 5-6): Production workloads
- Web servers: converted AMIs to QCOW2, imported to Proxmox, verified functionality
- Application servers: containerized applications simplified migration (Docker images are cloud-agnostic)
- AI inference: migrated from GPU cloud instances to on-premise RTX 5090 servers
Batch 3 (Week 6): Databases (highest risk)
- PostgreSQL: pg_dump on RDS, pg_restore on local PostgreSQL, logical replication for cutover
- Redis: snapshot export/import
- Elasticsearch: snapshot to shared filesystem, restore on local cluster
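For the PostgreSQL step, the dump and restore commands can be generated from a small script so that every database is handled the same way. The sketch below only builds and prints the command lines; it does not run them and does not cover the logical replication used for final cutover. The hostnames, user, and database name are hypothetical placeholders.

```python
import shlex


def dump_command(host, user, dbname, outfile):
    """Build a pg_dump command using the custom archive format."""
    return [
        "pg_dump", "--host", host, "--username", user,
        "--dbname", dbname, "--format=custom", "--file", outfile,
    ]


def restore_command(host, user, dbname, infile, jobs=4):
    """Build a pg_restore command; --jobs restores in parallel."""
    return [
        "pg_restore", "--host", host, "--username", user,
        "--dbname", dbname, "--no-owner", "--jobs", str(jobs), infile,
    ]


# Hypothetical endpoints, for illustration only.
dump = dump_command("mydb.example.rds.amazonaws.com", "migrator", "appdb", "appdb.dump")
restore = restore_command("pg01.internal.example", "migrator", "appdb", "appdb.dump")
print(shlex.join(dump))
print(shlex.join(restore))
```

The custom format is chosen here because it allows parallel restore with `--jobs`; `--no-owner` avoids failures when role names on the source do not exist on the target.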
Week 7: Testing and Validation
- Full application testing on on-premise infrastructure
- Performance benchmarking against cloud baselines
- Failover testing: simulated node failures, verified HA recovery
- Backup and restore testing: full restore from Proxmox Backup Server
- Security testing: vulnerability scan, penetration test of new perimeter
Week 8: Cutover and Cloud Decommission
- DNS updates to point to on-premise IP addresses
- Final data sync for databases
- Monitoring verification (all dashboards green)
- AWS/Azure reserved instances: allowed to expire or, for AWS, sold on the Reserved Instance Marketplace
- Maintained one small AWS presence for CloudFront CDN and S3 for static assets (hybrid model)
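Before decommissioning anything, it helps to confirm that every public hostname actually resolves to its new on-premise address. The following is a minimal sketch of such a check. The hostnames and IP addresses are hypothetical (the addresses come from documentation-only ranges), and the resolver is injectable so the logic can be exercised without touching real DNS.

```python
import socket

# Hypothetical hostnames and on-premise addresses; replace with your own.
EXPECTED = {
    "www.example.com": "203.0.113.10",
    "app.example.com": "203.0.113.11",
}


def pending_cutover(expected, resolve=socket.gethostbyname):
    """Return hostnames that do not yet resolve to their on-premise IP."""
    pending = []
    for host, ip in expected.items():
        try:
            resolved = resolve(host)
        except OSError:
            resolved = None  # treat lookup failures as not yet cut over
        if resolved != ip:
            pending.append(host)
    return pending


# Example with a fake resolver: one record still points at the old address.
fake_dns = {"www.example.com": "203.0.113.10", "app.example.com": "198.51.100.7"}
print(pending_cutover(EXPECTED, resolve=fake_dns.__getitem__))
```

Keep in mind that a local check like this only reflects what your own resolver returns; cached records elsewhere may lag until their TTL expires.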
What We Kept in the Cloud
Cloud repatriation does not mean abandoning the cloud entirely. We maintained a hybrid architecture for workloads where cloud makes sense:
| Workload | Reason for Keeping in Cloud | Monthly Cost |
|---|---|---|
| CDN (CloudFront) | Global edge caching, no on-premise equivalent | $200 |
| Email (Microsoft 365) | SaaS, not infrastructure | $650 |
| DNS (Cloudflare) | Global anycast, DDoS protection | $20 |
| DR failover (cold standby) | Geographic redundancy | $400 |
| Total retained cloud | | $1,270 |
The hybrid model gives us the best of both worlds: dedicated performance and cost control for predictable workloads, cloud flexibility for edge services and disaster recovery.
Results After 12 Months
Financial Results
| Metric | Cloud (Before) | On-Premise (After) | Change |
|---|---|---|---|
| Monthly infrastructure cost | $15,500 | $5,920 | -62% |
| Annual infrastructure cost | $186,000 | $71,040 | -62% |
| Year 1 total (including hardware) | $186,000 | $166,040 | -11% |
| Year 2 total | $186,000 | $71,040 | -62% |
| 5-year total | $930,000 | $450,200 | -52% |
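The year-by-year rows follow from three inputs in this article: the $186,000 annual cloud bill, the $71,040 annual on-premise running cost, and the $95,000 one-time hardware purchase. A minimal sketch of the cumulative comparison and the resulting payback period, with illustrative function names:

```python
CLOUD_ANNUAL = 186_000      # USD per year before repatriation
ONPREM_ANNUAL = 71_040      # USD per year after repatriation
HARDWARE_UPFRONT = 95_000   # one-time hardware purchase


def cloud_cumulative(years):
    return CLOUD_ANNUAL * years


def onprem_cumulative(years):
    return HARDWARE_UPFRONT + ONPREM_ANNUAL * years


def payback_months():
    """Months until monthly savings have repaid the hardware purchase."""
    monthly_savings = (CLOUD_ANNUAL - ONPREM_ANNUAL) / 12
    return HARDWARE_UPFRONT / monthly_savings


for years in (1, 2, 5):
    print(years, cloud_cumulative(years), onprem_cumulative(years))
print(f"Payback: {payback_months():.1f} months")
```

On these figures the hardware pays for itself in a little under ten months, which is why the year 1 saving is small and the later years carry most of the benefit.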
Performance Results
| Metric | Cloud | On-Premise | Improvement |
|---|---|---|---|
| Database P99 latency | 45ms | 2ms | 22x |
| AI inference throughput | 18 tok/s | 42 tok/s | 2.3x |
| Backup completion time | 6 hours | 1.5 hours | 4x |
| Application response time | 120ms | 35ms | 3.4x |
| Storage IOPS | 16,000 | 450,000 | 28x |
Operational Results
| Metric | Cloud | On-Premise |
|---|---|---|
| Unplanned downtime (12 months) | 4.2 hours | 1.1 hours |
| Mean time to recovery | 45 minutes | 12 minutes |
| Compliance audit time | 3 weeks | 1.5 weeks |
| Monthly management hours | 40 hours | 35 hours |
The compliance audit time reduction was unexpected but significant. Auditors spent less time verifying controls because on-premise infrastructure provides direct, unambiguous evidence. No shared responsibility matrices. No cloud-specific compliance documentation. Just hardware you own in a facility you control.
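Downtime hours are easier to compare with published SLAs when expressed as an availability percentage. This minimal sketch converts the unplanned downtime figures from the table, assuming a 365-day year (8,760 hours) and counting only unplanned outages.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760; ignores leap years


def availability_pct(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Availability over the period, as a percentage."""
    return 100 * (1 - downtime_hours / period_hours)


for label, hours in (("Cloud", 4.2), ("On-premise", 1.1)):
    print(f"{label}: {availability_pct(hours):.3f}% available")
```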
When Cloud Repatriation Makes Sense
Repatriation is the right choice when:
- Your cloud bill exceeds $5,000/month with predictable, steady workloads
- You have performance-sensitive workloads (databases, AI, real-time processing)
- Compliance requirements favor infrastructure you physically control (HIPAA, CMMC, ITAR)
- Your data volume is large (10+ TB) and growing, so cloud storage costs keep rising with it
- You have the technical capability to manage infrastructure (or an MSP to do it for you)
Repatriation is NOT the right choice when:
- Your workloads are highly variable (seasonal spikes, unpredictable demand)
- You have a small team (under 5 people) with no IT management capacity
- Your compute needs are modest (under $2,000/month in cloud costs)
- You need global distribution (serving users on multiple continents)
- You are a startup with uncertain scaling needs
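The two lists above can be encoded as a rough first-pass screen. This is a simplified sketch, not a decision tool: the `Profile` fields and the example values are hypothetical, only some of the criteria are covered, and the thresholds are the ones quoted in the lists.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    monthly_cloud_bill: float
    predictable_workloads: bool
    team_size: int
    has_msp: bool
    needs_global_distribution: bool


def repatriation_signals(p):
    """Return (reasons_for, reasons_against) based on the lists above."""
    pros, cons = [], []
    if p.monthly_cloud_bill > 5000 and p.predictable_workloads:
        pros.append("bill above $5,000/month with steady workloads")
    if p.monthly_cloud_bill < 2000:
        cons.append("modest compute needs (under $2,000/month)")
    if not p.predictable_workloads:
        cons.append("highly variable workloads")
    if p.team_size < 5 and not p.has_msp:
        cons.append("small team with no IT management capacity")
    if p.needs_global_distribution:
        cons.append("needs global distribution")
    return pros, cons


# Example profile; the team size is a made-up value for illustration.
pros, cons = repatriation_signals(Profile(15_500, True, 10, False, False))
print("For:", pros)
print("Against:", cons)
```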
How PTG Supports Cloud Repatriation
Petronella Technology Group helps businesses evaluate, plan, and execute cloud repatriation projects. Our services include:
- TCO analysis: Detailed comparison of your current cloud costs versus on-premise alternatives
- Architecture design: Proxmox/Ceph/ZFS infrastructure optimized for your workloads
- Migration execution: Minimal-downtime workload migration with parallel validation
- AI infrastructure: Integration of private AI capabilities into your on-premise environment
- Ongoing management: Managed IT services for your on-premise infrastructure
- Compliance alignment: Ensure repatriated infrastructure meets HIPAA, CMMC, or SOC 2 requirements
We also provide cloud backup services for organizations that want on-premise primary infrastructure with cloud-based disaster recovery, giving you the cost benefits of ownership with the resilience of geographic redundancy.
Call 919-348-4912 or visit petronellatech.com/contact/ to discuss whether cloud repatriation makes sense for your organization.
About the Author: Craig Petronella is the CEO of Petronella Technology Group, Inc., with over 30 years of experience in IT infrastructure. Craig has managed both cloud and on-premise environments for hundreds of clients and operates a 19-machine fleet spanning datacenter colocation and on-premise installations. He is a CMMC Registered Practitioner (RP-1372) and hosts the Petronella Technology Group podcast.
Frequently Asked Questions
Is cloud repatriation just going backwards?
No. Cloud repatriation is an optimization decision based on measured data, not ideology. The cloud was the right choice when your workloads were unpredictable and your team was small. As workloads stabilize and costs become predictable, on-premise infrastructure often delivers better economics and performance. This is not moving backwards; it is moving toward the most efficient architecture for your current needs.
How long does cloud repatriation take?
For a mid-size environment (20-50 workloads), plan for 6 to 12 weeks from hardware procurement to full cutover. The timeline depends on hardware availability (2-4 weeks lead time for servers), infrastructure build complexity, and the number of workloads requiring migration. Our 47-workload migration completed in 8 weeks.
What about disaster recovery without the cloud?
Most organizations that repatriate maintain a small cloud presence specifically for DR. Cold standby VMs in a different region cost $200-$500 per month and provide geographic redundancy. Alternatively, colocation at a second facility provides the same benefit without cloud dependency. The hybrid model (on-premise primary, cloud DR) is the most common approach.
Will I lose auto-scaling?
Yes, for workloads that currently auto-scale. On-premise infrastructure requires capacity planning: buying enough hardware to handle peak load plus a margin. For predictable workloads, this is straightforward and cheaper than paying for auto-scaling overhead. For truly variable workloads, keep those specific services in the cloud.
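Capacity planning of this kind comes down to a short calculation: take peak load, add headroom, divide by per-node capacity, and add spare nodes for failover. The sketch below uses hypothetical numbers (vCPU counts and a 25% margin) purely to show the shape of the arithmetic.

```python
import math


def nodes_required(peak_load, headroom, node_capacity, spare_nodes=1):
    """Nodes needed to carry peak load plus headroom, plus failover spares."""
    target = peak_load * (1 + headroom)
    return math.ceil(target / node_capacity) + spare_nodes


# Hypothetical example: 180 vCPUs at peak, 25% margin, 64 vCPUs per node.
print(nodes_required(peak_load=180, headroom=0.25, node_capacity=64))
```

The same function works for memory or storage if you substitute the relevant units; size for whichever resource demands the most nodes.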
How do I handle hardware failures?
On-premise requires hardware redundancy. A 3-node Proxmox cluster with Ceph storage tolerates the failure of any single node without data loss, and high availability restarts the affected VMs on the surviving nodes, so any interruption is brief. Hot spare drives and a hardware support contract (Dell ProSupport: 4-hour on-site response) provide additional protection. Our 12-month track record shows 1.1 hours of unplanned downtime, better than our cloud experience.
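The reason three nodes is the practical minimum is majority quorum: the cluster keeps operating only while more than half of its nodes can see each other. A minimal sketch of that rule follows. It is a simplification that ignores storage-level settings such as Ceph replica counts.

```python
def quorum(nodes):
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """Nodes that can fail while the rest still hold a majority."""
    return nodes - quorum(nodes)


for n in (2, 3, 4, 5):
    print(f"{n} nodes: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Note that a fourth node adds capacity but not fault tolerance; the next step up in tolerated failures comes at five nodes.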
What skills does my team need for on-premise?
Linux system administration, networking fundamentals, and storage management. If your team currently manages cloud infrastructure, they already have most of the required skills. The primary additions are physical hardware management and hypervisor administration. A managed service provider can fill any gaps.
Can I repatriate gradually?
Yes, and this is recommended. Start with the workloads that have the clearest cost savings (GPU instances for AI, large databases, high-storage workloads). Migrate in batches, validating each batch before proceeding. Keep easy-to-move-back options available during the transition. A phased approach over 3-6 months reduces risk significantly compared to a big-bang migration.