Private AI for the Workloads Public APIs Cannot Touch
Petronella Technology Group runs an enterprise private AI cluster in Raleigh, NC, paired with 24/7 AI + human hybrid threat analysis. We deploy private LLMs, voice agents, RAG pipelines, AI security tooling, and AI training for buyers operating under HIPAA, CMMC L1 / L2 / L3, NIST 800-171 / 800-172, GLBA, ITAR, and contract-clause data restrictions. One Raleigh team. One private cluster you can audit.
In Short - What This Hub Covers
- Petronella runs an enterprise private AI cluster in Raleigh, NC, sourced through the NVIDIA Elite Partner Channel, built on the NVIDIA reference architecture. Prototypes and production workloads run inside our boundary, not on a public AI API.
- We deploy five service categories under one engagement umbrella: private LLM deployment, AI voice agents, AI security and detection, AI for compliance workflows, and AI training and enablement. Each links to its own deep-dive page below.
- The regulatory frame comes first. CMMC L1 / L2 / L3, NIST 800-171 / 800-172, HIPAA, GLBA, ITAR, attorney-client privilege, and contract-clause data residency are first-class engagement constraints, not exceptions.
- The differentiator is the seam-free stack. Strategy, prototyping, integration, custom development, agent and automation builds, private infrastructure, managed operations. One team, one engagement letter. Most regulated AI failures we see started with three vendors and no accountable owner in the middle.
- AI plus human review is the architecture, not an afterthought. Every production workload pairs an AI capability with a human-in-the-loop checkpoint where regulation, risk, or audit requires it. We do not ship unsupervised AI into regulated environments.
- Engagements price after a discovery call. Cost depends on data state, integration complexity, regulatory frame, and infrastructure path. Custom-quote model. Contact a Petronella AI engineer to scope.
Five Categories. One Engagement Umbrella.
Most buyers come in through one of five doors. Pick the category that matches your situation, or start with the full AI services catalogue if you are not sure which lane fits.
AI Services Catalogue
The full seven-service umbrella: prototyping, custom development, integration consulting, automation, agent and chatbot development, private infrastructure, training enablement.
Explore the full catalogue →
Private LLM Solutions
Self-hosted open-weight models on private hardware, with retrieval against your corpus and full prompt and response logging. Built for HIPAA, CMMC, and contract-clause data.
See the private AI hub →
AI Voice Agents
Inbound and outbound voice agents for intake, triage, scheduling, qualification, and after-hours coverage. Penny, our flagship voice agent, answers our main number live.
Hear the architecture →
AI Security & Detection
AI-augmented threat detection, log triage, and incident response, paired 24/7 with human analysts. We ship clients the same stack we run in our own SOC.
See AI security →
AI Agent & Chatbot Builds
Voice agents, chatbots, document agents, intake agents, and internal copilots grounded against your knowledge base with per-user access control and escalation paths.
See agent development →
AI Automation & Workflow
Workflow automation that uses AI for classification, extraction, routing, decision support, and content generation, with human review where regulation requires.
See automation services →
Match the Use Case to the Lane
Eight common entry points across the regulated-AI buyer landscape. Match your situation in column one to the deployment shape, latency target, and best-fit profile to the right. Each row links to a deeper service page.
The 3-Stage Methodology
Every Petronella AI engagement, regardless of category, runs the same three stages: Discover, Architect, Operate. Each stage has a written deliverable and a written go or no-go that the buyer signs off before the next stage starts. No surprises and no scope drift mid-engagement.
Map the boundary first
One week. We sit with the team, walk the workflow, name the data class, draw the integration surface, and surface the regulatory frame the AI has to live inside.
- Workflow inventory
- Data-class classification
- Integration surface map
- Regulatory frame statement
- Written go or no-go
Prove it on the private cluster
Two to six weeks. We build an instrumented prototype on representative data, integrated to upstream and downstream systems, exercised at realistic load on the Petronella private cluster.
- Working prototype on real data
- Telemetry + evaluation harness
- Integration map validated
- Sizing artifact (hardware blueprint)
- Written go or no-go
Production with a runbook
Four to twelve weeks. We harden the prototype into a production capability with monitoring, alerting, change-control, audit logging, identity propagation, and a written operations runbook your team can run.
- Production deployment
- Monitoring + alerting
- Operations runbook
- Audit log pipeline
- Ninety days of post-launch support
Why Regulated Buyers Run AI on Their Own Hardware
Most AI services firms bolt their work on top of a public AI API. That is fine for commodity tasks. It does not work the moment HIPAA, CMMC, GLBA, ITAR, attorney-client privilege, or contract-clause data residency enters the picture, and that is exactly when AI gets interesting in regulated verticals.
Petronella Technology Group has been operating regulated-vertical IT and security since 2002. When AI moved from research to production, the same data-class boundaries that govern email, file storage, and remote access started governing AI inference and AI training. The reasoning is straightforward: a prompt sent to a public AI API is a record disclosure, the response is a derived record, and both are subject to the same audit, retention, and supervision rules as any other data movement out of the boundary. Most public AI APIs cannot sign a Business Associate Agreement at the depth HIPAA requires. None of them can operate inside a CMMC L2 enclave without architectural workarounds that defeat the point. The clean answer is to run the AI on hardware you control, with logging you own, inside an identity model your auditors already approved.
The private AI cluster Petronella operates
Our enterprise AI cluster lives in Raleigh, North Carolina. The hardware is sourced through the NVIDIA Elite Partner Channel and built on the NVIDIA reference architecture for enterprise inference. The cluster runs open-weight models for general-purpose reasoning, retrieval-augmented generation against client corpora, fine-tuned models for domain-specific workloads, and speech-to-text and text-to-speech for voice agent workloads. Every prompt, every response, every model version, every user identity, and every timestamp is logged into an audit pipeline that can feed the client's SIEM or archival store. The boundary is real, not a checkbox.
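The logging contract described above can be sketched as a minimal append-only record. This is an illustrative sketch, not the production pipeline: the `log_inference` helper, the field names, and the JSON-lines sink are all assumptions, and the real SIEM schema is engagement-specific.

```python
import hashlib
import json
from datetime import datetime, timezone

def log_inference(user_id, model_version, prompt, response, sink):
    """Append one audit record per inference: who asked, which pinned model
    answered, content digests, and when. Digests keep this example compact;
    a real pipeline could retain full prompt and response text."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "model_version": model_version,  # pinned version string, never "latest"
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode()).hexdigest(),
    }
    sink.append(json.dumps(record))  # append-only JSON lines, SIEM-friendly
    return record

# usage sketch with hypothetical identifiers
audit_log = []
rec = log_inference(
    "jdoe",
    "example-70b@2024-07",
    "summarize claim 123",
    "Claim 123 is pending prior-auth review.",
    audit_log,
)
```

Each line is self-describing JSON, so downstream archival or SIEM ingestion needs no shared state beyond the sink itself.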
What "AI plus human review" actually means in production
Unsupervised AI in a regulated environment is malpractice. Every production AI workload Petronella ships pairs an AI capability with a human-in-the-loop checkpoint where regulation, risk, or audit posture requires it. For HIPAA-covered prior-auth automation, the AI extracts and pre-checks, the licensed clinician decides. For CMMC-aligned defense workflows, the AI surfaces and routes, the cleared engineer reviews. For attorney-client document review, the AI summarizes and tags, the attorney signs off. For our own 24/7 SOC, the AI triages and correlates, the analyst escalates. The architecture is consistent: AI for throughput, humans for accountability. We do not ship architectures that depend on hoping the AI is right.
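The "AI for throughput, humans for accountability" pattern above can be sketched as a routing checkpoint. The field names, data-class labels, and confidence threshold here are illustrative assumptions; the real gate criteria are set per engagement by the regulatory frame.

```python
from dataclasses import dataclass

# Data classes that always require a human checkpoint, per the examples above
REGULATED = {"PHI", "CUI", "privileged"}

@dataclass
class AIResult:
    output: str
    confidence: float  # model's self-reported confidence, 0..1 (illustrative)
    data_class: str    # e.g. "PHI", "CUI", "privileged", "public"

def route(result: AIResult, threshold: float = 0.9) -> str:
    """Decide where an AI output goes next. Regulated data classes always
    get a human checkpoint (clinician, cleared engineer, attorney);
    otherwise low confidence alone triggers review."""
    if result.data_class in REGULATED:
        return "human_review"
    if result.confidence < threshold:
        return "human_review"
    return "auto_approve"

# usage sketch
print(route(AIResult("claim pre-check passed", 0.97, "PHI")))  # human_review
print(route(AIResult("FAQ answer", 0.95, "public")))           # auto_approve
```

Note the ordering: the data-class check runs first, so a high-confidence answer on PHI still lands in front of a licensed clinician.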
Where private AI pays off fastest
- Document-heavy workflows under regulatory review. Prior-auth, claims, contracts, RFP responses, audit prep. The AI extracts, classifies, and routes; the human reviews and signs.
- Voice answer for live phone work. Intake, triage, scheduling, qualification, and after-hours coverage. Penny answers our main number; the same architecture ships to clients.
- Internal knowledge surfacing. Helpdesk responses, internal FAQ resolution, policy lookups, runbook navigation. The answer is in a document; finding it costs minutes per request.
- Code review and developer assistance. Trade-secret code stays on a private cluster. The AI assists; the engineer commits.
- Threat detection and response. AI triages alerts; human analysts escalate. The same architecture our own SOC runs 24/7.
Hear a Petronella Voice Agent in Action
Call (919) 348-4912 right now. Penny will pick up. The voice agent on the other end of that line is the same architecture we ship to clients. A discovery call is usually scheduled within one business day.
(919) 348-4912
Verticals the Private AI Cluster Is Tuned For
Petronella is a regulated-vertical AI practice. The same architecture decisions that make AI work inside HIPAA also make it work inside CMMC, GLBA, ITAR, and contract-clause restricted environments. The verticals below are the ones our private cluster, engagement letters, and team experience are most directly tuned for.
Engineering & AEC firms
Priority ICP. Engineering and architecture firms are sitting on decades of trade-secret design knowledge that off-the-shelf AI cannot touch. Custom retrieval against project archives, AI-assisted spec generation, RFP response automation, CMMC-aligned hosting for DoD-adjacent work. Trade-secret design IP does not transit a public AI API, period. See engineering firms AI.
Healthcare & HIPAA-covered entities
Covered entities and business associates needing AI capability under the HIPAA Security Rule. Prior-auth automation, clinical documentation assistance, member-services chatbots, claims pre-review. Business Associate Agreement signed before any PHI touches the prototype boundary. PHI does not leave the cluster, period. See healthcare AI consulting.
Defense & CMMC L1 / L2 / L3
Prime and sub contractors operating against DFARS 252.204-7012 and the CMMC framework. CMMC L1 prototypes inside FAR 52.204-21 safeguards; CMMC L2 inside NIST SP 800-171 enclaves; CMMC L3 against the higher bar set by NIST SP 800-172. CMMC-RP on every engagement. Petronella is CMMC-AB Registered Provider Organization #1449. See CMMC compliance.
Finance & wealth management
Regulated under GLBA, SEC, and state-level frameworks. AI capability has to live inside the auditability and supervision framework that finance compliance teams already enforce. Custom builds with full prompt and response logging, model-version pinning, and reproducibility. Supervision-friendly logging is the default architecture.
Legal & professional services
Attorney-client privileged data cannot transit a public AI API. Custom AI capability inside the firm's own boundary, with retrieval against internal precedent, document review augmentation, and per-matter access control matching the firm's existing conflict-checking model. Privilege-preserving boundary is the default architecture.
Manufacturing & industrial
OT-adjacent AI capability for predictive maintenance, quality inspection augmentation, and supplier-document automation. Air-gapped or segmented deployments where the AI capability lives inside the operational boundary rather than reaching out to a public cloud.
Outside these six verticals, we work case-by-case. The questions that matter are the same regardless of industry: what data class are we protecting, what regulation defines the boundary, what integration surface does the AI need to live inside, and what is the failure mode the human review is supposed to catch. If you can answer those four, we can scope an engagement.
Related Petronella Pillars
Private AI does not live alone. The security stack, the compliance frame, and the hardware blueprint are siblings to this hub. Pick the angle that matches the question you are trying to answer.
Cybersecurity Hub
The security stack the AI lives inside: MDR, XDR, vCISO, threat hunting, incident response. Our 24/7 SOC runs the same AI plus human review architecture we ship to clients.
Sibling pillar / Compliance
CMMC Compliance Hub
The DFARS 252.204-7012 and CMMC L1 / L2 / L3 frame the AI has to live inside for DoD-adjacent buyers. RPO #1449, CMMC-RP team, NIST 800-171 / 800-172 aligned.
Sibling pillar / Hardware
AI Workstations Hub
The hardware that runs the cluster: NVIDIA Systems sourced through the NVIDIA Elite Partner Channel, RTX PRO workstations, DGX-class servers, custom AI rack builds.
Sibling pillar / Compliance
HIPAA Compliance Hub
The HIPAA Security Rule frame for any AI workload touching PHI. Business Associate Agreement scoping, audit logging, encryption, breach-response posture.
AI Pillar FAQ
The questions buyers ask most often when scoping a Petronella AI engagement, deciding which service category to start with, or evaluating private AI against public-API alternatives.
What does Petronella mean by "private AI"?
Private AI means the model, the retrieval index, the prompts, the responses, and the inference traffic all live inside a boundary the buyer can audit. In practice that means open-weight models running on the Petronella private cluster in Raleigh, NC, or on hardware the client owns and we operate. No prompts leave the boundary, no training data leaks into a third-party model, and the audit log shows the full prompt-response history with model-version pinning. The opposite of private AI is using a public AI API where the vendor stores prompts, retrains on them, or applies model updates without notice.
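Model-version pinning, as used above, can be sketched as a check that refuses to serve unless the loaded model's identity matches what the engagement config recorded. The model name, config shape, and digest check here are illustrative assumptions, not the production mechanism.

```python
import hashlib

def pinned_ok(model_name: str, weights: bytes, pin: dict) -> bool:
    """Serve a model only if both its name and the digest of its weight
    file match the pinned entry, so a silent vendor-side update cannot
    change behavior without showing up in the audit trail."""
    digest = hashlib.sha256(weights).hexdigest()
    return model_name == pin["name"] and digest == pin["sha256"]

# illustrative pin, as it might appear in an engagement config
weights = b"...open-weight model bytes..."
pin = {"name": "example-13b@v2", "sha256": hashlib.sha256(weights).hexdigest()}

serve = pinned_ok("example-13b@v2", weights, pin)       # True: matches, serve
refuse = pinned_ok("example-13b@v2", b"updated", pin)   # False: drifted, refuse
```

This is the inverse of the public-API failure mode named above: a hosted vendor can apply model updates without notice, while a pinned private deployment fails closed on any drift.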
Do you build custom AI or integrate existing AI tools?
Both, depending on what the use case actually needs. Most regulated buyers end up with a hybrid stack: off-the-shelf SaaS for commodity tasks (meeting summaries, draft assistance, public-content chatbots) and custom-built capability for the workloads where data class, integration depth, latency, cost, or audit posture rules the SaaS path out. The build-vs-buy framework on our AI services catalogue page lays out the eight dimensions we score every candidate workload against.
Can you sign a Business Associate Agreement for HIPAA workloads?
Yes. We sign a BAA before any PHI changes hands. HIPAA-covered AI workloads run inside our private cluster in Raleigh, NC, with audit logging, scoped access, encryption in transit and at rest, and review by our compliance team. We do not run HIPAA-covered workloads on public AI APIs at any stage of the engagement.
Do you handle CMMC L1, L2, and L3 environments?
Yes, all three. CMMC L1 prototypes and production workloads run inside basic safeguards aligned to FAR 52.204-21. CMMC L2 work runs inside an enclave aligned to NIST SP 800-171. CMMC L3 work operates against the higher bar set by NIST SP 800-172. Petronella is CMMC-AB Registered Provider Organization #1449, verified at cyberab.org. The whole team is CMMC-RP. We sign a CMMC-aligned engagement letter before any controlled unclassified information enters the project boundary.
What models do you deploy on the private cluster?
Open-weight models from the leading research labs and university consortia, selected per use case for the tradeoff between capability, latency, cost, and hardware fit. We deploy general-purpose reasoning models, code-specialized models, vision-language models for document workflows, speech-to-text and text-to-speech models for voice agent workloads, and embedding models for retrieval pipelines. Specific model choices vary by engagement; we publish a recommendation in the Stage-01 Discover deliverable so the client signs off on the model class before any prototype work begins.
What does an AI engagement cost?
Engagements are scoped from a discovery call. Cost depends on data state, integration complexity, regulatory frame, infrastructure path, and the specific service category. Petronella does not publish a fixed price for custom engagements because the cost surfaces and the risk surfaces are different for every regulated buyer. We do publish productized starter packages on our consumer-facing surfaces for buyers who want a fixed-scope entry point. Contact a Petronella AI engineer to scope your engagement.
How fast can a prototype be running?
For well-defined use cases, the typical path from discovery call to a working prototype on the Petronella private cluster runs three to six weeks. The first week is the Discover stage: workflow inventory, data-class classification, integration map, regulatory frame. Weeks two through six are Architect: prototype build on representative data, integrated to upstream and downstream systems, exercised at realistic load. The buyer receives a written go or no-go at the end of Architect before any production deployment work begins.
Where is the private AI cluster located?
Raleigh, North Carolina. Hardware sourced through the NVIDIA Elite Partner Channel. Built on the NVIDIA reference architecture for enterprise inference. We can also operate dedicated hardware that the client owns at the client's site or in a colocation facility the client specifies, where data residency or contract-clause requirements dictate. Petronella's headquarters is at 5540 Centerview Dr., Suite 200, Raleigh, NC 27606.
Do you work with clients outside North Carolina?
Yes. While our office and private AI cluster are in Raleigh, we serve clients throughout the Research Triangle (Durham, Chapel Hill, Cary, RTP) and across North Carolina (Charlotte, Greensboro, Wilmington, Asheville, Winston-Salem), and we engage with regulated organizations nationally, particularly in healthcare, defense, finance, engineering, and legal. The discovery call works the same way regardless of geography; site visits to the Raleigh facility are available for buyers who want to see the infrastructure before committing.
Can you train our team alongside the build?
Yes. Training and enablement is one of the seven service lines in the catalogue, and we routinely fold AI-team training into a build engagement so that the in-house team can take over operations after Stage 03. The training curriculum covers prompt design, retrieval pipeline maintenance, evaluation harness operation, model-version change-control, and the audit log review process. We also publish executive-level briefings for the leadership team that owns the AI program politically.
What is the difference between this pillar and the AI Services Catalogue page?
This pillar (/ai/) is the top-level AI hub: who we are, what categories we ship, the trust band, the methodology, the industry coverage. The AI Services Catalogue page goes deeper on the seven service lines themselves with the build-vs-buy framework, the integration consulting playbook, the automation consulting model, and the hiring-vs-partnering analysis. Most buyers read this pillar first to understand the practice, then read the services catalogue to pick the specific entry lane.
How do you compare to other AI services firms?
Three honest differences. First, we are a regulated-vertical practice: HIPAA, CMMC L1 / L2 / L3, NIST 800-171 / 800-172, GLBA, ITAR, attorney-client privilege, and contract-clause data residency are first-class constraints, not exceptions we handle case-by-case. Second, we operate the private AI infrastructure ourselves rather than reselling someone else's cloud capacity. Third, we run the full lifecycle in-house (strategy, prototype, integration, custom development, infrastructure, operations) rather than handing off between vendors. Most national AI consulting firms do one or two of those well. Very few do all three for regulated buyers in the same engagement.
Service areas: AI delivery across North Carolina
Ready to Scope a Private AI Engagement?
The discovery call is the entry lane regardless of which service category fits. We listen to where you are, name the data-class and regulatory frame, and recommend the lane that actually solves the problem - including "you do not need us yet, here is what to do internally first" when that is the right answer.