Petronella AI / Regulated AI Hub

Private AI for the Workloads Public APIs Cannot Touch

Engagement model: custom-quoted · private AI cluster · regulated-vertical focus

Petronella Technology Group runs an enterprise private AI cluster in Raleigh, NC, paired with 24/7 AI + human hybrid threat analysis. We deploy private LLMs, voice agents, RAG pipelines, AI security tooling, and AI training for buyers operating under HIPAA, CMMC L1 / L2 / L3, NIST 800-171 / 800-172, GLBA, ITAR, and contract-clause data restrictions. One Raleigh team. One private cluster you can audit.

Discovery calls usually scheduled within 1 business day · No hard sell
CyberAB Registry #1449 · CMMC Registered Provider Organization (verify at cyberab.org)
NC State License #604180 · Digital Forensics Examiner (Craig Petronella)
Founded 2002 · 23+ years of regulated IT, security, and AI
BBB Accreditation A+ · Continuously accredited since 2003

In Short - What This Hub Covers

  • Petronella runs an enterprise private AI cluster in Raleigh, NC, sourced through the NVIDIA Elite Partner Channel, built on the NVIDIA reference architecture. Prototypes and production workloads run inside our boundary, not on a public AI API.
  • We deploy five service categories under one engagement umbrella: private LLM deployment, AI voice agents, AI security and detection, AI for compliance workflows, and AI training and enablement. Each links to its own deep-dive page below.
  • Regulatory frame is first-class. CMMC L1 / L2 / L3, NIST 800-171 / 800-172, HIPAA, GLBA, ITAR, attorney-client privilege, and contract-clause data residency are first-class engagement constraints, not exceptions.
  • The differentiator is the seam-free stack. Strategy, prototyping, integration, custom development, agent and automation builds, private infrastructure, managed operations. One team, one engagement letter. Most regulated AI failures we see started with three vendors and no accountable owner in the middle.
  • AI plus human review is the architecture, not an afterthought. Every production workload pairs an AI capability with a human-in-the-loop checkpoint where regulation, risk, or audit requires it. We do not ship unsupervised AI into regulated environments.
  • Engagements price after a discovery call. Cost depends on data state, integration complexity, regulatory frame, and infrastructure path. Custom-quote model. Contact a Petronella AI engineer to scope.

01 / Service Categories

Five Categories. One Engagement Umbrella.

Most buyers come in through one of five doors. Pick the category that matches your situation, or start with the full AI services catalogue if you are not sure which lane fits.

02 / Decision Matrix

Match the Use Case to the Lane

Eight common entry points across the regulated-AI buyer landscape. Match your situation in column one to the deployment shape, latency target, and best-fit profile to the right. Each row links to a deeper service page.

  • Private LLM for internal teams. Deploy: open-weight model on private cluster + RAG against private corpus + identity-scoped retrieval. Latency: < 2s p95. Best fit when: data class prohibits public AI APIs and the corpus is the moat.
  • Voice agent for intake or triage. Deploy: speech-to-text, reasoning model, text-to-speech, telephony bridge, escalation rules, transcript pipeline. Latency: < 800ms per turn. Best fit when: call volume strains live answer staffing or after-hours coverage has gaps.
  • RAG over documents. Deploy: document ingest, embeddings, vector index, retrieval pipeline, evaluation harness, audit log. Latency: < 3s end-to-end. Best fit when: knowledge is buried in policy, contracts, project archives, or runbooks.
  • Compliance Q&A (HIPAA, CMMC). Deploy: RAG against framework text and internal policy + per-user access control + Business Associate Agreement. Latency: < 3s end-to-end. Best fit when: staff burn cycles answering framework or audit questions from the internal team.
  • Document classification & extraction. Deploy: vision-language model + structured-output schema + human-review queue + supervised fine-tuning if needed. Latency: batch or < 5s per doc. Best fit when: inbound forms, invoices, claims, or contracts pile up faster than humans can key them.
  • Internal code copilot. Deploy: code reasoning model on private cluster + repo-scoped retrieval + per-developer access + audit log. Latency: < 1s per suggestion. Best fit when: code includes trade-secret IP that cannot transit a public AI API.
  • AI for healthcare workflows. Deploy: BAA-covered private cluster + clinical-doc retrieval + prior-auth automation or member-services chatbot. Latency: < 3s end-to-end. Best fit when: PHI cannot leave the boundary and prior-auth or member services is the throughput bottleneck.
  • AI for legal & professional services. Deploy: privilege-preserving private cluster + matter-scoped retrieval + document review augmentation. Latency: < 3s end-to-end. Best fit when: attorney-client privilege rules out public AI APIs and document review is the bottleneck.
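Several rows above share the same retrieval pattern: identity-scoped retrieval, where candidate documents are filtered by the caller's access rights before the model ever sees them. A minimal sketch of that idea in Python - the names, the toy corpus, and the scoring are illustrative assumptions, not Petronella's actual implementation:

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    scopes: frozenset  # groups authorized to retrieve this chunk


def identity_scoped_retrieve(chunks, scores, user_groups, k=3):
    """Rank candidate chunks by similarity score, dropping any chunk
    the caller's group membership does not authorize."""
    allowed = [
        (score, chunk)
        for score, chunk in zip(scores, chunks)
        if chunk.scopes & user_groups  # identity check before ranking
    ]
    allowed.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in allowed[:k]]


# Toy corpus: one chunk is restricted to the legal group.
corpus = [
    Chunk("Vacation policy: 15 days.", frozenset({"all-staff"})),
    Chunk("Matter 4417 settlement terms.", frozenset({"legal"})),
]
scores = [0.41, 0.97]  # pretend similarity scores from a vector index

# A staff user never sees the legal chunk, even though it scores higher.
print(identity_scoped_retrieve(corpus, scores, {"all-staff"}))
```

The design point is that the access check runs inside the retrieval step, so a user cannot coax the model into quoting a document their identity is not scoped to.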

03 / How Engagements Run

The 3-Stage Methodology

Every Petronella AI engagement, regardless of category, runs the same three stages: Discover, Architect, Operate. Each stage has a written deliverable and a written go or no-go that the buyer signs off on before the next stage starts. No surprises and no scope drift mid-engagement.

Stage 01 / Discover

Map the boundary first

One week. We sit with the team, walk the workflow, name the data class, draw the integration surface, and surface the regulatory frame the AI has to live inside.

  • Workflow inventory
  • Data-class classification
  • Integration surface map
  • Regulatory frame statement
  • Written go or no-go
Stage 02 / Architect

Prove it on the private cluster

Two to six weeks. We build an instrumented prototype on representative data, integrated to upstream and downstream systems, exercised at realistic load on the Petronella private cluster.

  • Working prototype on real data
  • Telemetry + evaluation harness
  • Integration map validated
  • Sizing artifact (hardware blueprint)
  • Written go or no-go
Stage 03 / Operate

Production with a runbook

Four to twelve weeks. We harden the prototype into a production capability with monitoring, alerting, change-control, audit logging, identity propagation, and a written operations runbook your team can run.

  • Production deployment
  • Monitoring + alerting
  • Operations runbook
  • Audit log pipeline
  • Ninety days of post-launch support

Read the full methodology →

Why Private AI

Why Regulated Buyers Run AI on Their Own Hardware

Most AI services firms bolt their work on top of a public AI API. That is fine for commodity tasks. It does not work the moment HIPAA, CMMC, GLBA, ITAR, attorney-client privilege, or contract-clause data residency enters the picture, and that is exactly when AI gets interesting in regulated verticals.

Petronella Technology Group has been operating regulated-vertical IT and security since 2002. When AI moved from research to production, the same data-class boundaries that govern email, file storage, and remote access started governing AI inference and AI training. The reasoning is straightforward: a prompt sent to a public AI API is a record disclosure, the response is a derived record, and both are subject to the same audit, retention, and supervision rules as any other data movement out of the boundary. Most public AI APIs cannot sign a Business Associate Agreement at the depth HIPAA requires. None of them can operate inside a CMMC L2 enclave without architectural workarounds that defeat the point. The clean answer is to run the AI on hardware you control, with logging you own, inside an identity model your auditors already approved.

The private AI cluster Petronella operates

Our enterprise AI cluster lives in Raleigh, North Carolina. The hardware is sourced through the NVIDIA Elite Partner Channel and built on the NVIDIA reference architecture for enterprise inference. The cluster runs open-weight models for general-purpose reasoning, retrieval-augmented generation against client corpora, fine-tuned models for domain-specific workloads, and speech-to-text and text-to-speech for voice agent workloads. Every prompt, every response, every model version, every user identity, and every timestamp is logged into an audit pipeline that can feed the client's SIEM or archival store. The boundary is real, not a checkbox.
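As a sketch of what one entry in such an audit pipeline might carry, here is a hypothetical JSON-lines record - the field names and model tag are assumptions for illustration, not Petronella's actual schema. This sketch logs content hashes for compactness; a pipeline bound by retention rules might store the full text as well.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(user_id, model_version, prompt, response):
    """Build one JSON-lines audit entry: identity, pinned model
    version, UTC timestamp, and content hashes for SIEM ingestion."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user_id,
        "model_version": model_version,  # pinned, never "latest"
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode()).hexdigest(),
    })


# One line per inference call, appended to the audit stream.
print(audit_record("jdoe", "open-weight-70b@2024-07",
                   "What is our PTO policy?", "15 days."))
```

Because every record pins the model version, an auditor can reconstruct exactly which model produced which response at which time - the reproducibility property the supervision frameworks above require.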

What "AI plus human review" actually means in production

Unsupervised AI in a regulated environment is malpractice. Every production AI workload Petronella ships pairs an AI capability with a human-in-the-loop checkpoint where regulation, risk, or audit posture requires it. For HIPAA-covered prior-auth automation, the AI extracts and pre-checks, the licensed clinician decides. For CMMC-aligned defense workflows, the AI surfaces and routes, the cleared engineer reviews. For attorney-client document review, the AI summarizes and tags, the attorney signs off. For our own 24/7 SOC, the AI triages and correlates, the analyst escalates. The architecture is consistent: AI for throughput, humans for accountability. We do not ship architectures that depend on hoping the AI is right.
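The checkpoint pattern above reduces to a routing rule: an AI draft auto-completes only when it is both unregulated and high-confidence; everything else lands in a human-review queue. A toy sketch - the data-class labels and threshold are illustrative assumptions, not the production design:

```python
# Data classes that always require a human sign-off, regardless of
# model confidence (illustrative labels).
REGULATED = {"PHI", "CUI", "privileged"}


def route(draft, data_class, confidence, threshold=0.9):
    """Return ('auto', draft) only for unregulated, high-confidence
    output; everything else goes to the human-review queue."""
    if data_class in REGULATED or confidence < threshold:
        return ("human_review", draft)  # a person decides
    return ("auto", draft)


# A PHI prior-auth draft is queued for the clinician even at
# maximum model confidence.
assert route("approve", "PHI", 0.99)[0] == "human_review"
# Low-confidence unregulated output is also queued.
assert route("summary", "internal", 0.6)[0] == "human_review"
# Only unregulated, high-confidence output auto-completes.
assert route("summary", "internal", 0.95)[0] == "auto"
```

The rule is deliberately one-directional: regulation overrides confidence, which is the "AI for throughput, humans for accountability" split stated above.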

Where private AI pays off fastest

  • Document-heavy workflows under regulatory review. Prior-auth, claims, contracts, RFP responses, audit prep. The AI extracts, classifies, and routes; the human reviews and signs.
  • Voice answer for live phone work. Intake, triage, scheduling, qualification, and after-hours coverage. Penny answers our main number; the same architecture ships to clients.
  • Internal knowledge surfacing. Helpdesk responses, internal FAQ resolution, policy lookups, runbook navigation. The answer is in a document; finding it costs minutes per request.
  • Code review and developer assistance. Trade-secret code stays on a private cluster. The AI assists; the engineer commits.
  • Threat detection and response. AI triages alerts; human analysts escalate. The same architecture our own SOC runs 24/7.

Live Proof / answering right now

Hear a Petronella Voice Agent in Action

Call (919) 348-4912 right now. Penny will pick up. The voice agent on the other end of that line runs the same architecture we ship to clients, and it can schedule a discovery call - usually inside one business day.

(919) 348-4912

Industries

Verticals the Private AI Cluster Is Tuned For

Petronella is a regulated-vertical AI practice. The same architecture decisions that make AI work inside HIPAA also make it work inside CMMC, GLBA, ITAR, and contract-clause restricted environments. The verticals below are the ones our private cluster, engagement letters, and team experience are most directly tuned for.

Engineering & AEC firms

Priority ICP. Engineering and architecture firms are sitting on decades of trade-secret design knowledge that off-the-shelf AI cannot touch. Custom retrieval against project archives, AI-assisted spec generation, RFP response automation, CMMC-aligned hosting for DoD-adjacent work. Trade-secret design IP does not transit a public AI API, period. See engineering firms AI.

Healthcare & HIPAA-covered entities

Covered entities and business associates needing AI capability under the HIPAA Security Rule. Prior-auth automation, clinical documentation assistance, member-services chatbots, claims pre-review. Business Associate Agreement signed before any PHI touches the prototype boundary. PHI does not leave the cluster, period. See healthcare AI consulting.

Defense & CMMC L1 / L2 / L3

Prime contractors and subcontractors operating against DFARS 252.204-7012 and the CMMC framework. CMMC L1 prototypes inside FAR 52.204-21 safeguards; CMMC L2 inside NIST SP 800-171 enclaves; CMMC L3 against the higher bar set by NIST SP 800-172. CMMC-RP on every engagement. Petronella is CMMC-AB Registered Provider Organization #1449. See CMMC compliance.

Finance & wealth management

Regulated under GLBA, SEC, and state-level frameworks. AI capability has to live inside the auditability and supervision framework that finance compliance teams already enforce. Custom builds with full prompt and response logging, model-version pinning, and reproducibility. Supervision-friendly logging is the default architecture.

Legal & professional services

Attorney-client privileged data cannot transit a public AI API. Custom AI capability inside the firm's own boundary, with retrieval against internal precedent, document review augmentation, and per-matter access control matching the firm's existing conflict-checking model. Privilege-preserving boundary is the default architecture.

Manufacturing & industrial

OT-adjacent AI capability for predictive maintenance, quality inspection augmentation, and supplier-document automation. Air-gapped or segmented deployments where the AI capability lives inside the operational boundary rather than reaching out to a public cloud.

Outside these six verticals, we work case-by-case. The questions that matter are the same regardless of industry: what data class are we protecting, what regulation defines the boundary, what integration surface does the AI need to live inside, and what is the failure mode the human review is supposed to catch. If you can answer those four, we can scope an engagement.

Frequently Asked

AI Pillar FAQ

The questions buyers ask most often when scoping a Petronella AI engagement, deciding which service category to start with, or evaluating private AI against public-API alternatives.

What does Petronella mean by "private AI"?

Private AI means the model, the retrieval index, the prompts, the responses, and the inference traffic all live inside a boundary the buyer can audit. In practice that means open-weight models running on the Petronella private cluster in Raleigh, NC, or on hardware the client owns and we operate. No prompts leave the boundary, no training data leaks into a third-party model, and the audit log shows the full prompt-response history with model-version pinning. The opposite of private AI is using a public AI API where the vendor stores prompts, retrains on them, or applies model updates without notice.

Do you build custom AI or integrate existing AI tools?

Both, depending on what the use case actually needs. Most regulated buyers end up with a hybrid stack: off-the-shelf SaaS for commodity tasks (meeting summaries, draft assistance, public-content chatbots) and custom-built capability for the workloads where data class, integration depth, latency, cost, or audit posture rules the SaaS path out. The build-vs-buy framework on our AI services catalogue page lays out the eight dimensions we score every candidate workload against.

Can you sign a Business Associate Agreement for HIPAA workloads?

Yes. We sign a BAA before any PHI changes hands. HIPAA-covered AI workloads run inside our private cluster in Raleigh, NC, with audit logging, scoped access, encryption in transit and at rest, and review by our compliance team. We do not run HIPAA-covered workloads on public AI APIs at any stage of the engagement.

Do you handle CMMC L1, L2, and L3 environments?

Yes, all three. CMMC L1 prototypes and production workloads run inside basic safeguards aligned to FAR 52.204-21. CMMC L2 work runs inside an enclave aligned to NIST SP 800-171. CMMC L3 work operates against the higher bar set by NIST SP 800-172. Petronella is CMMC-AB Registered Provider Organization #1449, verified at cyberab.org. The whole team is CMMC-RP. We sign a CMMC-aligned engagement letter before any controlled unclassified information enters the project boundary.

What models do you deploy on the private cluster?

Open-weight models from the leading research labs and university consortia, selected per use case for the tradeoff between capability, latency, cost, and hardware fit. We deploy general-purpose reasoning models, code-specialized models, vision-language models for document workflows, speech-to-text and text-to-speech models for voice agent workloads, and embedding models for retrieval pipelines. Specific model choices vary by engagement; we publish a recommendation in the Stage-01 Discover deliverable so the client signs off on the model class before any prototype work begins.

What does an AI engagement cost?

Engagements are scoped from a discovery call. Cost depends on data state, integration complexity, regulatory frame, infrastructure path, and the specific service category. Petronella does not publish a fixed price for custom engagements because the cost surfaces and the risk surfaces are different for every regulated buyer. We do publish productized starter packages on our consumer-facing surfaces for buyers who want a fixed-scope entry point. Contact a Petronella AI engineer to scope your engagement.

How fast can a prototype be running?

For well-defined use cases, the typical path from discovery call to a working prototype on the Petronella private cluster runs three to six weeks. The first week is the Discover stage: workflow inventory, data-class classification, integration map, regulatory frame. Weeks two through six are Architect: prototype build on representative data, integrated to upstream and downstream systems, exercised at realistic load. The buyer receives a written go or no-go at the end of Architect before any production deployment work begins.

Where is the private AI cluster located?

Raleigh, North Carolina. Hardware sourced through the NVIDIA Elite Partner Channel. Built on the NVIDIA reference architecture for enterprise inference. We can also operate dedicated hardware that the client owns at the client's site or in a colocation facility the client specifies, where data residency or contract-clause requirements dictate. Petronella's headquarters is at 5540 Centerview Dr., Suite 200, Raleigh, NC 27606.

Do you work with clients outside North Carolina?

Yes. While our office and private AI cluster are in Raleigh, we serve clients throughout the Research Triangle (Durham, Chapel Hill, Cary, RTP) and across North Carolina (Charlotte, Greensboro, Wilmington, Asheville, Winston-Salem), and we engage with regulated organizations nationally, particularly in healthcare, defense, finance, engineering, and legal. The discovery call works the same way regardless of geography; site visits to the Raleigh facility are available for buyers who want to see the infrastructure before committing.

Can you train our team alongside the build?

Yes. Training and enablement is one of the seven service lines in the catalogue, and we routinely fold AI-team training into a build engagement so that the in-house team can take over operations after Stage 03. The training curriculum covers prompt design, retrieval pipeline maintenance, evaluation harness operation, model-version change-control, and the audit log review process. We also publish executive-level briefings for the leadership team that owns the AI program politically.

What is the difference between this pillar and the AI Services Catalogue page?

This pillar (/ai/) is the top-level AI hub: who we are, what categories we ship, the trust band, the methodology, the industry coverage. The AI Services Catalogue page goes deeper on the seven service lines themselves with the build-vs-buy framework, the integration consulting playbook, the automation consulting model, and the hiring-vs-partnering analysis. Most buyers read this pillar first to understand the practice, then read the services catalogue to pick the specific entry lane.

How do you compare to other AI services firms?

Three honest differences. First, we are a regulated-vertical practice: HIPAA, CMMC L1 / L2 / L3, NIST 800-171 / 800-172, GLBA, ITAR, attorney-client privilege, and contract-clause data residency are first-class constraints, not exceptions we handle case-by-case. Second, we operate the private AI infrastructure ourselves rather than reselling someone else's cloud capacity. Third, we run the full lifecycle in-house (strategy, prototype, integration, custom development, infrastructure, operations) rather than handing off between vendors. Most national AI consulting firms do one or two of those well. Very few do all three for regulated buyers in the same engagement.

05 / Service areas: AI delivery across North Carolina

Ready to Scope a Private AI Engagement?

The discovery call is the entry lane regardless of which service category fits. We listen to where you are, name the data-class and regulatory frame, and recommend the lane that actually solves the problem - including "you do not need us yet, here is what to do internally first" when that is the right answer.