The AI Readiness Assessment for Regulated and Defense Workloads
Most "AI readiness" deliverables are slide decks. Petronella Technology Group ships a 7-dimension diagnostic that ends with a private LLM pilot architecture, a CMMC and HIPAA boundary map, and a 30-day plan you can execute. Built for CIOs and CTOs whose risk profile rules out public ChatGPT and public Claude for sensitive data.
"Are we AI-ready?" is the wrong question. The right question has seven answers.
Every CIO we meet has the same problem. The board wants an AI strategy. The CFO wants ROI. The CISO wants to know whether typing controlled unclassified information into a public model creates a reportable incident under DFARS 252.204-7012. The compliance officer wants a documented basis under the HIPAA Security Rule, 45 CFR 164.308. And the engineering team has already been quietly using a public assistant for code review because nobody told them to stop.
"AI readiness" is not a yes or no. It is a 7-dimension scorecard. The dimensions interact. You can have world-class data quality and still fail readiness if your governance has no model-use policy. You can have a deeply skilled team and still fail if your network has no boundary for AI traffic. You can buy the GPUs and still fail if change management has not been briefed on prompt-injection risk in the new agentic workflows.
The Petronella Technology Group AI Readiness Assessment is a fixed-scope, fixed-deliverable engagement that scores your environment across all seven dimensions, identifies the two or three gaps that actually block a regulated pilot, and ends with an architecture you can implement in 30 days. The assessment is a separate SKU from our private AI buildouts on purpose. We do not want the diagnostic to be a sales pitch. We want the diagnostic to be honest.
This page explains the 7-dimension framework, the regulatory traps that disqualify public LLMs from CMMC and HIPAA workloads, what an "AI-ready" private LLM stack actually looks like, and how the 30-day path from assessment to pilot is structured. If you finish reading and want a written proposal scoped to your environment, request it from the buttons above.
The 7-Dimension AI Readiness Score: A Diagnostic Framework
Each dimension carries equal weight in the final score. Each is graded 0 to 4, where 0 means "no plan exists" and 4 means "production-grade and audited." A passing total to begin a regulated AI pilot is 18 or above out of a possible 28, with no individual dimension below 2. Below that line, the assessment becomes a remediation plan and we delay the pilot until the lowest two dimensions are lifted.
Data
Is your training and retrieval corpus inventoried, classified by sensitivity, and clean enough that a model trained on it will not learn embarrassing or non-compliant content?
Governance
Do you have a written AI use policy, an approved-model list, a denied-model list, and a designated owner for AI risk decisions at the executive level?
Security
Is the network boundary for AI traffic defined and monitored? Are prompts and outputs logged in a way that survives an incident response request?
Privacy
Are PHI, PII, and CUI flows mapped so you can prove a public LLM never sees them? Are subject-access and right-to-delete obligations honored by your AI stack?
Talent
Do you have at least one engineer who can deploy and operate an inference server, and at least one compliance role who can read an AI risk register?
Infrastructure
Is there a path to GPU capacity, network egress controls, an MCP allowlist, and the storage tiers required for vector indexes and audit logs?
Change Management
Has the workforce been briefed on AI usage rules, trained on prompt-injection awareness, and given a sanctioned alternative to whatever they were using unofficially?
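The pass rule above is simple enough to express in a few lines. The following is a minimal sketch, not our scoring tool: the dimension keys, the `Readiness` structure, and the `score_readiness` function are names invented for illustration. Only the 0-to-4 scale, the threshold of 18, and the floor of 2 come from the framework itself.

```python
from dataclasses import dataclass

# Hypothetical keys for the seven dimensions described above.
DIMENSIONS = (
    "data", "governance", "security", "privacy",
    "talent", "infrastructure", "change_management",
)
PASS_TOTAL = 18    # minimum total to begin a regulated pilot (maximum is 7 * 4 = 28)
MIN_DIMENSION = 2  # no single dimension may score below this


@dataclass
class Readiness:
    total: int
    passed: bool
    blocking: list    # dimensions scoring below MIN_DIMENSION
    lowest_two: list  # the two dimensions to lift first


def score_readiness(scores: dict) -> Readiness:
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in DIMENSIONS:
        if not 0 <= scores[name] <= 4:
            raise ValueError(f"{name} must be graded 0 to 4")
    total = sum(scores[d] for d in DIMENSIONS)
    blocking = [d for d in DIMENSIONS if scores[d] < MIN_DIMENSION]
    # sorted() is stable, so ties keep the order of DIMENSIONS.
    lowest_two = sorted(DIMENSIONS, key=lambda d: scores[d])[:2]
    passed = total >= PASS_TOTAL and not blocking
    return Readiness(total, passed, blocking, lowest_two)
```

Note that a total of 19 with one dimension at 1 still fails. That is the point of the floor: a strong average does not hide a single blocking gap.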
Data - self-assessment questions
- Have you produced a written data inventory in the last 12 months that maps every dataset to a sensitivity tier?
- Are CUI, PHI, PII, and ITAR-controlled data tagged at the file or record level, not just at the system level?
- Do you have a procedure to redact or tokenize sensitive fields before any data leaves the boundary for AI processing?
- Is at least one retrieval corpus already structured for RAG (chunked, embedded, refresh cadence defined)?
- Has a data steward signed off on what is allowed to enter the training or fine-tuning pipeline?
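To make the redact-or-tokenize question concrete, here is a minimal sketch of field tokenization before text leaves the boundary. It is an illustration under stated assumptions: the two regular expressions cover only US Social Security numbers and email addresses, and the `tokenize` function, the token format, and the keyed-hash scheme are hypothetical choices, not a complete de-identification procedure.

```python
import hashlib
import hmac
import re

# Illustrative patterns only; real PHI/PII/CUI detection needs far more coverage.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def tokenize(text: str, key: bytes):
    """Replace sensitive values with keyed tokens.

    Returns the tokenized text plus a vault mapping token -> original value.
    The vault must stay inside the boundary.
    """
    vault = {}

    def make_replacer(label):
        def replace(match):
            value = match.group(0)
            # A keyed hash gives the same token for the same value, so records
            # can still be joined without exposing the underlying data.
            digest = hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:12]
            token = f"[{label}:{digest}]"
            vault[token] = value
            return token
        return replace

    for label, pattern in PATTERNS.items():
        text = pattern.sub(make_replacer(label), text)
    return text, vault
```

Pattern matching like this catches well-formed identifiers and nothing else. Names, free-text clinical detail, and CUI markings need record-level tagging, which is why the second question in the list above matters.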
Governance - self-assessment questions
- Does your acceptable-use policy explicitly name public LLMs and state where they may and may not be used?
- Is there a designated AI risk owner at the C-suite or VP level with documented authority?
- Have you adopted a published framework (NIST AI RMF, ISO 42001, or equivalent) as the basis for AI governance?
- Do you have a model-approval gate that requires a security and privacy review before any new model goes into a workflow?
- Is there a board-visible quarterly report on AI risk and use?
Security - self-assessment questions
- Is outbound traffic to public LLM APIs (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com) explicitly allowed or blocked at the egress firewall?
- Are all prompts and completions logged with user attribution and retained at least 12 months?
- Have you tested for prompt injection on every agent that retrieves untrusted content (email, web, support tickets)?
- Is the MCP (Model Context Protocol) tool allowlist enforced at the gateway, not just inside the agent code?
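The logging question is easier to answer when you can see what an attributable record looks like. The sketch below assumes a JSON record per prompt and completion pair with a 12-month retention date. The hash chain that links each record to the previous one is our illustrative addition for tamper evidence, and `audit_record` and `verify_chain` are hypothetical names.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365)  # the 12-month minimum from the question above


def audit_record(user, model, prompt, completion, prev_hash="", now=None):
    """Build one log record with user attribution and a retention date."""
    now = now or datetime.now(timezone.utc)
    record = {
        "ts": now.isoformat(),
        "retain_until": (now + RETENTION).isoformat(),
        "user": user,
        "model": model,
        "prompt": prompt,
        "completion": completion,
        "prev_hash": prev_hash,  # links this record to the one before it
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    return record


def verify_chain(records):
    """Return True if no record was altered, removed, or reordered."""
    prev = ""
    for rec in records:
        body = {k: v for k, v in rec.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True).encode()
        if rec["prev_hash"] != prev:
            return False
        if hashlib.sha256(payload).hexdigest() != rec["hash"]:
            return False
        prev = rec["hash"]
    return True
```

A chain like this detects edits after the fact. It does not prevent them, so the log store still needs its own access controls.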
Privacy, Talent, Infrastructure, Change Management - self-assessment questions
- Privacy: Have you mapped every PHI, PII, and CUI flow to a model and confirmed that flow stays inside the boundary?
- Talent: Can at least one internal engineer stand up a vLLM or Ollama server without external help inside one business day?
- Infrastructure: Do you have GPU capacity, vector storage, and audit log storage committed in writing for at least 12 months?
- Change Management: Has every workforce member received written guidance on which AI tools are sanctioned, with a sanctioned alternative for each banned tool?
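The privacy question above amounts to a check you can run against a flow inventory. A minimal sketch, assuming each flow has been recorded with its data class and whether the receiving model sits inside the boundary; the `Flow` structure and the example flows in the usage note are hypothetical.

```python
from dataclasses import dataclass

REGULATED = {"PHI", "PII", "CUI"}


@dataclass(frozen=True)
class Flow:
    source: str            # system or dataset the data comes from
    data_class: str        # e.g. "PHI", "PII", "CUI", or "public"
    model: str             # model or endpoint that receives the data
    inside_boundary: bool  # True if inference runs in the authorized environment


def boundary_violations(flows):
    """Return every flow that sends regulated data to a model outside the boundary."""
    return [f for f in flows if f.data_class in REGULATED and not f.inside_boundary]
```

For example, an inventory containing `Flow("ehr_notes", "PHI", "public-chatbot", False)` and `Flow("blog_drafts", "public", "public-chatbot", False)` returns only the first. An empty result is what you want to be able to show an assessor.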
The scorecard is the spine of the engagement. Every other deliverable hangs on it. A passing score is not the goal. An honest score is.
Why Public LLMs Disqualify You from CMMC, HIPAA, and Defense Workloads
This section is the most useful single conversation Petronella Technology Group has with prospects. It is also the most resisted. The temptation to keep using ChatGPT, Claude.ai, the consumer edition of Microsoft Copilot, or Gemini for everyday tasks is enormous because those tools are excellent. The problem is not the quality. The problem is the boundary.
The data exfiltration vector that the legal team always misses
When a user pastes a customer record, a contract draft, or a CMMC SSP excerpt into a public chatbot, the data leaves the corporate network and is processed on infrastructure outside your control. The vendor terms of service may or may not allow training on the data. The vendor's retention window may or may not match your records retention requirement. The vendor's subprocessor list may or may not include a country that triggers a separate regulatory regime. None of those questions matter for a marketing brainstorm. All of them matter for CUI.
NIST SP 800-171 Rev 2 control 3.13.1 Boundary Protection requires you to monitor, control, and protect organizational communications at external boundaries. A user typing CUI into a public LLM endpoint is, by definition, transferring CUI across an external boundary to a system that is not part of your authorized environment. There is no enterprise feature toggle that fixes this for the consumer products. The fix is architectural.
OCR and HIPAA: the precedent that should worry every healthcare CIO
The Office for Civil Rights has historically treated unauthorized PHI disclosure to a service that is not a HIPAA Business Associate as a reportable breach. The arrival of generative AI does not change that posture. If a clinician pastes a patient note into a public LLM and that vendor has not signed a HIPAA Business Associate Agreement, the act of pasting is the disclosure. The vendor's later non-retention is not a defense. The disclosure has already happened.
Some vendors now offer enterprise tiers with BAA execution. Those tiers reduce the legal exposure but do not eliminate the architectural question. The data still leaves the building. The audit trail still depends on the vendor's logging. The provenance of a generated answer still cannot be traced to specific training records. A regulator asking "what model produced this clinical recommendation, and what data was it trained on" has only one fully defensible answer: a model you control on infrastructure you own.
Acceptable public-LLM use, said plainly
Petronella Technology Group is not anti-public-LLM. We use public APIs every day for non-sensitive workloads where the productivity gain is real and the data is not regulated. The honest dividing line:
| Workload | Public LLM acceptable? | Why |
|---|---|---|
| Marketing draft, public blog ideation | Yes | Public information, no regulated content. |
| Public-facing FAQ writing | Yes | No CUI, no PHI, no PII. |
| Code review on open-source repos | Yes | Code is already public. |
| Code review on proprietary repos | Conditional | Only with enterprise contract, no-training guarantee, and IP indemnity. |
| Customer support drafts with names removed | Conditional | Only after verified de-identification. |
| Any CUI, ITAR, or DFARS 7012 data | No | NIST SP 800-171 3.13.1 boundary violation. Use a private LLM inside the boundary. |
| Any PHI | No | HIPAA disclosure without BAA. Use a private LLM with a BAA-covered architecture. |
| Attorney work product, privileged communication | No | Risk of waiver. Use a private model with privilege controls. |
| Financial data subject to GLBA or SOX | No | Disclosure to unauthorized processor. |
The AI Readiness Assessment includes a workforce-level survey that produces this same table customized to your environment, plus a workforce communication plan to retire the unsanctioned tools without losing the productivity they delivered. The replacement is not "no AI." The replacement is a private LLM stack tuned for the workloads that matter.
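A table like the one above is most useful when a gateway or an intake form can look it up. The sketch below encodes it as data. The workload keys, the `Decision` enum, and the default-deny rule for unknown workloads are assumptions made for this illustration, and the decisions themselves mirror the table.

```python
from enum import Enum


class Decision(Enum):
    YES = "yes"
    CONDITIONAL = "conditional"
    NO = "no"


# Hypothetical workload keys; the decisions mirror the table above.
WORKLOAD_POLICY = {
    "marketing_draft": Decision.YES,
    "public_faq": Decision.YES,
    "code_review_open_source": Decision.YES,
    "code_review_proprietary": Decision.CONDITIONAL,
    "support_draft_deidentified": Decision.CONDITIONAL,
    "privileged_legal": Decision.NO,
    "glba_sox_financial": Decision.NO,
}

# Data tags that rule out a public LLM regardless of the workload.
BLOCKING_TAGS = {"CUI", "ITAR", "PHI"}


def public_llm_decision(workload, data_tags=frozenset()):
    if BLOCKING_TAGS & set(data_tags):
        return Decision.NO
    # Unknown workloads are denied by default (an assumption of this sketch).
    return WORKLOAD_POLICY.get(workload, Decision.NO)
```

The data tag check runs first on purpose. A marketing draft that happens to contain PHI is a PHI workload.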
Private LLM Infrastructure: What "AI-Ready" Looks Like for Regulated Industries
The phrase "private AI" is overloaded. Some vendors mean "shared cloud tenant labeled private." Some mean "dedicated VPC inside the same hyperscaler that runs the public model." Petronella Technology Group means the data and the inference both occur on hardware you authorize, under controls you can show to a C3PAO or to OCR.
The reference architecture
The Petronella Technology Group reference private LLM stack has six layers, each chosen to satisfy a specific control requirement:
- Hardware tier. On-prem GPU clusters sourced through the NVIDIA Elite Partner Channel, sized to the model class (typically NVIDIA H100, H200, or L40S for production; consumer-class GPUs for development). Clusters live in your data center, our Raleigh facility, or a colocation cage you control.
- Inference layer. vLLM for high-throughput batched serving, Ollama for low-friction single-tenant workstation deployment, llama.cpp for edge devices. Selection driven by latency, concurrency, and quantization requirements.
- Model layer. Open-weight models (Llama 3.1, Qwen, Mistral, DeepSeek, Mixtral, or fine-tuned derivatives) chosen to fit the workload. No call to a third-party API for the regulated path.
- Boundary layer. Egress firewall rules that block the public LLM endpoints by default, with audited exceptions for non-regulated workloads. MCP gateway with a strict tool allowlist. All inbound and outbound prompts pass through a logging proxy.
- Retrieval layer. Vector store (pgvector, Qdrant, or Weaviate) inside the boundary, populated from sources that have been classified and approved. Re-embedding cadence defined per source.
- Defense layer. Prompt-injection detection on every untrusted retrieval (email content, web scrapes, support tickets). Output filtering for PII leakage. Rate limiting and anomaly detection on per-user prompt volume.
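The boundary layer's default-block rule with audited exceptions can be modeled in a few lines. This is a sketch of the decision logic only, not firewall configuration: the `egress_decision` function, the segment names, and the approval references are hypothetical, and the three hostnames are the public LLM API endpoints named in the Security self-assessment.

```python
PUBLIC_LLM_HOSTS = frozenset({
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
})


def egress_decision(host, segment, exceptions):
    """Return (allowed, reason) for an outbound connection.

    exceptions maps (segment, host) to an approval reference, so every
    allowed call to a public LLM endpoint traces back to a recorded decision.
    """
    host = host.strip().lower().rstrip(".")
    if host not in PUBLIC_LLM_HOSTS:
        # Outside the scope of this rule; other egress rules still apply.
        return True, "not a listed public LLM endpoint"
    approval = exceptions.get((segment, host))
    if approval:
        return True, f"audited exception {approval}"
    return False, "public LLM endpoint blocked by default"
```

Keying exceptions on the network segment as well as the host keeps a marketing exception from quietly becoming an engineering exception.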
Edge inference for distributed workforces
Not every workload belongs in the data center. Field offices, mobile clinicians, and engineering staff working from sites without a high-bandwidth link back to base often need inference at the edge. The Petronella Technology Group pattern for edge inference uses ruggedized small-form-factor servers with consumer or workstation-class GPUs, an Ollama or llama.cpp runtime, and a synchronized policy and retrieval bundle pushed nightly from the central cluster. Prompts and completions are buffered and synchronized when connectivity returns. Logs survive intermittent connectivity.
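The buffer-and-sync behavior described above can be sketched with a local SQLite queue. This is an illustration under assumptions, not the deployed design: the `EdgeLogBuffer` class, its table layout, and the `upload` callback that raises `ConnectionError` when the link is down are all hypothetical.

```python
import json
import sqlite3


class EdgeLogBuffer:
    """Durable local queue for prompt and completion logs on an edge node."""

    def __init__(self, path=":memory:"):
        # Use a file path in practice so records survive a restart.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self.db.commit()

    def record(self, user, prompt, completion):
        body = json.dumps({"user": user, "prompt": prompt, "completion": completion})
        self.db.execute("INSERT INTO pending (body) VALUES (?)", (body,))
        self.db.commit()

    def pending(self):
        return self.db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]

    def flush(self, upload):
        """Send buffered records in order; stop at the first connection failure."""
        sent = 0
        rows = self.db.execute("SELECT id, body FROM pending ORDER BY id").fetchall()
        for row_id, body in rows:
            try:
                upload(json.loads(body))
            except ConnectionError:
                break  # keep the remaining records for the next attempt
            # Delete only after the upload succeeded, so nothing is lost.
            self.db.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent
```

Deleting after a successful upload means a record can be sent twice if the node fails between the two steps, so the central side should tolerate duplicates. Losing a record is the worse failure for an audit trail.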
MCP allowlist and the agentic workflow problem
The Model Context Protocol lets a language model call external tools. That capability is the reason agentic workflows exist. It is also the largest new attack surface in the AI stack. A model that can read email can be told by email content to send email. A model that can read tickets can be told by ticket content to exfiltrate database records. The Petronella Technology Group default posture is allowlist-only at the MCP gateway, with every tool requiring explicit grant per agent, per user, and per data class. The assessment phase audits any existing agent for over-broad tool grants and produces a remediation plan.
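An allowlist-only posture with grants per agent, per user, and per data class reduces to a membership test at the gateway. The sketch below shows that check in isolation. It does not use any MCP library, and the `Grant` and `ToolGateway` names, along with the agent and tool names in the usage note, are hypothetical.

```python
from typing import NamedTuple


class Grant(NamedTuple):
    agent: str
    user: str
    data_class: str
    tool: str


class ToolGateway:
    """Allowlist-only authorization: anything not explicitly granted is denied."""

    def __init__(self, grants):
        self._grants = frozenset(grants)

    def authorize(self, agent, user, data_class, tool):
        # All four fields must match an explicit grant. There are no wildcards,
        # so a broad grant cannot be created by accident.
        return Grant(agent, user, data_class, tool) in self._grants
```

With a single grant of `Grant("triage-bot", "alice", "internal", "read_ticket")`, the same agent asking for `send_email`, or asking for `read_ticket` against CUI, is refused. Enforcing this at the gateway rather than in agent code means a compromised prompt cannot talk its way into a new tool.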
Prompt-injection defense, said in plain language
Prompt injection is the attack where an attacker hides instructions inside content the model will read - a footer in an email, a comment in a code file, a row in a spreadsheet. The model has no built-in concept of "trusted instruction" versus "attacker-supplied content," so it can follow the attacker's text as if it came from you. Defense is layered: separate the system prompt cryptographically, sanitize retrieval inputs, restrict tool grants to least privilege, monitor for output anomalies, and require human approval for any agent action that mutates external state. The assessment scores your existing exposure and the architecture phase bakes the defenses into the pilot.
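Two of the layers above, screening retrieved content and requiring human approval for state-changing actions, can be sketched together. Treat this as an illustration only: the phrase patterns are a small hypothetical sample that a determined attacker will evade, and the tool names and the `gate_action` function are invented for the example.

```python
import re

# A small illustrative sample; pattern lists like this are easy to evade
# and are only one layer of the defense.
SUSPICIOUS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all |any )?(previous|prior|above) instructions",
        r"disregard (the )?(system|previous) prompt",
        r"reveal (your|the) system prompt",
    )
]

# Hypothetical tools that change external state.
MUTATING_TOOLS = {"send_email", "update_ticket", "write_file"}


def flag_injection(content):
    """Return the patterns that matched the retrieved content, if any."""
    return [p.pattern for p in SUSPICIOUS if p.search(content)]


def gate_action(tool, retrieved, approved_by_human=False):
    """Decide what to do with an agent action that followed untrusted retrieval."""
    if flag_injection(retrieved):
        return "quarantine"        # hold the content for review
    if tool in MUTATING_TOOLS and not approved_by_human:
        return "needs_approval"    # a human confirms before state changes
    return "allow"
```

The approval gate does not depend on the screening step catching anything. That independence is what layered means here: an injection that slips past the patterns still cannot send email on its own.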
30-Day Path from Assessment to Private LLM Pilot
The assessment is week one. The remainder is execution. The pilot at day 30 is a working private LLM running a single high-value workflow inside your boundary, with logs, an evidence pack, and a path to expand.
Week 1 - Discovery
Stakeholder interviews, current-state data inventory, workforce survey on existing AI usage, regulatory scope review, network diagram for AI traffic. Output: 7-dimension score with evidence.
Week 2 - Gap Analysis
Score-to-target gap matrix, ranked remediation actions, public-LLM phaseout plan with sanctioned replacement per workload, regulatory exposure narrative.
Week 3 - Architecture
Reference architecture customized to your environment, model selection, hardware sizing, boundary firewall rule set, MCP allowlist, logging design, evidence-collection plan.
Week 4 - Pilot
Stand up the private LLM stack on lab or first-production hardware, deploy the chosen workload (typical first picks: internal RAG search, secure code review, drafting under privilege), demonstrate end-to-end with logging.
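The score-to-target gap matrix from the gap-analysis phase is a small calculation once the scores exist. A minimal sketch follows. The `gap_matrix` function is a hypothetical name, and the ranking rule (largest gap first, then lowest current score) is an assumption of this example rather than a fixed part of the method.

```python
def gap_matrix(current, target):
    """Rank dimensions by how far the current score falls short of the target.

    Returns rows of (dimension, current, target, gap), largest gap first.
    Dimensions already at or above target are left out.
    """
    rows = [
        (dim, current[dim], target[dim], target[dim] - current[dim])
        for dim in target
        if target[dim] > current[dim]
    ]
    # Ties on gap size go to the lower current score, then alphabetical order.
    return sorted(rows, key=lambda row: (-row[3], row[1], row[0]))
```

With current scores of 3 for data, 1 for governance, and 2 for security against a target of 3 everywhere, governance ranks first with a gap of 2, security second with a gap of 1, and data drops out of the list.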
What you walk away with at day 30
- A signed AI Readiness Score with per-dimension evidence and remediation actions.
- A boundary diagram that shows exactly where regulated data flows and stops.
- A working private LLM inference endpoint on hardware you control, serving at least one workflow.
- A logging design that holds up under incident response, an OCR audit, or a CMMC assessment.
- An evidence pack mapped to NIST 800-171 controls, HIPAA Security Rule sections, and ISO 42001 clauses as applicable.
- A workforce communication and training plan for the sanctioned AI tools.
- A 90-day expansion roadmap for the next two workloads.
The 30-day window assumes timely access to your network engineering and compliance contacts. Larger organizations with multiple business units typically extend Week 3 by one to two weeks. The assessment SKU and the pilot SKU are priced and contracted separately so you can stop at any phase boundary.
Real Credentials, Real Engagements
Petronella Technology Group has run cybersecurity, compliance, and now AI engagements for regulated organizations since 2002. The team is small on purpose. Every engagement has a named principal. We do not pad bid teams with junior staff and bill them as senior. The credentials below are real, verifiable, and current.
You will not find fabricated case studies on this page. We deliver real work for real clients, but we do not publish their names without written permission. On a discovery call we are happy to talk through anonymized engagement patterns relevant to your industry, including healthcare, defense industrial base, professional services, and engineering firms.
FAQ: AI Readiness Assessment
How long does the AI Readiness Assessment take?
Do I have to commit to a private LLM buildout to start the assessment?
What if our score is too low to start a pilot?
Can the assessment cover both CMMC and HIPAA scopes?
Do you assess organizations with no existing AI usage?
How do you handle data we share during discovery?
Ready to score your AI readiness honestly?
Two-week assessment, 7-dimension scorecard, gap analysis, reference architecture, and a 30-day path to a private LLM pilot you can defend in a regulator's office. Custom-quoted to your environment.