Sovereign GenAI · 8x H100 · 90 days
Your hardware. Your keys. Your model weights. In your data centre.
A customer-owned LLM appliance — 8x H100 or H200 nodes, NVLink fabric, behind your Thales / Entrust HSM. Llama 3, Mistral, Qwen, DeepSeek, Jais — your choice, your weights. Deployed and STIG-hardened in under 90 days.
0 bytes egressed · 100 % customer-held keys · TDRA-registered · NVIDIA Elite Partner
NVLink fabric
HSM · Thales Luna
vLLM cluster
0 bytes
Data egressed across all deployments
100 %
Customer-held keys (Thales / Entrust)
90 days
From signed SoW to go-live
8 → 32
H100 GPUs in the reference variants
Why the SaaS path is closed for your workload class
The line your CISO has already drawn. We respect it.
For citizen records, signals-adjacent data, customer KYC, and intelligence-adjacent corpora, the residual telemetry plane on a multi-tenant SaaS LLM is structurally non-compliant. The question is what to deploy instead.
OpenAI Enterprise / ChatGPT Enterprise
US-resident inference, US-held keys. Structurally non-compliant for the workloads in scope — there is no path to in-country residency.
Azure OpenAI UAE North
Geographically in-country, but the keys, the control plane, the telemetry plane, and the tenancy fabric are Microsoft-managed and multi-tenant. For the CISOs we sell to, this is the wrong side of the line.
AWS Bedrock / Google Vertex AI
Same structural objection as Azure, plus weaker regional residency in the GCC. Ruled out for citizen / KYC / signals-adjacent workload classes.
G42 / Presight / Core42 cloud-LLM
Strong sovereign posture. Excellent for many buyers. For peer entities or buyers with a vendor-neutrality mandate, a customer-owned appliance is the requested posture — and we are the vendor-neutral alternative.
The sovereign LLM appliance
One reference, three deployment postures.
The same physical architecture, with three network zonings depending on data classification. The customer-managed boundary is identical in all three; only the perimeter moves.
Reference architecture
8x H100 SXM5 · NVLink · HSM-backed · vLLM + Triton
Operations boundary: Brocode operates the stack under SLA during deployment and warranty; customer team takes joint operations from day 15. The platform-operations pattern is documented in MLOps & AI Infrastructure.
Posture 1
Air-gapped
Zero external network. Model updates on signed media. Default for Restricted / Top-Secret-equivalent classifications.
Posture 2
DMZ-gated
Customer-controlled DMZ. Brocode operational access via customer-managed jump host with full session recording. Default for Confidential workloads.
Posture 3
Sovereign cloud-bursted
Steady-state on-appliance with overflow to in-country sovereign cloud (G42 Core42) under customer-managed keys. Default where elastic capacity matters.
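As a rough illustration of how the three defaults above could be encoded in an intake tool, the sketch below maps a classification label to a posture. The label strings, the `Posture` enum, and the rule that classification takes precedence over elasticity are assumptions made for this example, not part of the reference architecture.

```python
from enum import Enum


class Posture(Enum):
    AIR_GAPPED = "air-gapped"
    DMZ_GATED = "dmz-gated"
    CLOUD_BURSTED = "sovereign cloud-bursted"


def default_posture(classification: str, needs_elastic_capacity: bool = False) -> Posture:
    """Return the default posture for a classification label.

    Assumption: the classification decides first, and elastic capacity is
    only considered for labels that have no stated default.
    """
    label = classification.strip().lower()
    if label in {"restricted", "top-secret"}:  # hypothetical label strings
        return Posture.AIR_GAPPED
    if label == "confidential":
        return Posture.DMZ_GATED
    if needs_elastic_capacity:
        return Posture.CLOUD_BURSTED
    raise ValueError(f"no default posture for classification {classification!r}")
```

A real intake would use the customer's own classification scheme and let the security team override the default.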
Model menu
Seven open-weight models. Your choice. Your weights.
The base models we ship by default. All open-weight, all running fully customer-resident. The customer-specific choice is informed by Arabic strength, context length, throughput, and licence posture.
| Model | Arabic strength | Context (tokens) | Throughput (tokens/s, 8x H100) | Licence |
|---|---|---|---|---|
| Llama-3.1-70B-Instruct | Strong (with fine-tune) | 128k | ~210 | Meta licence (reviewed and signed by entity) |
| Llama-3.1-405B-Instruct | Strong (with fine-tune) | 128k | ~38 | Meta licence (reviewed and signed by entity) |
| Mistral-Large-2 | Moderate | 128k | ~160 | Mistral Research / Commercial |
| Mixtral-8x22B-Instruct | Moderate | 64k | ~280 | Apache 2.0 |
| Qwen2.5-72B-Instruct | Strong | 128k | ~190 | Tongyi Qianwen (commercial) |
| DeepSeek-V3 | Moderate | 128k | ~120 | DeepSeek licence |
| Jais-30B-Chat-v3 | Strong (Arabic-first) | 8k | ~340 | Open (Inception / G42) |
Throughput on a 70B-class model at 256-token input / 256-token output, FP8 / INT8 path on H100. Jais-30B is the default Arabic-first option; see Natural Language Processing for the Arabic safety eval set.
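To relate a throughput figure to a monthly token budget, a back-of-envelope conversion can help. The sketch below assumes the table figure can be read as a sustained aggregate rate, which the table itself does not state, and the `utilisation` parameter is a hypothetical knob for how busy the appliance is on average.

```python
SECONDS_PER_DAY = 86_400


def monthly_token_capacity(tokens_per_second: float,
                           utilisation: float = 1.0,
                           days: int = 30) -> float:
    """Tokens generated in `days` days at a sustained rate and average utilisation."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return tokens_per_second * SECONDS_PER_DAY * days * utilisation


# Llama-3.1-70B row (~210 tokens/s), fully utilised for 30 days
print(monthly_token_capacity(210))        # 544320000.0
# Same row at an assumed 25 % average utilisation
print(monthly_token_capacity(210, 0.25))  # 136080000.0
```

Real capacity depends on batching, prompt length, and concurrency, so the result is an upper-bound estimate rather than a sizing guarantee.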
Side-by-side
Brocode vs OpenAI / Azure / Bedrock / G42 cloud-LLM.
What each option actually delivers on the controls your CISO is signing for. Not features — controls.
| Capability | Brocode-installed appliance | Azure OpenAI UAE North | OpenAI Enterprise | AWS Bedrock | G42 cloud-LLM |
|---|---|---|---|---|---|
| Hardware physically owned by customer | |||||
| Keys held in customer HSM (Thales / Entrust) | | Microsoft-managed | Vendor-managed | | |
| In-country residency (UAE / KSA) | | UAE North | | | |
| Single-tenant inference (no shared fabric) | | Multi-tenant | Multi-tenant | Multi-tenant | Variable |
| Open-weight model choice (Llama / Mistral / Qwen / DeepSeek / Jais) | Limited | Limited | |||
| Air-gapped deployment option | |||||
| TDRA / NESA / STIG hardening pack | Partial | ||||
| Customer keeps weights if engagement ends | Conditional |
The three objections from your CISO
What gets asked in the security review.
Objection 1
Data sovereignty in writing — prompts, embeddings, weights, logs, all in country, under our control.
A written residency posture document is delivered on day one of the engagement. The customer HSM holds the keys, the customer KMS issues envelopes, and telemetry terminates inside your boundary. The air-gapped posture is supported with no external network egress at all.
Objection 2
Open-weight is great until I have to maintain it. Who patches Llama?
Brocode does, as part of the support line. Quarterly bundles with regression reports against your safety eval. You decide whether to promote. We have refused two upstream releases in the last year for Arabic safety regression.
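One way to picture the regression report behind a promote-or-refuse decision is a per-metric comparison of a candidate bundle against the current baseline. This is a minimal sketch: the metric names, the "higher is better" scoring, and the 0.01 tolerance are assumptions for illustration, not Brocode's actual eval format.

```python
from __future__ import annotations


def regressed_metrics(baseline: dict[str, float],
                      candidate: dict[str, float],
                      tolerance: float = 0.01) -> list[str]:
    """Return baseline metrics on which the candidate is worse by more than `tolerance`.

    Scores are assumed to be higher-is-better. A metric that is missing from
    the candidate report is treated as a regression.
    """
    flagged = []
    for metric, base_score in sorted(baseline.items()):
        cand_score = candidate.get(metric)
        if cand_score is None or base_score - cand_score > tolerance:
            flagged.append(metric)
    return flagged


def recommend_promotion(baseline: dict[str, float], candidate: dict[str, float]) -> bool:
    """Recommend promotion only when no metric regressed; the customer still decides."""
    return not regressed_metrics(baseline, candidate)
```

For example, a candidate that drops a hypothetical `arabic_safety` score from 0.95 to 0.90 would be flagged and not recommended, while the final call stays with the customer.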
Objection 3
Capex justification — how do I defend this to the CFO vs the Azure OpenAI consumption line?
TCO crossover sits between month 14 and month 22 depending on traffic class. Below 50M tokens / month we will tell you Azure is cheaper. The actual argument is residency + key custody + multi-year cost certainty.
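The crossover argument reduces to simple arithmetic: the appliance pays back once the cumulative cloud bill exceeds capex plus cumulative operating cost. The sketch below shows that calculation under a flat-cost assumption. Every figure in the usage example is a placeholder, not Brocode or Azure pricing.

```python
import math
from typing import Optional


def crossover_month(capex: float,
                    monthly_opex: float,
                    monthly_cloud_cost: float) -> Optional[int]:
    """First month in which cumulative appliance cost <= cumulative cloud cost.

    Assumes flat monthly costs on both sides. Returns None when the cloud
    bill never exceeds the appliance's running cost, i.e. cloud stays cheaper.
    """
    monthly_saving = monthly_cloud_cost - monthly_opex
    if monthly_saving <= 0:
        return None
    return math.ceil(capex / monthly_saving)


# Placeholder figures in arbitrary currency units
print(crossover_month(capex=1_800_000, monthly_opex=20_000, monthly_cloud_cost=120_000))  # 18
# Low-traffic case: the cloud bill is below appliance opex, so there is no crossover
print(crossover_month(capex=1_800_000, monthly_opex=20_000, monthly_cloud_cost=15_000))   # None
```

The `None` branch corresponds to the low-traffic case described above, where the consumption line is the cheaper option.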
90-day delivery
Week-by-week. Cleared-personnel intake for federal / defence.
A delivery plan that respects security review timelines. The classification ceiling on your form dictates the cleared-personnel onboarding in weeks 1–3.
Weeks 1–3
Site survey + HSM / KMS / network design
Residency posture document signed by customer legal. HSM / KMS design agreed with your security team. Network zoning (air-gap, DMZ, or cloud-burst) confirmed. For federal / defence: cleared-personnel onboarding.
Residency posture signed
Weeks 4–8
Hardware delivery + STIG hardening
NVIDIA H100 / H200 nodes delivered, racked, and integrated. NVLink fabric verified. STIG / CIS hardening applied to every host. Firmware attestation chain documented. Optional Confidential Computing (CC-on) for Top-Secret tiers.
STIG / CIS hardening pass
Weeks 9–11
Model deployment + RAG + safety eval
vLLM + Triton + TensorRT-LLM stack stood up. Base models loaded (Llama / Mistral / Qwen / DeepSeek / Jais — customer choice). RAG corpus ingested via Brocode Arabic OCR + Unstructured.io. Llama Guard 3 + NeMo Guardrails safety eval gates configured.
Safety eval gates green
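For the vLLM part of the serving stack, a launch command for one 8-GPU node might be assembled as follows. This is a sketch only: it covers vLLM alone (not Triton or TensorRT-LLM), the local model path is hypothetical, and the context length simply mirrors the 128k figure in the model table.

```python
from __future__ import annotations


def vllm_serve_args(model_path: str,
                    tensor_parallel_size: int = 8,
                    max_model_len: int = 131_072) -> list[str]:
    """Build an argv list for `vllm serve`, sharding one model across the node's GPUs."""
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be at least 1")
    return [
        "vllm", "serve", model_path,
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--max-model-len", str(max_model_len),
    ]


# Hypothetical on-appliance path to customer-held weights
args = vllm_serve_args("/models/Llama-3.1-70B-Instruct")
print(" ".join(args))
```

The list can be handed to `subprocess.run` on the appliance. Quantisation, batching, and networking options are left out because they depend on the chosen model and posture.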
Weeks 12–13
PenTest + customer acceptance
Third-party PenTest. Customer acceptance test against your written criteria. Operational runbooks reviewed by your SRE lead. Compliance pack — TDRA, NESA, CIS, FIPS 140-3 mapping — handed to your audit team.
PenTest pass · customer countersign
Week 14
Go-live + handover
Production cutover. 24-hour Brocode-led on-call. Customer team takes joint operations from day 15. Full operational handover — runbooks, dashboards, retraining cadence — countersigned by your platform lead.
Customer team operating · day 15
Anonymised references
Three live appliances. Customer details under NDA only.
UAE federal entity — air-gapped
32x H100 sovereign appliance, fully air-gapped. Llama-3.1-70B + Jais-30B in production for 1,400 internal users. Zero data egress, full TDRA review passed. RAG over the entity's internal Arabic correspondence corpus, all in-appliance.
1,400 users · 0 egress
GCC tier-1 bank — HSM-backed
16x H100 appliance behind Thales Luna HSM, customer-owned KMS. RAG over 4.2 million internal documents (Arabic + English). Used for internal copilot + KYC summarisation under CBUAE / SAMA model-risk review.
4.2M docs · CBUAE-approved
Defence-adjacent prime
8x H100 air-gapped node, Mistral-Large-2 fine-tuned on the entity's technical corpus. Full STIG / CIS hardening, third-party PenTest pass. Cleared-personnel delivery from week 1. Operational by week 13.
STIG pass · cleared delivery
Customer logos remain anonymised indefinitely on this page. Case-study deltas reviewable under NDA via Government & Public Sector or Banking & Financial Services.
Free download
The Sovereign LLM Reference Architecture & 36-Month TCO Pack
A 48-page technical pack covering the full 8x H100 reference architecture, the Excel TCO model, the hardware bill of materials, and the STIG hardening checklist.
- Hardware BoM — 8x H100 SXM5 reference node (and 16x, 32x variants)
- Power and cooling sizing for UAE / KSA data centres
- vLLM + Triton + TensorRT-LLM serving stack — config and tuning
- 36-month TCO model vs Azure OpenAI UAE North and OpenAI Enterprise
- Customer HSM / KMS patterns — Thales Luna and Entrust nShield
FAQ
What CISOs and procurement leads ask first.
Eight questions our principal infrastructure architect answers in nearly every confidential session. No marketing softening.
Three structural controls. First, the appliance is physically inside your data centre or your sovereign cloud tenancy — there is no Brocode-managed control plane reaching in. Second, all telemetry, logs, and metrics terminate inside your boundary; no SaaS observability vendor sees your traffic. Third, on air-gapped deployments there is no external network egress at all — model updates are delivered on signed media and verified against the customer-held attestation key. We provide a written residency posture document on day one of the engagement that legal can sign.
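Part of verifying an update delivered on media is checking each file against a digest manifest. The sketch below shows only that digest step, using SHA-256 from the Python standard library. Verifying the signature over the manifest itself against the customer-held key is not shown, and the manifest layout (file name to hex digest) is an assumption for the example.

```python
from __future__ import annotations

import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def failed_entries(manifest: dict[str, str], payloads: dict[str, bytes]) -> list[str]:
    """Return file names that are missing, altered, or not listed in the manifest.

    `manifest` maps file name -> expected SHA-256 hex digest (assumed layout).
    An empty result means every file on the media matched the manifest.
    """
    failed = []
    for name, expected in sorted(manifest.items()):
        data = payloads.get(name)
        # compare_digest avoids leaking how much of the digest matched
        if data is None or not hmac.compare_digest(sha256_hex(data), expected.lower()):
            failed.append(name)
    # Files present on the media but absent from the manifest are rejected too
    failed.extend(sorted(set(payloads) - set(manifest)))
    return failed
```

An update would be loaded only when the manifest signature verifies and `failed_entries` returns an empty list.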
Confidential design session
Sixty minutes with our principal infrastructure architect — under NDA.
Seven fields — classification, concurrent users, token budget, deployment posture, key custody, regulator, base model preference. Submissions at Restricted or above route to the cleared intake.
Or skip the form.
Message our principal infrastructure architect directly on WhatsApp, under NDA.
Continue exploring
Related capabilities and stories
MLOps & AI Infrastructure
Registry, retraining, drift, and lineage for the appliance estate.
Document Intelligence
Arabic OCR ingest feeding the appliance RAG layer.
Natural Language Processing
Arabic safety eval and intent classification on top of the LLM.
Government & Public Sector
Sovereign GenAI deployments for federal entities.
Banking & Financial Services
Tier-1 bank copilot and KYC summarisation under CBUAE / SAMA.