Brocode Solutions · AI Software Development

Sovereign GenAI · 8x H100 · 90 days

Your hardware. Your keys. Your model weights. In your data centre.

A customer-owned LLM appliance: nodes of 8x H100 or H200 GPUs on an NVLink fabric, behind your Thales / Entrust HSM. Llama 3, Mistral, Qwen, DeepSeek, or Jais. Your choice, your weights. Deployed and STIG-hardened in under 90 days.

0 bytes egressed · 100 % customer-held keys · TDRA-registered · NVIDIA Elite Partner

[Diagram: customer-owned 6U rack · 8x H100 (00 to 07) · NVLink fabric · HSM (Thales Luna) · vLLM cluster]

  • 0 bytes

    Data egressed across all deployments

  • 100 %

    Customer-held keys (Thales / Entrust)

  • 90 days

    From signed SoW to go-live

  • 8 → 32

    H100 GPUs in the reference variants

Why the SaaS path is closed for your workload class

The line your CISO has already drawn. We respect it.

For citizen records, signals-adjacent data, customer KYC, and intelligence-adjacent corpora, the residual telemetry plane on a multi-tenant SaaS LLM is structurally non-compliant. The question is what to deploy instead.

OpenAI Enterprise / ChatGPT Enterprise

US-resident inference, US-held keys. Structurally non-compliant for the workloads in scope — there is no path to in-country residency.

Azure OpenAI UAE North

Geographically in-country, but the keys, the control plane, the telemetry plane, and the tenancy fabric are Microsoft-managed and multi-tenant. For the CISOs we sell to, this is the wrong side of the line.

AWS Bedrock / Google Vertex AI

Same structural objection as Azure, plus weaker regional residency in the GCC. Ruled out for citizen / KYC / signals-adjacent workload classes.

G42 / Presight / Core42 cloud-LLM

Strong sovereign posture. Excellent for many buyers. For peer entities or buyers with a vendor-neutrality mandate, a customer-owned appliance is the requested posture — and we are the vendor-neutral alternative.

The sovereign LLM appliance

One reference, three deployment postures.

The same physical architecture, with three network zonings depending on data classification. The customer-managed boundary is identical in all three; only the perimeter moves.

Reference architecture

8x H100 SXM5 · NVLink · HSM-backed · vLLM + Triton

Customer-owned boundary:

  • HSM: Thales / Entrust
  • KMS: customer-managed
  • Compute: 8x H100 SXM5 · NVLink
  • Serving: vLLM + Triton + TensorRT-LLM
  • Models: Llama · Mistral · Jais
  • RAG store: Milvus / Qdrant
  • Network zoning: air-gap · DMZ · sovereign cloud-burst
  • STIG / CIS hardened · firmware attested · CC-on for Top-Secret tier

Operations boundary: Brocode operates the stack under SLA during deployment and warranty; customer team takes joint operations from day 15. The platform-operations pattern is documented in MLOps & AI Infrastructure.

Posture 1

Air-gapped

Zero external network. Model updates on signed media. Default for Restricted / Top-Secret-equivalent classifications.

Posture 2

DMZ-gated

Customer-controlled DMZ. Brocode operational access via customer-managed jump host with full session recording. Default for Confidential workloads.

Posture 3

Sovereign cloud-bursted

Steady-state on-appliance with overflow to in-country sovereign cloud (G42 Core42) under customer-managed keys. Default where elastic capacity matters.
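
The defaults above can be read as a small decision rule. The following is a minimal sketch of it, not Brocode's intake logic: the classification labels, the `needs_elastic_capacity` flag, and the precedence (air-gap always wins over elasticity) are illustrative assumptions.

```python
from enum import Enum


class Posture(Enum):
    AIR_GAPPED = "air-gapped"
    DMZ_GATED = "DMZ-gated"
    CLOUD_BURSTED = "sovereign cloud-bursted"


# Hypothetical labels; map your own classification scheme onto them.
AIR_GAP_CLASSES = {"restricted", "top-secret"}


def default_posture(classification: str, needs_elastic_capacity: bool = False) -> Posture:
    """Return the default deployment posture for a data classification."""
    level = classification.strip().lower()
    if level in AIR_GAP_CLASSES:
        # Assumption: Restricted and above never burst, whatever the capacity need.
        return Posture.AIR_GAPPED
    if needs_elastic_capacity:
        return Posture.CLOUD_BURSTED
    if level == "confidential":
        return Posture.DMZ_GATED
    raise ValueError(f"no default posture for classification {classification!r}")
```

A Confidential workload that needs overflow capacity resolves to the cloud-bursted posture here; whether that is acceptable in practice is a decision for the customer's security team, not something this rule settles.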

Model menu

Seven open-weight models. Your choice. Your weights.

The base models we ship by default. All open-weight, all running fully customer-resident. The customer-specific choice is informed by Arabic strength, context length, throughput, and licence posture.

Model | Arabic strength | Context | Throughput (8x H100) | Licence
Llama-3.1-70B-Instruct | Strong (with fine-tune) | 128k | ~210 t/s | Meta licence, reviewed and signed by entity
Llama-3.1-405B-Instruct | Strong (with fine-tune) | 128k | ~38 t/s | Meta licence, reviewed and signed by entity
Mistral-Large-2 | Moderate | 128k | ~160 t/s | Mistral Research / Commercial
Mixtral-8x22B-Instruct | Moderate | 64k | ~280 t/s | Apache 2.0
Qwen2.5-72B-Instruct | Strong | 128k | ~190 t/s | Tongyi Qianwen (commercial)
DeepSeek-V3 | Moderate | 128k | ~120 t/s | DeepSeek licence
Jais-30B-Chat-v3 | Strong (Arabic-first) | 8k | ~340 t/s | Open (Inception / G42)

Throughput on a 70B-class model at 256-token input / 256-token output, FP8 / INT8 path on H100. Jais-30B is the default Arabic-first option; see Natural Language Processing for the Arabic safety eval set.
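
To relate these throughput figures to a monthly token budget, a rough ceiling can be computed as below. This is a sketch under stated assumptions: it treats each table figure as sustained aggregate tokens per second for one 8x H100 appliance (the page does not say whether the figures are aggregate or per-stream), uses a 30-day month, and introduces a hypothetical `utilisation` factor.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month

# Approximate tokens/second per model, copied from the model menu table.
THROUGHPUT_TPS = {
    "Llama-3.1-70B-Instruct": 210,
    "Llama-3.1-405B-Instruct": 38,
    "Mistral-Large-2": 160,
    "Mixtral-8x22B-Instruct": 280,
    "Qwen2.5-72B-Instruct": 190,
    "DeepSeek-V3": 120,
    "Jais-30B-Chat-v3": 340,
}


def monthly_token_ceiling(tokens_per_second: float, utilisation: float = 1.0) -> int:
    """Upper bound on tokens per month at a given average utilisation (0 to 1]."""
    if not 0.0 < utilisation <= 1.0:
        raise ValueError("utilisation must be in (0, 1]")
    return int(tokens_per_second * SECONDS_PER_MONTH * utilisation)


ceiling = monthly_token_ceiling(THROUGHPUT_TPS["Llama-3.1-70B-Instruct"], utilisation=0.5)
print(f"{ceiling:,} tokens / month")  # 272,160,000 tokens / month
```

Real capacity depends on batch size, prompt length, and concurrency, so treat the result as an order-of-magnitude check rather than a sizing figure.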

Side-by-side

Brocode vs OpenAI / Azure / Bedrock / G42 cloud-LLM.

What each option actually delivers on the controls your CISO is signing for. Not features — controls.

Capability · Brocode-installed appliance · Azure OpenAI UAE North · OpenAI Enterprise · AWS Bedrock · G42 cloud-LLM
Hardware physically owned by customer
Keys held in customer HSM (Thales / Entrust) · Microsoft-managed · Vendor-managed
In-country residency (UAE / KSA) · UAE North
Single-tenant inference (no shared fabric) · Multi-tenant · Multi-tenant · Multi-tenant · Variable
Open-weight model choice (Llama / Mistral / Qwen / DeepSeek / Jais) · Limited · Limited
Air-gapped deployment option
TDRA / NESA / STIG hardening pack · Partial
Customer keeps weights if engagement ends · Conditional

The three objections from your CISO

What gets asked in the security review.

Objection 1

Data sovereignty in writing — prompts, embeddings, weights, logs, all in country, under our control.

Written residency posture document on day one of the engagement. Customer HSM holds keys, customer KMS issues envelopes, telemetry terminates inside boundary. Air-gapped posture is supported with no external network egress at all.

Objection 2

Open-weight is great until I have to maintain it. Who patches Llama?

Brocode does, as part of the support line. Quarterly bundles with regression reports against your safety eval. You decide whether to promote. We have refused two upstream releases in the last year for Arabic safety regression.
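
A promotion decision of this kind can be expressed as a simple regression gate. The sketch below is illustrative only: the suite names, the scores, and the `tolerance` threshold are hypothetical, and it is not a description of Brocode's actual regression report.

```python
def regressions(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.01,
) -> list[str]:
    """Return eval suites where the candidate is missing or scores worse
    than the baseline by more than `tolerance` (higher score = safer)."""
    failed = []
    for suite, base_score in baseline.items():
        cand_score = candidate.get(suite)
        if cand_score is None or cand_score < base_score - tolerance:
            failed.append(suite)
    return sorted(failed)


# Hypothetical scores for the model in production and an upstream release.
in_production = {"arabic_safety": 0.97, "english_safety": 0.98}
upstream_release = {"arabic_safety": 0.93, "english_safety": 0.985}

blocked_by = regressions(in_production, upstream_release)
print(blocked_by)  # ['arabic_safety']
```

An empty list means no suite regressed beyond tolerance; the decision to promote still rests with the customer.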

Objection 3

Capex justification — how do I defend this to the CFO vs the Azure OpenAI consumption line?

TCO crossover sits between month 14 and month 22 depending on traffic class. Below 50M tokens / month we will tell you Azure is cheaper. The actual argument is residency + key custody + multi-year cost certainty.
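
The crossover month is the first month in which cumulative appliance cost (capex plus running cost) drops to or below cumulative consumption spend. A minimal sketch follows; every figure in it is a placeholder chosen for illustration, not Brocode or Azure pricing, and the flat monthly costs are a simplifying assumption.

```python
from typing import Optional


def crossover_month(
    capex: int,
    appliance_opex_per_month: int,
    cloud_cost_per_month: int,
    horizon_months: int = 36,
) -> Optional[int]:
    """First month where cumulative appliance cost <= cumulative cloud cost,
    or None if that does not happen within the horizon."""
    for month in range(1, horizon_months + 1):
        appliance_total = capex + appliance_opex_per_month * month
        cloud_total = cloud_cost_per_month * month
        if appliance_total <= cloud_total:
            return month
    return None


# Placeholder figures in a single currency unit.
print(crossover_month(capex=3_600_000, appliance_opex_per_month=50_000,
                      cloud_cost_per_month=250_000))  # 18
print(crossover_month(capex=3_600_000, appliance_opex_per_month=50_000,
                      cloud_cost_per_month=60_000))   # None
```

The second call shows the low-traffic case: when monthly consumption spend is only slightly above the appliance running cost, the capex is not recovered inside a 36-month horizon.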

90-day delivery

Week-by-week. Cleared-personnel intake for federal / defence.

A delivery plan that respects security review timelines. The classification ceiling on your form dictates the cleared-personnel onboarding in weeks 1–3.

  1. Weeks 1–3

    Site survey + HSM / KMS / network design

    Residency posture document signed by customer legal. HSM / KMS design agreed with your security team. Network zoning (air-gap, DMZ, or cloud-burst) confirmed. For federal / defence: cleared-personnel onboarding.

    Residency posture signed

  2. Weeks 4–8

    Hardware delivery + STIG hardening

    NVIDIA H100 / H200 nodes delivered, racked, and integrated. NVLink fabric verified. STIG / CIS hardening applied to every host. Firmware attestation chain documented. Optional Confidential Computing (CC-on) for Top-Secret tiers.

    STIG / CIS hardening pass

  3. Weeks 9–11

    Model deployment + RAG + safety eval

    vLLM + Triton + TensorRT-LLM stack stood up. Base models loaded (Llama / Mistral / Qwen / DeepSeek / Jais — customer choice). RAG corpus ingested via Brocode Arabic OCR + Unstructured.io. Llama Guard 3 + NeMo Guardrails safety eval gates configured.

    Safety eval gates green

  4. Weeks 12–13

    PenTest + customer acceptance

    Third-party PenTest. Customer acceptance test against your written criteria. Operational runbooks reviewed by your SRE lead. Compliance pack — TDRA, NESA, CIS, FIPS 140-3 mapping — handed to your audit team.

    PenTest pass · customer countersign

  5. Week 14

    Go-live + handover

    Production cutover. 24-hour Brocode-led on-call. Customer team takes joint operations from day 15. Full operational handover — runbooks, dashboards, retraining cadence — countersigned by your platform lead.

    Customer team operating · day 15

Anonymised references

Three live appliances. Customer details under NDA only.

UAE federal entity — air-gapped

32x H100 sovereign appliance, fully air-gapped. Llama-3.1-70B + Jais-30B in production for 1,400 internal users. Zero data egress, full TDRA review passed. RAG over the entity's internal Arabic correspondence corpus, all in-appliance.

1,400 users · 0 egress

GCC tier-1 bank — HSM-backed

16x H100 appliance behind Thales Luna HSM, customer-owned KMS. RAG over 4.2 million internal documents (Arabic + English). Used for internal copilot + KYC summarisation under CBUAE / SAMA model-risk review.

4.2M docs · CBUAE-approved

Defence-adjacent prime

8x H100 air-gapped node, Mistral-Large-2 fine-tuned on the entity's technical corpus. Full STIG / CIS hardening, third-party PenTest pass. Cleared-personnel delivery from week 1. Operational by week 13.

STIG pass · cleared delivery

Customer logos remain anonymised indefinitely on this page. Case-study deltas reviewable under NDA via Government & Public Sector or Banking & Financial Services.

Free download

The Sovereign LLM Reference Architecture & 36-Month TCO Pack

A 48-page technical pack covering the full 8x H100 reference architecture, the Excel TCO model, the hardware bill of materials, and the STIG hardening checklist.

  • Hardware BoM — 8x H100 SXM5 reference node (and 16x, 32x variants)
  • Power and cooling sizing for UAE / KSA data centres
  • vLLM + Triton + TensorRT-LLM serving stack — config and tuning
  • 36-month TCO model vs Azure OpenAI UAE North and OpenAI Enterprise
  • Customer HSM / KMS patterns — Thales Luna and Entrust nShield

Instant download. No spam. Unsubscribe any time.

FAQ

What CISOs and procurement leads ask first.

Eight questions our principal infrastructure architect answers in nearly every confidential session. No marketing softening.

  • Three structural controls. First, the appliance is physically inside your data centre or your sovereign cloud tenancy — there is no Brocode-managed control plane reaching in. Second, all telemetry, logs, and metrics terminate inside your boundary; no SaaS observability vendor sees your traffic. Third, on air-gapped deployments there is no external network egress at all — model updates are delivered on signed media and verified against the customer-held attestation key. We provide a written residency posture document on day one of the engagement that legal can sign.
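
One part of verifying an update delivered on signed media is checking each file against a manifest of expected digests. The sketch below covers only that step and uses only the Python standard library. The manifest layout (relative path mapped to SHA-256 hex digest) is a hypothetical format, and verifying the manifest's own signature against the customer-held attestation key is assumed to happen separately in the HSM tooling.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large weight files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose digest differs from the manifest."""
    bad = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or not hmac.compare_digest(
            sha256_file(target), expected.lower()
        ):
            bad.append(rel_path)
    return sorted(bad)
```

An empty return value means every listed file is present and matches; any other result should block the update before it reaches the serving stack.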

Confidential design session

Sixty minutes with our principal infrastructure architect — under NDA.

Seven fields — classification, concurrent users, token budget, deployment posture, key custody, regulator, base model preference. Submissions at Restricted or above route to the cleared intake.

Or skip the form.

Message our principal infrastructure architect directly on WhatsApp, under NDA.

Quote request

Request a sovereign LLM appliance design session

Confidential 60-minute session with our principal infrastructure architect, under NDA. Cleared-personnel intake available for federal / defence engagements.

Prefer chat? Message us on WhatsApp — we'll see it within working hours.
