Failure 01
Data residency
Sovereign deployment patterns at Khazna, G42, Mobily, or on-prem.
Enterprise GenAI taskforce
Sovereign infrastructure. Your data. Your guardrails. Three boardable use cases live in 120 days under a fixed-fee delivery model. Built for the GenAI taskforce that has shipped 18 prototypes and zero production wins.
$ query rag.govregs --lang ar
- retrieving from cbuae-circulars (4,238 docs)
- bge-m3 + bm25 hybrid, k=12, rrf
- reranking via cohere-rerank-v3
> ما هي متطلبات الإبلاغ عن المخاطر التشغيلية؟ (What are the operational-risk reporting requirements?)
- 3 cited sources resolved -
CBUAE-OP-RISK-2024-03, para 4.2.1
CBUAE-OP-RISK-2024-03, para 4.3.7
ADGM-FSRA-PRU-7, para 11.4
RAG over Arabic regulatory circulars - cited sources resolved in real time.
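The hybrid step in the trace above (BM25 plus dense retrieval, fused with reciprocal-rank fusion) can be sketched as follows. This is a minimal illustration, not Brocode's implementation: the document IDs and ranked lists are made up, `k` is read here as the number of fused results kept, and the smoothing constant 60 is a commonly used default rather than a value stated on this page.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=12, c=60):
    """Fuse several ranked lists of doc IDs with reciprocal-rank fusion.

    rankings: list of ranked lists, best hit first.
    k: number of fused results to keep (assumed meaning of k=12 in the trace).
    c: RRF smoothing constant (assumed; 60 is a common default).
    """
    scores = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (c + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    fused = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in fused[:k]]

# Hypothetical outputs of the two retrievers for one query.
bm25_hits = ["circ-A", "circ-B", "circ-C"]
dense_hits = ["circ-B", "circ-D", "circ-A"]

top = rrf_fuse([bm25_hits, dense_hits], k=3)
print(top)
```

Documents that both retrievers rank highly rise to the top, which is why the fused list would then be passed to a reranker rather than either raw list.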
74%
GCC GenAI pilots that stall in UAT - benchmark across 23 pilots
87%
First-contact resolution lift - UAE bank back-office RAG
14,000
Employees on the federal sovereign LLM gateway
18
FTE-equivalent saved - KSA conglomerate finance copilot
Why most GenAI pilots stall in UAT
Each failure mode is named in the downloadable report, alongside its counter and its typical owner inside a taskforce.
Failure 01
Sovereign deployment patterns at Khazna, G42, Mobily, or on-prem.
Failure 02
Eval harness with domain golden sets, Giskard / DeepEval in CI.
Failure 03
Adapters to core banking, ERP, and ITSM stacks (SAP, Oracle, T24).
Failure 04
Documented validation per use case, signed before promotion.
Failure 05
Risk-committee evidence pack aligned to NIST AI RMF and UAE AI Charter.
Failure 06
Frontline adoption tracking and progressive rollout protocol.
Failure 07
TCO model per use case, model-choice strategy with crossover thresholds.
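The "crossover threshold" in the Failure 07 counter can be read as the monthly volume at which a self-hosted model becomes cheaper than a pay-per-token API. A minimal sketch of that arithmetic follows; every figure is a placeholder chosen for illustration, not a Brocode benchmark or a vendor price, and the function name is hypothetical.

```python
def crossover_tokens(fixed_monthly, self_hosted_per_m, api_per_m):
    """Monthly volume (millions of tokens) above which self-hosting is cheaper.

    fixed_monthly: fixed self-hosting cost per month (GPUs, ops), any currency.
    self_hosted_per_m: marginal self-hosted cost per million tokens.
    api_per_m: API price per million tokens.
    Returns None when the API is never more expensive per token.
    """
    saving_per_m = api_per_m - self_hosted_per_m
    if saving_per_m <= 0:
        return None
    return fixed_monthly / saving_per_m

# Placeholder inputs, for illustration only.
threshold = crossover_tokens(fixed_monthly=40_000, self_hosted_per_m=2.0, api_per_m=10.0)
print(threshold)  # millions of tokens per month
```

Running this per use case, with that use case's own volume forecast, gives one threshold per row of the TCO model.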
The 12-Week Production Path
The default operating rhythm. The strategy houses run open-ended advisory; we run a contract with named exits.
Weeks 1-4
Principal-to-principal scoping with the board GenAI committee. We agree the three use cases, the sovereignty model, and the governance posture. Output: a signed design book and risk-committee pre-read.
Gate G0 - design book approved by sponsor
Weeks 5-8
Pod builds the retrieval, generation, and orchestration stack inside the sovereign perimeter. Evaluation harnesses, guardrails, and red-team tests run alongside feature development.
Gate G1 - eval baseline beats internal hurdle rate
Weeks 9-12
Risk committee evidence pack assembled: red-team results, hallucination dashboards, model-deprecation exit plan, NIST AI RMF and UAE AI Charter alignment. Internal audit walkthrough before sign-off.
Gate G2 - risk-committee sign-off in writing
Week 13 onward
Three use cases live behind the firewall. Run-phase SLA covers eval drift, model deprecation triggers, and a defined principal contact for any board-level question.
Three boardable use cases live; quarterly board update template attached
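Gate G1 (eval baseline beats the internal hurdle rate) can be pictured as a check that runs in CI against a golden set. The sketch below is an assumption-laden illustration: it does not use the Giskard or DeepEval APIs, the grading rule (an answer fails if it cites anything outside the expected sources, or cites nothing) is a simplification, and all names and data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class GoldenCase:
    question: str
    expected_sources: frozenset  # citations a correct answer may rely on

def hallucination_rate(cases, answer_fn):
    """Share of golden cases whose answer cites nothing, or cites a source
    outside the expected set."""
    failures = 0
    for case in cases:
        cited = set(answer_fn(case.question))
        if not cited or not cited <= case.expected_sources:
            failures += 1
    return failures / len(cases)

def gate_g1(rate, hurdle):
    """Pass only when the measured rate is at or below the hurdle."""
    return rate <= hurdle

# Hypothetical golden set and a stubbed pipeline that returns cited sources.
golden = [
    GoldenCase("q1", frozenset({"DOC-1"})),
    GoldenCase("q2", frozenset({"DOC-2", "DOC-3"})),
    GoldenCase("q3", frozenset({"DOC-4"})),
    GoldenCase("q4", frozenset({"DOC-5"})),
]
fake_citations = {"q1": ["DOC-1"], "q2": ["DOC-3"], "q3": ["DOC-9"], "q4": ["DOC-5"]}

rate = hallucination_rate(golden, lambda q: fake_citations[q])
print(rate, gate_g1(rate, hurdle=0.3))
```

In a CI job the final line would become an assertion, so a regression in the rate blocks promotion instead of surfacing later in UAT.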
Reference architecture
The same architecture has shipped at a tier-1 UAE bank, a federal entity, and a KSA conglomerate. The model choice changes per use case; the architecture does not.
Retrieval
bge-m3 and Cohere Embed v4 for multilingual (Arabic + English) embedding. Hybrid BM25 + dense with reciprocal-rank fusion. Arabic-tuned chunking for legal and regulatory corpora.
Generation
Self-hosted models, plus Claude Sonnet 4.5 / GPT-5 via Azure OpenAI UAE North or Bedrock Bahrain for hyperscaler-resident flows.
Gateway
One gateway. Model abstraction. Tenant isolation. Rate-limiting. Cost reporting per use case.
Safety
Prompt-injection detection, dialect-aware safety, PII redaction with Microsoft Presidio + Emirati ID detectors.
Evaluation
Domain golden sets per use case. Regression suite. Documented hallucination rates.
Sovereign deployment
Customer-managed keys, air-gapped retrieval. Alternative ADGM / DIFC zone patterns available.
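The Emirati ID detector mentioned under Safety can be approximated with a pattern match. The sketch below assumes the common 15-digit layout that starts with 784 (784-YYYY-NNNNNNN-C, with or without hyphens) and uses only Python's `re` module; it is not Presidio's API. In a real pipeline a pattern like this would be registered as a custom recognizer alongside Presidio's built-in ones, with whatever further validation the deployment requires.

```python
import re

# Assumed layout: 784-YYYY-NNNNNNN-C, hyphens optional.
EMIRATES_ID = re.compile(r"\b784-?\d{4}-?\d{7}-?\d\b")

def redact_emirates_ids(text, placeholder="<EMIRATES_ID>"):
    """Replace anything shaped like an Emirates ID number with a placeholder."""
    return EMIRATES_ID.sub(placeholder, text)

sample = "Customer 784-1990-1234567-1 asked about fees."  # fabricated number
redacted = redact_emirates_ids(sample)
print(redacted)
```

Redaction of this kind would run before text reaches the model or the logs, so the identifier never leaves the retrieval perimeter.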
Use-case catalogue
The catalogue we walk through in the principal review. Each has a documented outcome benchmark drawn from real engagements.
Internal staff find policy answers with citations; FCR lift typical 60-90%.
Public-facing assistant with cited sources and refusal handling. WhatsApp and web.
Shared services productivity - FTE-equivalent savings 10-20 typical.
Clause extraction and flagging against standard templates - federal procurement reference.
Regulatory horizon scanning + obligation mapping with cited circular paragraphs.
Self-hosted code assistant for restricted-source codebases - no code leaves the perimeter.
Khaleeji-aware draft suggestions with the bilingual NLP stack underneath.
Hybrid retrieval over a curated KG for complex multi-hop questions.
Sovereign deployment options
UAE-resident Tier IV, multi-tenant or dedicated. Default for federal-entity workloads.
Sovereign cloud with H100-class GPU bare-metal tenancies. Khaleeji-relevant ecosystem partner.
For KSA-resident workloads under SAMA scope. Customer-managed keys default.
For hyperscaler-resident workloads where UAE region is acceptable. Azure OpenAI tenancy in-region.
For Bedrock-resident workloads where AWS UAE / KSA / Bahrain coverage is acceptable.
For workloads that cannot leave the client perimeter. K8s on bare metal, customer-managed everything.
What sponsors push back on
Objection 01
“OpenAI Enterprise already has our data in a private tenant. Why should we duplicate infrastructure with you?”
Because a tenant is not a capability. The OpenAI Enterprise tenant is a model endpoint - it does not give you a retrieval layer, an eval pipeline, a Khaleeji safety classifier, a model-choice strategy, or a sovereign-deployment posture. Brocode delivers the capability layer that wraps that tenant (and any other model provider) and turns it into something your risk committee will sign. Many of our clients run both: OpenAI for global English workloads, the Brocode stack for sovereign and Arabic workloads.
Proof: anonymised tier-1 UAE bank reference - an internal RAG assistant over 4.2 million policy and product documents, sitting in front of Azure OpenAI UAE North for the English flows and a self-hosted Llama 3.3 70B for the Arabic flows. 87% first-contact resolution lift in the corporate-banking back-office.
Objection 02
“The Big-3 consultancies will give us a CxO-flavoured roadmap. Can a regional engineering firm actually own the build through to production?”
Yes - and we will commit to it on a fixed-fee, fixed-scope contract that the strategy houses will not. The 12-Week Production Path is the same methodology we have run for tier-1 GCC banks, federal entities, and KSA conglomerates. Engineering depth shows up in the eval harness, the red-team test pack, the model-choice abstraction layer, and the named senior engineers who are on the contract and the standup. The strategy deck arrives as a by-product of the build, not as the deliverable.
Proof: anonymised KSA conglomerate reference - a finance-and-procurement copilot saving 18 FTE-equivalent across shared services within seven months, with the original 12-week build paid as a fixed fee and the run-phase SLA on a separate per-quarter pricing band.
Objection 03
“Our risk committee will not approve any deployment without documented red-team results, hallucination rates per use case, and an exit strategy if the underlying model is deprecated.”
All three are in the standard governance pack. Red-team results follow a documented adversarial test plan (prompt injection, jailbreak, Khaleeji and English safety classifiers). Hallucination rates per use case are measured on a domain-specific golden set using Giskard and DeepEval in CI, refreshed monthly. The model-deprecation exit strategy is the model-choice abstraction layer: the application code does not depend on a specific model provider, so any provider can be swapped on a documented playbook with no application-layer rewrite.
Proof: anonymised federal entity reference - a sovereign LLM gateway serving 14,000 employees, fully on Khazna, with a board-approved governance pack mapped to NIST AI RMF and the UAE AI Charter. Two model swaps (one base model deprecated, one upgraded) executed inside the run-phase SLA with zero application rewrite.
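The model-choice abstraction described in this answer can be sketched as a routing table that sits between application code and providers. This is an illustrative shape only, not the Brocode policy plane or the LiteLLM API: the class names, the `policy-qa` use case, and the stub models are all hypothetical.

```python
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class SelfHostedModel:
    """Stand-in for a self-hosted model endpoint (hypothetical)."""
    def complete(self, prompt: str) -> str:
        return f"[self-hosted] {prompt}"

class HostedApiModel:
    """Stand-in for a hyperscaler-resident model endpoint (hypothetical)."""
    def complete(self, prompt: str) -> str:
        return f"[hosted-api] {prompt}"

class Gateway:
    """Routes each use case to whichever model the routing table names."""
    def __init__(self, models, routes):
        self._models = models
        self._routes = routes

    def swap(self, use_case, model_name):
        if model_name not in self._models:
            raise KeyError(f"unknown model: {model_name}")
        self._routes[use_case] = model_name

    def complete(self, use_case, prompt):
        model = self._models[self._routes[use_case]]
        return model.complete(prompt)

def answer_policy_question(gateway, question):
    # Application code knows the use case, never the provider.
    return gateway.complete("policy-qa", question)

gw = Gateway(
    models={"hosted": HostedApiModel(), "self-hosted": SelfHostedModel()},
    routes={"policy-qa": "hosted"},
)
before = answer_policy_question(gw, "What is the leave policy?")
gw.swap("policy-qa", "self-hosted")  # provider swap: one routing change
after = answer_policy_question(gw, "What is the leave policy?")
print(before, after, sep="\n")
```

`answer_policy_question` is unchanged across the swap, which is the property the exit strategy relies on when a base model is deprecated.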

Case studies
Tier-1 UAE bank
Internal RAG assistant over 4.2 million policy and product documents. 87% first-contact resolution lift in the corporate-banking back-office.
Federal entity
Sovereign LLM gateway serving 14,000 employees, fully on Khazna, with a board-approved governance pack mapped to NIST AI RMF and the UAE AI Charter.
KSA conglomerate
Finance-and-procurement copilot saving 18 FTE-equivalent across shared services within seven months. Fixed-fee build, run-phase SLA on quarterly pricing.
How we compare
Three honestly different shapes. Many enterprises run all three in parallel. Brocode is the build-through-to-production middle layer.
| Capability | Brocode | OpenAI / Microsoft Copilot tenant | McKinsey QuantumBlack / BCG X | Offshore integration shop |
|---|---|---|---|---|
| Deliverable shape | Working capability, fixed fee, 12 weeks | Tenant access | Strategy roadmap + advisory burn | Integration glue |
| Sovereign / on-prem deployment | Khazna, G42, Mobily, ADGM, DIFC patterns | Microsoft / OpenAI tenancy | Cloud-agnostic but offshore-billed | Hyperscaler typical |
| Risk-committee evidence pack | Red-team, hallucination, exit, NIST AI RMF / UAE AI Charter | Provider documentation only | Available, charged separately | |
| Khaleeji + English safety classifier | Brocode fine-tune + Llama Guard 3 | | | |
| Model-choice abstraction (swap providers) | LiteLLM + Brocode policy plane | One provider | Cloud-bound | Provider lock-in typical |
| Eval harness in CI | Giskard + DeepEval, golden sets refreshed monthly | On request | ||
| Named senior engineers on contract | Yes - CVs at proposal | N/A | Partner + offshore subcontractors | Rotating body-shop |
| UAE-billed in AED | | Often US-billed | Often offshore-billed | |
| Time to first production use case | 12 weeks fixed | Immediate tenant; capability layer separate | 6-month diagnostic typical | Months, scope-variable |
Free download
A 44-page board-readable report with a one-page boardroom summary. The seven failure modes, the seven counters, and a hallucination-rate table by use-case archetype.
Questions from board GenAI committees
Every answer below comes from the standard governance pack we share with the risk-committee pre-read.
Principal-to-principal review
Tell us the sponsor, the residency posture, and the board deadline. A Brocode principal reads it, replies under NDA, and books the call within five business days.
Direct WhatsApp: +971 50 761 2213
Email: hello@brocode.ae
HQ: Al Maryah Island, ADGM, Abu Dhabi