Brocode Solutions - AI Software Development

Bilingual Arabic-English NLP

Arabic NLP that works on Khaleeji.

Voice and text. MSA and English. Emirati, Saudi, and Kuwaiti dialect handled natively. On your stack. With a documented accuracy benchmark on your own conversations - before contract signature.

+971 50 ...

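السلام عليكم، شخباركم؟
(Peace be upon you, how are you all?)
ابغى افعّل sim card ماله شغال من امس
(I want to activate a SIM card; it has not been working since yesterday)
رقمي 050... و الـ ID جاهز
(My number is 050... and my ID is ready)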

Detected output

intent: SIM_activation
confidence: 0.97
sentiment: negative (0.82)
language mix: AR 78% / EN 22%
dialect: Emirati
route: agent_sim_queue
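
As a rough illustration of how a downstream system might consume the detected output above, here is a minimal Python sketch. The field names mirror the sample; the `Detection` class, the fallback queue, and the confidence threshold are hypothetical and are not taken from the Brocode stack.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Detection:
    intent: str
    confidence: float
    sentiment: str
    sentiment_score: float
    language_mix: dict  # share of tokens per language
    dialect: str


# Hypothetical routing table: only the first entry appears in the sample above.
QUEUES = {"SIM_activation": "agent_sim_queue"}
FALLBACK_QUEUE = "agent_general_queue"  # hypothetical name
MIN_CONFIDENCE = 0.80                   # hypothetical threshold


def route(detection: Detection) -> str:
    """Send low-confidence or unknown intents to a general queue."""
    if detection.confidence < MIN_CONFIDENCE:
        return FALLBACK_QUEUE
    return QUEUES.get(detection.intent, FALLBACK_QUEUE)


sample = Detection(
    intent="SIM_activation",
    confidence=0.97,
    sentiment="negative",
    sentiment_score=0.82,
    language_mix={"AR": 0.78, "EN": 0.22},
    dialect="Emirati",
)
print(json.dumps({**asdict(sample), "route": route(sample)}, indent=2))
```

Keeping the routing rule separate from the detection record means the threshold can be tuned per queue without touching the model output.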

Khaleeji accuracy

94.3%
  • +23.4 pp

    Khaleeji intent accuracy delta vs best hyperscaler API

  • 11.2%

    Khaleeji voice-note WER on federal entity benchmark

  • 4.2M

    Utterances in the proprietary Khaleeji corpus

  • 46%

    Arabic contact-centre deflection - UAE telco reference

Why generic NLP fails on Khaleeji

Three reasons hyperscaler Arabic NLP collapses on real GCC conversations.

Each one is documented in the downloadable benchmark report with a sampled real failure case from anonymised customer logs.

Dialect divergence from MSA

Khaleeji vocabulary, morphology, and discourse markers diverge sharply from MSA. Off-the-shelf models, trained predominantly on MSA news corpora and Egyptian dialect, regress 18-25 percentage points on Emirati intent classification before any fine-tuning.

shakhbarkum (Khaleeji) vs kayf halukum (MSA) - both mean "how are you?", yet three of four hyperscaler APIs mis-classify the first

Voice-note compression artifacts

WhatsApp voice notes are heavily compressed (Opus codec, roughly 8 kHz effective bandwidth). Hyperscaler ASR models trained on broadcast Arabic audio fail on those artifacts at a rate of roughly one error in every three to four words.

Reference 50-hour Khaleeji voice-note corpus: WER 31% baseline, 11.2% on the Brocode whisper-large-v3 fine-tune
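
For readers who want to check a WER figure on their own transcripts, the metric is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below is a plain-Python illustration of that definition, not the scoring script behind the numbers above; it assumes whitespace tokenisation and applies no Arabic normalisation (diacritics, alef variants), which a real evaluation would need to decide on.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of four reference words.
print(word_error_rate("ابغى افعل الشريحة اليوم", "ابغى افعل الشريحه اليوم"))
```

Because the denominator is the reference length, a transcript with many inserted words can score above 100%.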

Arabic-English code-switching

Real Gulf customer language is heavily code-switched. A single utterance can carry an Arabic verb, an English noun, and a transliterated brand name. Generic pipelines route the whole utterance through one language, losing structure in either direction.

ابغى افعل sim card - dedicated detector tags the embedded English token and preserves it through inference
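
To make the idea of token-level tagging concrete, here is a deliberately simple sketch that labels each whitespace-separated token by script. It is a stand-in for illustration only: the Brocode detector is a trained router head, and a script check like this cannot catch English words or brand names written in Arabic letters. The function names and the `MIXED` / `OTHER` labels are assumptions.

```python
import re
from collections import Counter

ARABIC = re.compile(r"[\u0600-\u06FF]")  # basic Arabic Unicode block
LATIN = re.compile(r"[A-Za-z]")


def tag_tokens(utterance: str) -> list:
    """Label each token AR, EN, MIXED (both scripts) or OTHER (digits, punctuation)."""
    tagged = []
    for token in utterance.split():
        has_ar = bool(ARABIC.search(token))
        has_en = bool(LATIN.search(token))
        if has_ar and has_en:
            label = "MIXED"
        elif has_ar:
            label = "AR"
        elif has_en:
            label = "EN"
        else:
            label = "OTHER"
        tagged.append((token, label))
    return tagged


def language_mix(tagged: list) -> dict:
    """Share of AR / EN / MIXED tokens, ignoring OTHER."""
    counts = Counter(label for _, label in tagged if label != "OTHER")
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


tags = tag_tokens("ابغى افعل sim card")
print(tags)
print(language_mix(tags))
```

Tagging before inference lets the embedded English tokens pass through unchanged instead of being forced through an Arabic-only tokenizer.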

The Brocode Arabic Language Stack

A purpose-built bilingual stack. No off-the-shelf API in the critical path.

Every layer is portable across on-premise estates, G42 Cloud, AWS UAE North, or Azure UAE North - no customer utterance leaves the country.

Layer 1

Classification & NER

AraBERT-v2 + CAMeLBERT

Intent, sentiment, named entity, complaint routing.

Layer 2

Speech-to-text

whisper-large-v3 fine-tuned Khaleeji

IVR, call recordings, branch transcripts.

Layer 3

Generative

Jais-30B / AceGPT (fine-tuned)

Agent assist, response drafting, summarisation.

Layer 4

Code-switch detector

Brocode router head

Arabic-English routing with dialect tag.

Layer 5

PII redaction

Presidio + Emirati ID detectors

On-prem PII removal before model call.
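
A minimal sketch of what Layer 5 does conceptually: replace identifiers with placeholders before text reaches any model. This uses only Python's `re` module as a stand-in for Presidio recognizers. The two patterns are assumptions based on commonly seen formats (a 15-digit Emirates ID starting with 784, and UAE mobile numbers starting with 05 or +971 5); production detectors would need validation against real data.

```python
import re

# Assumed formats, for illustration only.
PATTERNS = [
    # Emirates ID: 784-YYYY-NNNNNNN-N, with or without dashes.
    ("EMIRATES_ID", re.compile(r"(?<!\d)784-?\d{4}-?\d{7}-?\d(?!\d)")),
    # UAE mobile: +971 5x xxx xxxx, 00971..., or 05x xxx xxxx.
    ("UAE_MOBILE", re.compile(r"(?<!\d)(?:\+971|00971|0)[\s-]?5\d(?:[\s-]?\d){7}(?!\d)")),
]


def redact(text: str) -> str:
    """Replace each detected identifier with a <LABEL> placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"<{label}>", text)
    return text


# Dummy values, not real identifiers.
print(redact("My number is +971 50 123 4567 and my ID is 784-1990-1234567-1"))
```

Running the ID pattern before the phone pattern avoids a phone match consuming part of a longer digit run.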

The Khaleeji Dialect Annotation Lab

The dataset is the moat.

A 4.2M-utterance proprietary corpus, collected by UAE and Egypt-resident annotators. Emirati, Saudi (Najdi and Hijazi), Kuwaiti, Iraqi, and Levantine native speakers. ISO 27001 controls. Full audit trail on every sample.

4.2M

Annotated utterances

38

Native-speaker annotators

5

Dialect heads in production

ISO 27001

Labelling pipeline controls

100%

UAE / Egypt resident annotators

Quarterly

Corpus refresh cadence

Benchmark vs the alternatives

On the same 10,000 Khaleeji utterances - how each engine actually performs.

The full benchmark report is available as a free download below. Numbers shown here are from the 2026 edition; the benchmark is regenerated quarterly.

Engines compared: Brocode, AWS Comprehend, Azure Language, Google Cloud NL, OpenAI GPT-4 / GPT-5

Khaleeji dialect intent accuracy (see the benchmark report)

  • Brocode: +23.4 pp delta on the 10K Khaleeji utterance benchmark
  • AWS Comprehend: MSA-focused, drops on dialect
  • Azure Language: MSA-focused, drops on dialect
  • Google Cloud NL: MSA-focused
  • OpenAI GPT-4 / GPT-5: MSA + light dialect

Voice-note Word Error Rate on Khaleeji

  • Brocode: ~11% WER
  • AWS Comprehend: ~31% WER
  • Azure Language: ~28% WER
  • Google Cloud NL: ~33% WER
  • OpenAI GPT-4 / GPT-5: ~22% WER

On-premise / G42 Cloud deployment

  • Brocode: Default - no data leaves UAE
  • Other engines: Azure UAE North only

UAE-resident Khaleeji annotation lab

  • Brocode: Emirati, Saudi, Kuwaiti native speakers
  • AWS Comprehend, Azure Language, Google Cloud NL, OpenAI GPT-4 / GPT-5: No

Code-switch handling (Ar-En in one utterance)

  • Brocode: Dedicated detector + router
  • AWS Comprehend, Azure Language, Google Cloud NL: Fails silently
  • OpenAI GPT-4 / GPT-5: Routes through English by default

Per-call economics at 5M Arabic calls / month

  • Brocode: Flat, self-hosted
  • AWS Comprehend, Azure Language, Google Cloud NL: Per-call API charge
  • OpenAI GPT-4 / GPT-5: Per-call tokens 2-3x English

IP ownership of fine-tuned weights

  • Brocode: Yours

Pre-contract free benchmark on your data

  • Brocode: 5,000-utterance accuracy report

What we hear on every first call

Three objections - answered with evidence, not slides.

Objection 01

Show me the Khaleeji dialect accuracy - not MSA, not Egyptian. We have been burned by Arabic-supporting vendors who only handle Modern Standard.

The Pre-Contract Arabic NLP Benchmark exists to answer exactly this. Send us 5,000 anonymised utterances from your real conversation logs; we run the Brocode stack against them and share the accuracy report - broken out by dialect (Emirati, Saudi, Kuwaiti, Levantine, Egyptian), by channel, and by code-switch density. The Khaleeji dialect head is trained on a proprietary 4.2M-utterance corpus collected by UAE-resident Emirati, Saudi, and Kuwaiti annotators.

Proof: anonymised UAE telco reference - Khaleeji intent classification lifted from 64% to 91%, contact-centre Arabic deflection increased from 18% to 46% within four months.
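
The breakdowns described above amount to grouping labelled predictions by a slice key and computing accuracy per group. The sketch below shows that calculation and the percentage-point delta quoted in the proof line. The row layout (`gold`, `pred`, `dialect`, `channel`) and the toy rows are hypothetical and exist only to exercise the functions; they are not benchmark data.

```python
from collections import defaultdict


def accuracy_by(rows: list, key: str) -> dict:
    """Accuracy per value of `key` (e.g. dialect or channel)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        hits[row[key]] += int(row["gold"] == row["pred"])
    return {group: hits[group] / totals[group] for group in totals}


def pp_delta(before: float, after: float) -> float:
    """Difference between two accuracies in percentage points."""
    return round((after - before) * 100, 1)


# Toy rows for illustration only.
rows = [
    {"gold": "SIM_activation", "pred": "SIM_activation", "dialect": "Emirati", "channel": "chat"},
    {"gold": "billing", "pred": "SIM_activation", "dialect": "Emirati", "channel": "voice"},
    {"gold": "billing", "pred": "billing", "dialect": "Saudi", "channel": "chat"},
]
print(accuracy_by(rows, "dialect"))
print(accuracy_by(rows, "channel"))

# The telco reference above: 64% to 91% intent accuracy.
print(pp_delta(0.64, 0.91))
```

Reporting deltas in percentage points rather than relative percent keeps comparisons between slices with different baselines unambiguous.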

Objection 02

Our customer data cannot leave the country. Your stack must run on-premise or on G42 Cloud, and the annotation pipeline must be UAE-resident.

The entire Arabic Language Stack runs without sending a single customer utterance to a US-billed API. Default deployment targets are on-premise, G42 Cloud, or AWS / Azure UAE North - chosen with your security team. The Khaleeji Dialect Annotation Lab is staffed by UAE and Egypt-resident annotators under ISO 27001 controls, with full audit trail on every labelled sample. PII redaction happens on your perimeter before any model call.

Proof: anonymised UAE bank reference - Arabic complaint-routing accuracy at 94%, full PII-redaction pipeline, audit-trail signed off by the bank's Model Risk Officer, all on-premise.

Objection 03

We already pay an LLM API provider. Why would we run our own model instead of just prompt-engineering GPT-4 with Arabic instructions?

Three honest reasons. Dialect: frontier APIs ship MSA-flavoured Arabic that degrades on Khaleeji; we ship a Khaleeji dialect head trained on data labelled by Gulf-resident annotators. Data residency: for most providers inference runs outside the UAE; ours does not. Economics: Arabic text tokenizes into 2-3x as many tokens as the equivalent English, so for most clients per-call API cost overtakes the cost of a self-hosted Brocode stack at around 4-6M calls per month. We will show you the crossover analysis for your volume in writing.

Proof: anonymised federal entity reference - Khaleeji voice-note transcription Word Error Rate of 11.2% vs ~31% on hyperscaler defaults, on-premise, with per-call cost ~70% lower at the client's monthly volume.
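
The crossover argument is simple arithmetic: per-token API spend grows linearly with call volume, while a self-hosted stack is modelled as a flat monthly cost. The sketch below shows the shape of that calculation under a deliberately simplified model. Every figure in it is a placeholder chosen to keep the arithmetic easy to follow, not client data or a vendor price, and the function names are hypothetical.

```python
def api_cost_per_call(english_tokens_per_call: float,
                      arabic_multiplier: float,
                      price_per_1k_tokens: float) -> float:
    """API cost of one call, scaling token count by the Arabic tokenizer overhead."""
    arabic_tokens = english_tokens_per_call * arabic_multiplier
    return arabic_tokens / 1000 * price_per_1k_tokens


def break_even_calls(fixed_monthly_cost: float, cost_per_call: float) -> float:
    """Monthly call volume at which API spend equals the flat self-hosted cost."""
    return fixed_monthly_cost / cost_per_call


# Placeholder inputs (same currency unit throughout).
per_call = api_cost_per_call(
    english_tokens_per_call=400,  # tokens if the call were in English
    arabic_multiplier=2.5,        # midpoint of the 2-3x range quoted above
    price_per_1k_tokens=0.02,
)
volume = break_even_calls(fixed_monthly_cost=100_000, cost_per_call=per_call)
print(f"cost per call: {per_call:.4f}")
print(f"break-even volume: {volume:,.0f} calls per month")
```

Below the break-even volume the API is cheaper; above it the flat self-hosted cost wins. A real analysis would also account for hardware amortisation, staffing, and peak load.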


Case studies

Three Arabic NLP programmes in production today.

  • UAE telco

    Khaleeji intent classification lifted from 64% to 91%. Arabic contact-centre deflection increased from 18% to 46%. Full PII-redaction pipeline on-premise.

  • UAE bank

    Arabic complaint-routing accuracy at 94% across web chat, branch transcripts, and call recordings. Audit-trail signed off by the bank's Model Risk Officer.

  • Federal entity

    Khaleeji voice-note transcription WER 11.2% (vs ~31% on hyperscaler defaults). Full on-premise deployment with customer-managed keys.

Self-hosted vs frontier API

An honest read of when to run your own Arabic model.

Four axes that decide the right architecture. We will share the crossover analysis for your specific volume in writing.

Dialect handling

Self-hosted Brocode stack

Khaleeji head trained on data labelled by Gulf-resident annotators.

Frontier API (GPT-4/5, Claude)

MSA-flavoured Arabic with measurable Khaleeji regression.

Data residency

Self-hosted Brocode stack

Inference inside UAE. No utterance leaves the country.

Frontier API (GPT-4/5, Claude)

Inference in US or EU regions for most providers; UAE region is limited to certain models.

Per-call economics at 5M Ar calls / month

Self-hosted Brocode stack

Flat. Hardware cost predictable, AED.

Frontier API (GPT-4/5, Claude)

Per-token billed; Arabic tokens are 2-3x English by tokenizer.

IP and weights

Self-hosted Brocode stack

Yours. Continue to fine-tune indefinitely.

Frontier API (GPT-4/5, Claude)

API access only. No weights, no offline mode, vendor lock.

Free download

The Khaleeji Arabic NLP Benchmark - 9 Engines on 10,000 Real UAE Conversations

A 36-page PDF with an interactive accuracy explorer. Filter by dialect, channel, intent type, and code-switch density. Methodology and reproducer included.

  • Dataset and corpus methodology
  • Model comparison: AraBERT-v2, CAMeLBERT, Jais, AceGPT vs AWS, Azure, Google, OpenAI
  • Khaleeji vs MSA performance gaps
  • Code-switch density analysis
  • Cost per million Arabic tokens
  • Reproducer on GitHub

Instant download. No spam. Unsubscribe any time.

Questions from CX and digital leads

Frequently asked.

Every answer below is taken directly from the SOW templates we sign with UAE telcos, banks, and federal entities.


Free pre-contract benchmark

Send us 5,000 anonymised utterances. We will come back with an accuracy report.

No SOW required to get the benchmark. Sign an NDA, send the sample, receive the report within five business days. If the numbers do not justify a project, you keep the report.

Direct WhatsApp: +971 50 761 2213

Email: hello@brocode.ae

HQ: Al Maryah Island, ADGM, Abu Dhabi

Quote request

Request a free 5,000-utterance Arabic NLP benchmark on your conversations

A senior Arabic NLP engineer reviews your sample, runs the Brocode stack against it, and shares the accuracy report under NDA within five business days.

Prefer chat? Message us on WhatsApp - we'll see it within working hours.
