Brocode Solutions · AI Software Development

The Brocode AI Glossary · 240 terms · 5 categories

Every AI term you'll hear in a UAE boardroom — defined in under 60 seconds.

Plain English, a one-sentence GCC example, and a one-line note on why it matters to your business. No Wikipedia walls of text. Arabic-script parity on Arabic-specific entries. Reviewed quarterly by a named principal engineer.

  • Foundation models (62)
  • Classical ML (48)
  • MLOps & infrastructure (54)
  • Governance & risk (40)
  • Arabic-specific (36)

How an entry is built

A consistent three-block format on every term.

The shortest accurate answer, in the same shape every time. Definition. Example. Why it matters.

  1. Block 1 · Plain definition

    Forty words or fewer. No jargon. No analogies that require a second analogy. A non-technical reader can quote it into a board paper.

  2. Block 2 · UAE / GCC example

    One sentence, regionally grounded. A UAE bank, a federal entity, a Saudi retailer, a Qatari hospital — the example is recognisable to a GCC reader within five seconds.

  3. Block 3 · Why it matters to your business

    One sentence on the commercial or operational consequence. If the visitor remembers nothing else, this is the line they remember.

  • 240

    Terms in the library

  • 5

    Categories — foundation to Arabic-specific

  • Qtr

    Reviewed every quarter by named engineers

  • 60 s

    Average time to read an entry

Most-read this month

What other readers searched for first.

  1. RAG
  2. AI agent
  3. Fine-tuning
  4. Data residency
  5. Vector database
  6. Khaleeji Arabic

Browse by category

Five tracks across the 240 terms.

  • 62 terms

    Foundation models

    LLM, transformer, attention, embedding, RAG, context window, agent, tool use.

    Browse this category
  • 48 terms

    Classical ML

    Supervised, unsupervised, reinforcement, gradient descent, overfitting, ROC.

    Browse this category
  • 54 terms

    MLOps & infrastructure

    Training, inference, drift, observability, model registry, vector database.

    Browse this category
  • 40 terms

    Governance & risk

    Model risk, bias, fairness, audit trail, data residency, TDRA, CBUAE, FSRA.

    Browse this category
  • 36 terms

    Arabic-specific

    Khaleeji, MSA, NER, tashkeel, tatweel — with Arabic-script parity.

    Browse this category

Sample entries — read inline

Fifteen terms in the three-block format.

The first chapter of the glossary, rendered openly. Arabic-script parity is shown on Arabic-specific terms.

Foundation

RAG (Retrieval-Augmented Generation)

Definition
A pattern where a language model is fed relevant passages from your own corpus at query time, so its answer is grounded in your data rather than its training memory.
GCC example
A UAE bank uses RAG to answer customer queries against its product disclosures — the LLM cites the disclosure page rather than inventing a rate.
Why it matters
It is usually the cheapest way to make an LLM useful on private data without retraining.
Read the practitioner guide
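The retrieve-then-ground pattern in this entry can be sketched in a few lines. This is a minimal illustration, not a production design: the three-passage corpus, the passage IDs, and the word-overlap scoring are all assumptions made for the example (a real system would typically retrieve with embeddings), and the call to the language model itself is left out.

```python
import re

# Hypothetical corpus standing in for a bank's product disclosures.
PASSAGES = {
    "disclosure-p4": "The personal loan carries a flat annual rate stated in the fee schedule.",
    "disclosure-p7": "Early settlement of a personal loan incurs a fee of one percent of the balance.",
    "disclosure-p9": "Credit card statements are issued on the first working day of each month.",
}

def tokens(text):
    """Lower-case word set; a crude stand-in for real retrieval features."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, passages, k=2):
    """Return the k (id, text) pairs sharing the most words with the query."""
    q = tokens(query)
    ranked = sorted(passages.items(),
                    key=lambda item: len(q & tokens(item[1])),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, passages, k=2):
    """Assemble a prompt that asks the model to answer only from cited passages."""
    context = "\n".join(f"[{pid}] {text}" for pid, text in retrieve(query, passages, k))
    return (
        "Answer using only the passages below and cite the passage id.\n"
        f"{context}\n"
        f"Question: {query}"
    )

print(build_prompt("What is the early settlement fee on a personal loan?", PASSAGES))
```

The model's weights never change; only the prompt does, which is why the entry contrasts RAG with fine-tuning.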
Foundation

Fine-tuning

Definition
Updating a model's weights on a small, task-specific dataset so it performs better on that task. Distinct from RAG, which leaves the weights alone.
GCC example
A federal entity fine-tunes Falcon-7B on 8,000 Khaleeji intent-classification examples and improves accuracy from 71 % to 88 %.
Why it matters
It is the right answer when behaviour, not knowledge, is what you need to change.
Read the practitioner guide
MLOps

Vector database

Definition
A database optimised for nearest-neighbour search over high-dimensional embeddings, used as the retrieval layer in most RAG systems.
GCC example
A telco stores 4 million product-disclosure passages as 768-dim embeddings in pgvector and serves queries at a p95 latency of 18 ms.
Why it matters
Choice of vector store affects latency, cost, and where the data lives — three things procurement asks about.
Foundation

Context window

Definition
The maximum number of tokens (roughly, sub-words) a language model can consider at one time. Anything that does not fit in the window is not seen by the model at all.
GCC example
Claude 3.5 Sonnet has a 200K-token window; GPT-4o has 128K; the original Jais 13B release has 2K. Window size shapes RAG chunking strategy.
Why it matters
Window size is the difference between answering a single email and answering a 60-page board pack.
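To show why window size shapes chunking, here is a small sketch that splits a long document into overlapping chunks under a token budget. It assumes whitespace-separated words as a rough proxy for tokens; a real pipeline would count tokens with the target model's own tokenizer, and the budget and overlap values are illustrative.

```python
def chunk_words(text, max_tokens=200, overlap=20):
    """Split text into chunks of at most max_tokens words, each sharing
    `overlap` words with the previous chunk so context is not cut mid-thought."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be >= 0 and smaller than max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, max_tokens=4, overlap=1))
```

A smaller window forces smaller chunks, which means more retrieval calls per question and more chance of splitting a relevant passage.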
Foundation

AI agent

Definition
A program that uses a language model to decide what tool to call next, in a loop, until a goal is met. Distinguished from a chatbot by its ability to take actions.
GCC example
A claims agent in a UAE insurer pulls the policy, queries the fraud model, drafts a settlement letter, and routes it for adjuster review.
Why it matters
Agents close the loop between language and action — they are how AI moves from advice to work.
Read the practitioner guide
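The decide-act loop in the definition can be sketched as follows. Everything here is hypothetical: the tool names mirror the claims example, the tools return canned values, and `choose_next_tool` is a fixed rule standing in for the language model's decision.

```python
# Hypothetical tools; each returns the facts it adds to the shared state.
TOOLS = {
    "pull_policy": lambda state: {"policy": "active"},
    "query_fraud_model": lambda state: {"fraud_score": 0.12},
    "draft_letter": lambda state: {"letter": "Draft settlement letter"},
    "route_for_review": lambda state: {"routed_to": "adjuster"},
}

# Which piece of state each tool is responsible for producing.
PRODUCES = [
    ("pull_policy", "policy"),
    ("query_fraud_model", "fraud_score"),
    ("draft_letter", "letter"),
    ("route_for_review", "routed_to"),
]

def choose_next_tool(state):
    """Stand-in for the language model: pick the first tool whose output is missing."""
    for tool, key in PRODUCES:
        if key not in state:
            return tool
    return None  # goal met, nothing left to do

def run_agent(max_steps=10):
    """Loop: decide, act, record, until the goal is met or the step cap is hit."""
    state, trace = {}, []
    for _ in range(max_steps):
        tool = choose_next_tool(state)
        if tool is None:
            break
        state.update(TOOLS[tool](state))
        trace.append(tool)
    return state, trace

state, trace = run_agent()
print(trace)
```

The step cap and the recorded trace are the two controls worth keeping in any real version: one bounds cost, the other gives the adjuster something to audit.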
Arabic-specific · العربية الخليجية

Khaleeji Arabic

Definition
The cluster of Gulf Arabic dialects spoken across the UAE, KSA, Qatar, Bahrain, Kuwait, and Oman. Distinct from Modern Standard Arabic in vocabulary, morphology, and phonology.
GCC example
A caller in Sharjah speaks to a contact centre in Khaleeji; an MSA-tuned model mis-transcribes 9–14 % more tokens than a Khaleeji-tuned one.
Why it matters
Most production AI for GCC customers fails on Khaleeji unless explicitly tuned for it.
Read the practitioner guide
Arabic-specific · العربية الفصحى الحديثة

MSA (Modern Standard Arabic)

Definition
The pan-Arab literary and broadcast register, used in print, news, and official documents. Almost nobody speaks it in everyday conversation.
GCC example
A federal-entity letter is in MSA; the call-centre conversation about that letter is in Khaleeji.
Why it matters
Most published Arabic NLP benchmarks score MSA — your production data is rarely MSA.
Read the practitioner guide
Arabic-specific · التشكيل

Tashkeel

Definition
Diacritical marks that show short vowels and grammatical endings in Arabic script. Usually omitted in everyday text, but sometimes the only way to tell one word from another.
GCC example
Without tashkeel, the word "كتب" can mean "he wrote", "books", or "was written" — a Khaleeji intent classifier has to handle all three.
Why it matters
Tashkeel handling is a tokenisation question with downstream model-quality effects.
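One common preprocessing choice is to strip tashkeel before tokenisation, which is exactly what collapses the three readings above into one surface form. The sketch below assumes the marks of interest are the Unicode range U+064B to U+0652 plus the superscript alef U+0670; whether to strip at all is a design decision, not a given.

```python
import re

# Arabic harakat (fathatan .. sukun) plus superscript alef.
TASHKEEL = re.compile("[\u064B-\u0652\u0670]")

def strip_tashkeel(text):
    """Remove diacritical marks, leaving the bare consonant skeleton."""
    return TASHKEEL.sub("", text)

forms = {
    "kataba (he wrote)": "\u0643\u064E\u062A\u064E\u0628\u064E",
    "kutub (books)": "\u0643\u064F\u062A\u064F\u0628",
    "kutiba (was written)": "\u0643\u064F\u062A\u0650\u0628\u064E",
}

# All three vocalised forms reduce to the same three letters.
stripped = {strip_tashkeel(word) for word in forms.values()}
print(stripped)
```

Stripping shrinks the vocabulary the model has to learn, at the cost of pushing the disambiguation onto context.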
Governance

Data residency

Definition
A requirement that data remain physically within a specified jurisdiction at rest, in transit, and during processing.
GCC example
A CBUAE-supervised bank requires customer PII to remain in UAE-resident infrastructure: in practice, AWS Middle East (UAE), Azure UAE North, OCI Abu Dhabi, or G42 Cloud.
Why it matters
Residency drives architecture, vendor selection, and contract clauses. Get it wrong at design and you pay at audit.
Read the practitioner guide
Foundation

Hallucination

Definition
When a language model produces a fluent, plausible answer that is factually wrong — usually because the answer was not grounded in retrieved evidence.
GCC example
An ungrounded chatbot tells a customer a fictitious branch is open at midnight; the customer drives there and complains.
Why it matters
Hallucinations are the failure mode procurement most worries about. RAG, citation, and refusal patterns reduce them but do not eliminate them.
MLOps

Model drift

Definition
A deterioration in model performance, gradual or sudden, because the live data distribution has moved away from the training data.
GCC example
A fraud model trained pre-Ramadan misses a seasonal mule-account pattern; precision falls 12 % in the first week of the holy month.
Why it matters
Drift detection is the single MLOps capability separating a hobby model from a production one.
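One widely used drift signal is the Population Stability Index (PSI), which compares how a feature is distributed in training data versus live data. The sketch below is one simple way to compute it, assuming equal-width bins taken from the training range and a small smoothing constant for empty bins; the 0.25 alert level in the usage is a common rule of thumb, not a standard.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width if all values are equal

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clip out-of-range values
        # Add 0.5 per bin so an empty bin never produces log(0).
        return [(c + 0.5) / (len(values) + 0.5 * bins) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

baseline = [i / 100 for i in range(100)]       # training-time feature values
live = [0.5 + i / 200 for i in range(100)]     # live values shifted upward

print(round(psi(baseline, baseline), 3), psi(baseline, live) > 0.25)
```

PSI only sees input distributions; it does not need labels, which is why it is often the first drift check that can run in production.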
Governance

Model card

Definition
A short structured document recording what a model does, what data trained it, how it was evaluated, and its known limits — the regulator-facing equivalent of a datasheet.
GCC example
A CBUAE-supervised bank publishes a model card for every AI model in customer-facing production. The card is the first thing the supervisor asks for.
Why it matters
Model cards turn AI risk into a documented, auditable artefact. No card, no production deployment.
Read the practitioner guide
Foundation

Embedding

Definition
A vector of numbers representing the meaning of a piece of text, image, or audio in a high-dimensional space — so that semantically similar items sit near each other.
GCC example
A retailer embeds product titles in Arabic and English in a shared 768-dim space; a Khaleeji query finds the English product.
Why it matters
Embeddings are how search escapes keyword matching and starts behaving like a human reader.
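"Near each other" is usually measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings come from a trained model and have hundreds of dimensions, such as the 768 in the example above.

```python
import math

def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical product embeddings (illustrative values, not model output).
CATALOGUE = {
    "wireless headphones": [0.9, 0.1, 0.0],
    "running shoes": [0.1, 0.9, 0.1],
    "office chair": [0.0, 0.2, 0.9],
}

def nearest(query_vec, catalogue):
    """Return the catalogue item whose vector points closest to the query."""
    return max(catalogue, key=lambda name: cosine(query_vec, catalogue[name]))

# A query vector that, by assumption, encodes a request for headphones.
print(nearest([0.8, 0.2, 0.1], CATALOGUE))
```

Because the comparison is on vectors rather than words, the query and the product title do not need to share a language or a single keyword.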
Governance

TDRA compliance

Definition
Conformance with the technical and licensing standards set by the UAE Telecommunications and Digital Government Regulatory Authority for connected and AI-enabled services.
GCC example
A federal-entity AI service exposes APIs through a TDRA-licensed interconnect; data does not leave the UAE.
Why it matters
TDRA alignment is a procurement gate for federal projects. It shapes architecture and operating model.
MLOps

Quantisation

Definition
Reducing the numeric precision of a model's weights — for example, from 16-bit to 8-bit or 4-bit — to shrink memory and accelerate inference.
GCC example
A 13B-parameter LLM quantised to INT4 fits on a single 24 GB GPU and serves at twice the throughput.
Why it matters
Quantisation is how a self-hosted model becomes commercially viable on a sensible GPU bill.
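The memory saving is simple arithmetic: parameters times bits per weight, divided by eight to get bytes. The sketch below counts weights only and takes 1 GB as 10^9 bytes; the KV cache, activations, and runtime overhead need additional headroom, so these figures are a lower bound.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

PARAMS_13B = 13_000_000_000

for label, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{label}: {weight_memory_gb(PARAMS_13B, bits):.1f} GB")
```

At 16-bit the weights of a 13B model come to about 26 GB, which already exceeds a 24 GB card; at 4-bit they come to about 6.5 GB, which leaves room for everything else the GPU has to hold.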

Reviewed by Yasmin Al Marzooqi, Head of Arabic NLP — last refresh February 2026.

Arabic-specific entries — English and Arabic side-by-side

Three regional terms in script-parity format.

Where the term is anchored in the regional context, the Arabic rendering sits beside the English. Reviewed by a native Arabic-speaking editor.

  • Khaleeji Arabic

    العربية الخليجية

    The cluster of Gulf Arabic dialects spoken across the UAE, KSA, Qatar, Bahrain, Kuwait, and Oman. Distinct from Modern Standard Arabic in vocabulary, morphology, and phonology.

    Why it matters: Most production AI for GCC customers fails on Khaleeji unless explicitly tuned for it.

  • MSA (Modern Standard Arabic)

    العربية الفصحى الحديثة

    The pan-Arab literary and broadcast register, used in print, news, and official documents. Almost nobody speaks it in everyday conversation.

    Why it matters: Most published Arabic NLP benchmarks score MSA — your production data is rarely MSA.

  • Tashkeel

    التشكيل

    Diacritical marks that show short vowels and grammatical endings in Arabic script. Usually omitted in everyday text, but sometimes the only way to tell one word from another.

    Why it matters: Tashkeel handling is a tokenisation question with downstream model-quality effects.

Free download

Brocode AI Glossary — Pocket Guide

A 24-page downloadable distillation of the 60 most-asked terms, formatted for printing or reading on a phone. The three-block format is preserved. The Arabic-script parity is preserved.

  • The 60 most-asked terms in the three-block format
  • Foundation, Classical ML, MLOps, Governance, Arabic-specific
  • Arabic-script parity on Arabic-specific terms
  • Printable single-sheet quick-reference at the back
  • Reviewed by Yasmin Al Marzooqi — last refresh February 2026

Instant download. No spam. Unsubscribe any time.

Monthly vocabulary email

Three new terms. One practitioner guide. One minute to read.

One short email per month. You can unsubscribe in one click, and we will not put you on a sales rotation.

Prefer chat? Message us on WhatsApp for the Pocket Guide PDF.

Glossary editorial

Missing a term? Suggest one — the editorial team reviews suggestions monthly.

Common requests in the last quarter: speculative decoding, ICV (in-country value), tatweel, prompt injection, and federated learning. Three of these will be in the next refresh.

Suggest a term