Brocode Solutions · AI Software Development

Personalisation engineering · GCC

A ranking stack tuned for the catalogue you actually have.

Arabic-aware, cold-start-aware, event-cycle-aware. It delivers a measurable lift in CTR, AOV, retention or watch-time in the first 90 days, with a public A/B holdout your CDO can defend at the board.

Treatment CTR

6.4%

+4.3pt

Control CTR

2.1%

baseline

AOV uplift

+9.8%

p < 0.001

personalised for: returning · session 2 · Riyadh · iftar window
noon.For You

For You · ranked

  • تمر مدجول

    AED 890.94
  • iftar set

    AED 1450.91
  • عبايا — Eid

    AED 3200.88
  • AirPods Pro 2

    AED 8690.84

Where regional recommenders fail

Three failure modes that kill AOV on Arabic catalogues.

Each is solved by a specific piece of the stack — not by tuning a generic recommender harder. The Lift Pilot proves the impact on your data.

Failure mode · 01

Arabic brand-spelling variants

Example: Adidas / أديداس / addidas / addidas store

Fix: Arabic title canonicaliser + bge-m3 multilingual text embeddings collapse spelling variants into a single item embedding.
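
A minimal sketch of what the canonicalisation step could look like before any text is embedded. It is an illustration, not the production canonicaliser: the alias table, the `normalise` helper and the `canonical_brand` function are hypothetical names, and a real table would be curated or mined from the catalogue rather than hard-coded.

```python
import re
import unicodedata
from typing import Optional

_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")  # harakat, dagger alef, tatweel
_ALEF_FORMS = re.compile(r"[\u0622\u0623\u0625\u0671]")   # alef with madda / hamza / wasla

def normalise(text: str) -> str:
    """Fold case, Arabic diacritics and alef variants so spellings compare equal."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = _DIACRITICS.sub("", text)
    text = _ALEF_FORMS.sub("\u0627", text)    # fold every alef form to bare alef
    text = text.replace("\u0649", "\u064A")   # alef maksura -> ya
    return re.sub(r"\s+", " ", text).strip()

# Hypothetical alias table (assumption): variant spelling -> canonical brand key.
# Keys are passed through normalise() so lookups and entries share one form.
_RAW_ALIASES = {"Adidas": "adidas", "addidas": "adidas", "أديداس": "adidas"}
BRAND_ALIASES = {normalise(k): v for k, v in _RAW_ALIASES.items()}

def canonical_brand(title: str) -> Optional[str]:
    """Return the canonical brand for the first known alias token in a title."""
    for token in normalise(title).split(" "):
        if token in BRAND_ALIASES:
            return BRAND_ALIASES[token]
    return None
```

With this in place, "Adidas", "أديداس" and "addidas store" all resolve to the same brand key, so the item text handed to the embedding model no longer differs by spelling alone.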

Failure mode · 02

Cold-start on Ramadan launches

Example: New Ramadan-themed SKUs with three days of history.

Fix: CLIP item embeddings (multimodal) + LinUCB / Thompson Sampling contextual bandit head give a sensible first ranking on day one.
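
To make the bandit idea concrete, here is a minimal Beta-Bernoulli Thompson Sampling sketch for new SKUs. It is a simplified, non-contextual illustration: the `BetaBandit` class and its parameters are hypothetical, and in the stack described here the priors would presumably be informed by content embeddings rather than set uniformly.

```python
import random

class BetaBandit:
    """Thompson Sampling over items, each with a Beta(clicks, skips) posterior."""

    def __init__(self, item_ids, prior_clicks=1.0, prior_skips=1.0, seed=None):
        self.rng = random.Random(seed)
        # One [alpha, beta] pair per item; the prior stands in for missing history.
        self.stats = {i: [prior_clicks, prior_skips] for i in item_ids}

    def rank(self, k):
        """Sample a plausible CTR per item and return the top-k item ids."""
        draws = {i: self.rng.betavariate(a, b) for i, (a, b) in self.stats.items()}
        return sorted(draws, key=draws.get, reverse=True)[:k]

    def update(self, item_id, clicked):
        """Record one impression outcome for an item."""
        self.stats[item_id][0 if clicked else 1] += 1.0
```

Items with little history have wide posteriors, so they are sampled into the top slots often enough to be learned about, and they fall away once the evidence shows they do not convert.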

Failure mode · 03

Long-tail catalogue collapse

Example: Top-100 titles consume 92% of impressions; the other 240,000 SKUs disappear.

Fix: Two-tower neural retrieval on Vespa or Milvus, with diversity rerank — surface long-tail without dropping top-line CTR.
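
The page does not say which diversity rerank is used. As one common option, the sketch below uses maximal marginal relevance (MMR), which is an assumption on our part. The `mmr_rerank` function, the `trade_off` parameter and the candidate tuple layout are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def mmr_rerank(candidates, k, trade_off=0.7):
    """Greedy MMR over (item_id, relevance, embedding) tuples.

    trade_off=1.0 is pure relevance; lower values penalise items that are
    similar to something already selected.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(candidate):
            _, relevance, embedding = candidate
            max_sim = max((cosine(embedding, s[2]) for s in selected), default=0.0)
            return trade_off * relevance - (1.0 - trade_off) * max_sim
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [item_id for item_id, _, _ in selected]
```

A near-duplicate of the top item is pushed down in favour of a less similar long-tail item, while the first slot still goes to the most relevant candidate.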

The CDO mandate

2.1% CTR on the personalised row. Six-month deadline.

Homepage carousel ranks by store popularity, not by user. Algolia / Coveo do relevance but not personalisation. AWS Personalize collapsed on Arabic-only customers and short-history sessions.

The board approved a year-long growth plan that assumes personalisation contributes a measurable AOV or retention lift. Missing it forces the CDO to re-baseline the whole roadmap. Two regional competitors have publicly demonstrated personalisation features — a peer retailer's “For You” tab, a peer streaming service's continue-watching strip — and analyst reviews are starting to compare. A failed POC has burned cloud spend and engineering goodwill; another visible failure costs the personalisation programme its independent budget. For streaming, licensing economics depend on watch-time on long-tail catalogues — a recommender that only surfaces top 100 titles is a renewal-negotiation problem.

Personalisation is one of the highest-return uses of machine learning in any consumer-facing business, but only when the recommendations are relevant, timely, and explainable. We build for bilingual catalogues, privacy obligations, and the seasonal rhythms of the region as first-class concerns rather than afterthoughts.

Catalogues in the UAE rarely sit cleanly in one language. Product names mix Arabic and English, brand spellings vary, and customer reviews land in both languages along with Romanised Arabic. Our recommendation pipelines treat this as the default rather than an edge case.

UAE and GCC data-protection regimes are tightening and customer expectations around privacy continue to rise. We build systems that operate on minimal personal data, honour consent flags by default, and keep audit trails of which signals were used for which recommendation. Sensitive attributes (nationality, religion, health) are excluded from features unless there is a clear, lawful and documented basis for inclusion.

Click-through rates are easy to optimise and easy to game. We instrument the system against the metrics that actually matter to the business (incremental revenue, basket size, repeat purchase rate, retention, or product-level margin) using proper A/B testing with statistical guardrails.

The Brocode ranking stack

Six layers, named tool by tool, chosen for the catalogue you actually have.

No black-box vendor service. The customer keeps the trained embeddings, the eval harness, and the feature definitions.

Layer · Retrieval

Two-tower (TensorFlow Recommenders) + ANN on Vespa / Milvus

Optional fall-through to Elasticsearch lexical for cold inventory. For media: multimodal item tower using CLIP and Whisper-generated transcripts.

Layer · Ranking

DeepFM · DIN · DCN-v2 · BST

Chosen per use case. CatBoost as a strong tabular fallback for thin-data segments. Learning-to-rank with a multi-objective loss.

Layer · Cold-start

CLIP + bge-m3 + Arabic canonicaliser + LinUCB / Thompson Sampling

Item-content embeddings, multilingual text embeddings, brand-name canonicaliser, contextual-bandit head for new users and new items.

Layer · Session

GRU4Rec + event-cycle embeddings

Ramadan / Eid / Back-to-School / White Friday boost embeddings learned from prior-cycle behaviour.

Layer · Business rules

YAML-declared overlay + uplift modelling

Stock-out, margin floor, exclusivity declared in YAML, enforced before render. CausalML uplift so the system pushes items where treatment actually moves the user.
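
A minimal sketch of how a declared rule overlay might be enforced on a ranked list before render. The rule keys, the `Item` fields and the `apply_rules` function are hypothetical. The rules are shown as the dictionary a YAML loader would return, so the example runs without a YAML dependency.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical rule set (assumption), equivalent to this YAML:
#   exclude_out_of_stock: true
#   margin_floor: 0.12
#   exclusive_segments:
#     vip_only: [vip]
RULES = {
    "exclude_out_of_stock": True,
    "margin_floor": 0.12,
    "exclusive_segments": {"vip_only": ["vip"]},
}

@dataclass
class Item:
    sku: str
    in_stock: bool
    margin: float                      # gross margin as a fraction of price
    exclusivity: Optional[str] = None  # key into rules["exclusive_segments"]

def allowed(item: Item, user_segment: str, rules: dict) -> bool:
    """Return True if the item passes every declared rule for this user."""
    if rules.get("exclude_out_of_stock") and not item.in_stock:
        return False
    if item.margin < rules.get("margin_floor", 0.0):
        return False
    if item.exclusivity is not None:
        segments = rules.get("exclusive_segments", {}).get(item.exclusivity, [])
        if user_segment not in segments:
            return False
    return True

def apply_rules(ranked: List[Item], user_segment: str, rules: dict = RULES) -> List[Item]:
    """Filter a ranked list; the model's ordering of surviving items is preserved."""
    return [item for item in ranked if allowed(item, user_segment, rules)]
```

Keeping the overlay as a pure filter means the rules can change without retraining, and the model's order is never silently reshuffled by business logic.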

Layer · Serving

Triton + ONNX · Feast + Redis

< 40 ms p99 latency on the personalised row. Feature store on Feast with Redis online layer; offline store on Snowflake or BigQuery.

Eval and A/B

The uplift dashboard the CDO can defend at the board.

Offline replay with counterfactual estimation. Online holdout on GrowthBook or LaunchDarkly. Per-segment lift, statistical significance, peeking guardrails — exposed to the customer’s team from day one.

  • 50/50 stratified holdout with declared MDE before launch
  • Counterfactual offline replay as a parallel evidence stream
  • Per-segment lift (cold-start, returning, long-tail, top-100)
  • Regression suite that flags peeking and stratification breaks
  • Uplift dashboard owned by the personalisation team, not the vendor
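
The arithmetic behind a declared MDE and a significance check can be sketched with the standard two-proportion formulas. This is an illustration under stated assumptions: the function names are hypothetical, the defaults (two-sided alpha of 0.05, 80% power) are conventional choices rather than figures from this page, and it is not the harness itself.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.8):
    """Users needed per arm to detect an absolute CTR change of mde_abs."""
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / mde_abs ** 2)

def two_proportion_z(clicks_c, n_c, clicks_t, n_t):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_c, p_t = clicks_c / n_c, clicks_t / n_t
    pooled = (clicks_c + clicks_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical inputs: a 2.1% baseline CTR with a 0.5pt absolute MDE,
# then made-up counts of 10,000 users per arm at 2.1% and 6.4% CTR.
n_required = sample_size_per_arm(0.021, 0.005)
z, p = two_proportion_z(210, 10_000, 640, 10_000)
```

Fixing the sample size from the MDE before launch, and only reading the test once that size is reached, is one simple guard against peeking.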

Public holdout — UAE retailer ref

Homepage “For You” row CTR: 2.1% → 6.4%

14-week 50/50 A/B holdout across all returning users. AOV up 9.8% versus control. Long-tail SKU impressions up 41%. Top-100 SKU revenue held flat — the lift was incremental, not cannibalised. The methodology is published; the customer can defend the number on its own merits.

CTR

6.4%

AOV

+9.8%

Long-tail

+41%

How we compare

Algolia, Coveo, AWS Personalize, Bloomreach / Dynamic Yield — on the same cold-start replay set.

All vendors are scored on the same shared 240,000-SKU Arabic-heavy catalogue. The numbers below follow the methodology published in the lead magnet and are reproducible.

Capability · Brocode · Algolia AI Recommend · Coveo · AWS Personalize · Bloomreach / Dynamic Yield

Cold-start HR@10 on Arabic-heavy 240K-SKU catalogue

Net-new users, no history. Published on the shared replay set in the lead magnet.

  • Brocode: 0.41
  • Algolia AI Recommend: 0.18
  • Coveo: 0.22
  • AWS Personalize: ~0.20
  • Bloomreach / Dynamic Yield: 0.30

Arabic brand-spelling canonicaliser

Adidas / أديداس / addidas collapsed into one item embedding.

Ramadan / Eid / White Friday event-cycle embeddings

Learned from prior-cycle behaviour, not a static feature flag.

Recommender-first vs search-first architecture

Algolia / Coveo stay for search; Brocode inserts for the personalised row.

  • Brocode: recommender-first
  • Algolia AI Recommend: search-first, personalisation bolted on
  • Coveo: search-first, personalisation bolted on
  • AWS Personalize: recommender-first
  • Bloomreach / Dynamic Yield: recommender-first

Customer keeps trained embeddings + eval harness

No vendor-managed model black-box.

< 40 ms p99 latency on personalised row

Triton + ONNX-exported models. Everyone hits the latency target; the lift comes from elsewhere.

In-tenant deployment (UAE / KSA hyperscaler)

AWS Middle East (UAE), Azure UAE North, OCI Abu Dhabi by default.

Public A/B holdout methodology the CDO can defend

Uplift dashboard owned by the customer.

  • Brocode: GrowthBook / LaunchDarkly harness · counterfactual offline replay
  • Algolia AI Recommend: vendor-reported
  • Coveo: vendor-reported
  • AWS Personalize: vendor-reported
  • Bloomreach / Dynamic Yield: vendor-reported
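
For readers less familiar with the HR@10 metric used in the comparison, a minimal sketch of hit rate at k follows. It assumes the common leave-one-out definition, where each user has one held-out item and a hit is counted when that item appears in the top k. Whether the benchmark uses exactly this variant is not stated here, and the function and argument names are hypothetical.

```python
def hit_rate_at_k(recommendations, held_out, k=10):
    """Fraction of users whose held-out item appears in their top-k list.

    recommendations: dict of user_id -> ranked list of item ids
    held_out:        dict of user_id -> the single item hidden from the model
    """
    if not held_out:
        return 0.0
    hits = 0
    for user_id, target in held_out.items():
        if target in recommendations.get(user_id, [])[:k]:
            hits += 1
    return hits / len(held_out)
```

Under this definition, an HR@10 of 0.41 would mean that for 41 out of every 100 evaluated users, the held-out item was somewhere in the first ten recommendations.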

Three objections worth airing

Search duplication, Arabic spelling-variants, and the 18-month build fear.

Objection 01

Algolia / Coveo already rank our catalogue. Why do we need a separate recommender — won't this duplicate effort?

Algolia and Coveo are search-first products with personalisation bolted on. The integration we recommend keeps them for search and slots Brocode in for the personalised row, the "you may also like" rail, and the email / push surfaces. Same catalogue, two specialist systems. On the shared cold-start replay set published in the lead magnet, Brocode's stack delivers HR@10 of 0.41 versus Algolia AI Recommend at 0.22.

Objection 02

Our catalogue has thousands of brand-name spelling variants and a long Arabic-only tail. AWS Personalize couldn't handle it. Can you?

Yes. The cold-start stack — CLIP item embeddings, bge-m3 multilingual text embeddings, an Arabic title canonicaliser that collapses Adidas / أديداس / addidas into one embedding, and a contextual-bandit head — is built specifically for that catalogue. The benchmark replay set in the lead magnet uses a 240K-SKU regional catalogue with the same spelling-variant problem; Brocode delivers HR@10 0.41 versus AWS Personalize at 0.18 on net-new users.

Objection 03

We need to ship in a quarter. We do not have 18 months to build a feature store and a ranking platform from scratch.

The Lift Pilot is six weeks. The production-build path is twelve weeks to first A/B test on a single surface — typically the homepage personalised row — running on Triton + ONNX with feature store on Feast + Redis. The UAE retailer reference went from 2.1% to 6.4% CTR on the personalised row in 14 weeks on a 50/50 A/B holdout. The full multi-surface estate (homepage, search, cart, push, email) lands inside 90 days.

Free download

The GCC Cold-Start Benchmark — AWS Personalize vs Algolia AI Recommend vs Coveo vs Brocode

A 32-page PDF, a downloadable replay dataset (anonymised), and a one-page CDO board-format summary. On net-new users with no history, Brocode’s two-tower + bandit stack delivers HR@10 of 0.41 versus AWS Personalize at 0.18 and Algolia AI Recommend at 0.22 on the shared replay set.

  • Test catalogue and baseline: 240K-SKU Arabic-heavy regional catalogue
  • Two-tower vs DeepFM vs DCN-v2 head-to-head
  • Multilingual CLIP + bge-m3 cold-start scoring
  • Ramadan / Eid context embeddings — how seasonality is learned
  • Triton + Vespa serving cost at GCC volumes
  • CDO board-format one-pager (printable)

Instant download. No spam. Unsubscribe any time.

Frequently asked

Cold-start, latency, A/B method, data exit.

Eight questions a CDO and a Head of Engineering share with their CISO before procurement.

  • A new user gets a sensible first ranking from session 1, not session 20. The contextual-bandit head selects from CLIP-derived item candidates conditioned on context — country, city, time of day, prior-cycle behaviour at the segment level (e.g., Ramadan iftar window in Riyadh). As behaviour accumulates, the system shifts from bandit exploration to two-tower retrieval. The transition is gradual; the customer never sees a popularity-only fallback.
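
As an illustration of the contextual-bandit head described in the answer above, here is a minimal disjoint LinUCB sketch in pure Python. The context encoding (for example one-hot flags for city and time window), the `alpha` exploration weight and the class and function names are assumptions for the example. It does not show the hand-over to two-tower retrieval.

```python
import math

class LinUCBArm:
    """One candidate item with a ridge-regression reward model over context."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha
        # A_inv starts as the identity (ridge prior); b accumulates reward * context.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _matvec(self, x):
        return [sum(row[j] * x[j] for j in range(len(x))) for row in self.A_inv]

    def score(self, x):
        """Estimated reward plus an upper-confidence bonus for context x."""
        theta = self._matvec(self.b)
        mean = sum(t * xi for t, xi in zip(theta, x))
        a_inv_x = self._matvec(x)
        bonus = self.alpha * math.sqrt(sum(xi * v for xi, v in zip(x, a_inv_x)))
        return mean + bonus

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of A_inv, then accumulate b."""
        a_inv_x = self._matvec(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]

def choose(arms, x):
    """Pick the item id whose arm has the highest upper-confidence score."""
    return max(arms, key=lambda item_id: arms[item_id].score(x))
```

The confidence bonus shrinks as an item accumulates impressions in a given context, which is the mechanism that lets exploration taper off as behaviour accumulates.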

Lift Pilot

Six weeks, free, on your event logs.

An offline replay lift estimate, a recommended ranking-stack architecture, and a board-format CDO summary — before any contract is signed.

What you receive

  • Offline replay lift estimate on your event logs
  • Per-segment cold-start vs warm-start scoring
  • Recommended ranking-stack architecture and serving plan
  • CDO board-format one-pager you can defend

Quote request

Start a free 6-week Lift Pilot on your event logs

Send us anonymised event logs. We will return an offline replay lift estimate, a recommended ranking-stack architecture, and a board-format CDO summary — before any contract.

Prefer chat? Message us on WhatsApp — we'll see it within working hours.
