Operations floor with a Guidewire ClaimCenter case being worked alongside a LangGraph agent trace

Operations automation · agentic systems

AI agents that close the RPA exception gap.

Production-grade agents for claims, KYC, procurement, expense and credit — running inside your security perimeter, with three live workflows and a verified productivity number in ninety days.

Book an on-site Exception-Closure Audit · WhatsApp the agentic-ops lead · Download the Exception-Closure Scoring Sheet →

412

cases auto-closed today

1.9d

average handling time

0.7%

false-decision rate

trace · case CL-2026-04822 · motor FNOL

01	parse_fnol	LangGraph · Claude Sonnet 4.5	ok
02	fraud_check	tool: claims_history.search	ok
03	policy_lookup	tool: guidewire.policy.get	ok
04	supervisor_check	policy_graph: coverage_within_authority	pass
05	coverage_decide	tool: guidewire.case.update	ok
06	reserve_set	tool: guidewire.reserve.write	ok
07	notify_customer	tool: comms.whatsapp.send	ok

case status: Pending (human review) → Closed (agent)

langgraph · langfuse trace · pii redacted · exportable pdf/json

53%
RPA exceptions closed by agents (11-engagement average)
4% → 1%
False-decision rate after first 8-week feedback cycle
14d → 3d
Supplier KYC cycle, tier-1 bank ref
88%
Three-way match auto-closure, retail conglomerate ref

Why 64% of cases still hit a human

Your RPA estate handles the deterministic part. The exceptions land on a human queue.

Bots cannot read a free-text claim narrative, compare two slightly different supplier records, or decide whether a receipt actually matches an expense policy. The judgement layer is what agents replace.

Shared-services FTE budgets are frozen. Average claim handling time has slipped from 4.2 days to 5.7 days over four quarters, threatening regulator SLAs and pushing complaints to the Insurance Authority and CBUAE. The Group CEO is presenting an “AI productivity” number to the board in two quarters, and someone in operations has to put a real figure next to that line.

Three exception patterns recur across the eleven regional engagements we run today. Free-text adjudication — a claim narrative or complaint paragraph that no rule can deterministically classify. Entity reconciliation — two supplier records, two customer records, two PO lines that are nearly-but-not-quite the same. Policy interpretation — an expense receipt that breaches one of forty-three policy clauses, which a human reads in eight seconds and a rule engine cannot encode.

Agents close that gap when they are built as constrained instruments, not as autonomous explorers. The job is explicit. The capability list is explicit. The supervisor checks every action against a declared policy graph. Human-in-the-loop gates fire on monetary, regulatory and customer-impact thresholds. Replayable traces export to PDF and JSON for the regulator. The closure rate is measured against the customer baseline; false-decision rate is reported weekly.
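A minimal sketch of what "constrained instrument" can look like in code, assuming a hypothetical job brief that carries the capability list and the human-in-the-loop thresholds. The class names, tag names and the `gate` function are illustrative assumptions, not the production implementation; the AED 25,000 figure is borrowed from the motor FNOL workflow on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobBrief:
    """Hypothetical job brief: what the agent may do and when a human must step in."""
    capabilities: frozenset        # tool names the agent is allowed to call
    monetary_gate_aed: float       # amounts above this go to a human
    gated_tags: frozenset          # regulatory / customer-impact tags that always go to a human


@dataclass(frozen=True)
class ProposedAction:
    tool: str
    amount_aed: float = 0.0
    tags: frozenset = frozenset()


def gate(brief: JobBrief, action: ProposedAction) -> str:
    """Return 'reject', 'human_review' or 'auto' for a proposed action."""
    if action.tool not in brief.capabilities:
        return "reject"            # outside the explicit capability list
    if action.amount_aed > brief.monetary_gate_aed:
        return "human_review"      # monetary threshold breached
    if action.tags & brief.gated_tags:
        return "human_review"      # regulatory or customer-impact tag present
    return "auto"
```

The design point is that the thresholds live in the brief, as data the sponsors can read and sign, rather than inside a prompt.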

What an agent actually is here

LangGraph + Supervisor + Tool layer + Memory. The model router picks per step.

No one-platform-fits-all. We pick the orchestrator per workflow and the model per step, then we wrap the result in a policy-graph supervisor that survives a second-line review.

Orchestration · per workflow

LangGraph · CrewAI · Temporal · AutoGen

LangGraph for stateful branching agent graphs. CrewAI for multi-role teams where the workflow needs planner + executor + reviewer roles. Temporal for durable workflow execution that survives a Kubernetes restart. AutoGen for pair-agent patterns. The right tool for the right shape — not a one-platform pitch.

Model router · per step

Claude Sonnet 4.5 · GPT-5 · Llama 3.3 70B · Mistral-Large-2

Long-context reasoning on Claude; complex adjudication on GPT-5 reasoning; sovereign workloads on a Llama 3.3 fine-tune in customer VPC; cost-tier fallback on Mistral-Large-2.
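A rough sketch of per-step routing, assuming the router is a plain lookup table. The step kinds and the model labels below are illustrative strings that mirror the list above; they are not vendor API identifiers, and the real router may weigh cost and latency as well.

```python
# Step kind -> model label. Labels are placeholders, not real API model IDs.
ROUTES = {
    "long_context": "claude-sonnet-4.5",
    "complex_adjudication": "gpt-5",
}
SOVEREIGN_MODEL = "llama-3.3-70b-finetune"   # assumed to run inside the customer VPC
FALLBACK_MODEL = "mistral-large-2"           # cost-tier fallback


def pick_model(step_kind: str, sovereign: bool = False) -> str:
    """Pick a model label for one step of a workflow."""
    if sovereign:
        # Sovereign workloads never leave the customer VPC, whatever the step kind.
        return SOVEREIGN_MODEL
    return ROUTES.get(step_kind, FALLBACK_MODEL)
```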

Tool layer · least-privilege

SAP, Oracle, Guidewire, Duck Creek, Coupa, ServiceNow

Service accounts with scoped permissions. Step-up approvals on sensitive operations. UAE PASS and Nafath identity rails for KYC.
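One way to picture least-privilege scoping with step-up approval, as a sketch only. The service-account name, the scope table and the choice of which tool counts as sensitive are assumptions made for illustration; the tool names come from the sample trace at the top of this page.

```python
class StepUpRequired(Exception):
    """Raised when a sensitive tool call has no named human approver."""


# Hypothetical scope table: service account -> tools it may call.
SCOPES = {
    "svc-claims-agent": {
        "guidewire.policy.get",
        "guidewire.case.update",
        "guidewire.reserve.write",
    },
}
# Assumption: writing a reserve is treated as a sensitive operation.
SENSITIVE = {"guidewire.reserve.write"}


def authorize(account, tool, approved_by=None):
    """Allow a tool call only if it is in scope and, when sensitive, approved."""
    if tool not in SCOPES.get(account, set()):
        raise PermissionError(f"{account} has no scope for {tool}")
    if tool in SENSITIVE and approved_by is None:
        raise StepUpRequired(f"{tool} needs a step-up approval")
    return True
```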

Supervisor pattern

Every action checked against a declared policy graph before commit

The policy graph is co-authored with the customer second-line risk team and version-controlled alongside the agent prompt. Threshold breaches, irreversible actions, and low-confidence outcomes route to a human review queue with full context. The supervisor is the difference between an agent that closes work and an agent that the regulator will accept.
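The supervisor check can be sketched as a lookup against a declared graph of permitted next steps. This is a simplified illustration, not the production supervisor: the graph below just follows the step order of the sample FNOL trace, and the confidence floor of 0.85 is an assumed placeholder.

```python
# Declared policy graph: last committed step -> steps allowed next.
POLICY_GRAPH = {
    "start": {"parse_fnol"},
    "parse_fnol": {"fraud_check"},
    "fraud_check": {"policy_lookup"},
    "policy_lookup": {"coverage_decide"},
    "coverage_decide": {"reserve_set"},
    "reserve_set": {"notify_customer"},
}
MIN_CONFIDENCE = 0.85  # assumption: real floors would be set with second-line risk


def supervise(last_step, proposed, confidence, irreversible=False):
    """Return 'block', 'human_review' or 'commit' for a proposed step."""
    if proposed not in POLICY_GRAPH.get(last_step, set()):
        return "block"            # not a declared transition
    if irreversible or confidence < MIN_CONFIDENCE:
        return "human_review"     # routed to the review queue with context
    return "commit"
```

Because the graph is plain data, it can be version-controlled and diffed in the same review as the prompt.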

Memory & retrieval

Qdrant + Mem0

Case-history retrieval scoped per customer tenant. Long-term memory respects retention policy and supports right-to-be-forgotten on customer record.
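To make the tenant scoping and right-to-be-forgotten behaviour concrete, here is a toy in-memory stand-in. It deliberately does not use the Qdrant or Mem0 APIs; the class and method names are hypothetical, and keyword matching stands in for vector retrieval.

```python
class TenantMemory:
    """Toy case-history store, partitioned per tenant."""

    def __init__(self):
        self._items = {}  # tenant -> list of (customer_id, text)

    def add(self, tenant, customer_id, text):
        self._items.setdefault(tenant, []).append((customer_id, text))

    def search(self, tenant, keyword):
        """Search only inside the caller's tenant partition."""
        rows = self._items.get(tenant, [])
        return [text for _, text in rows if keyword.lower() in text.lower()]

    def forget_customer(self, tenant, customer_id):
        """Delete every record for one customer; return how many were removed."""
        rows = self._items.get(tenant, [])
        kept = [row for row in rows if row[0] != customer_id]
        self._items[tenant] = kept
        return len(rows) - len(kept)
```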

Use-case catalogue

Six agents we have already shipped in the region.

Each is a constrained instrument with explicit scope, capability list, supervisor policy graph and human-in-the-loop gates.

Workflow 01
Motor claims FNOL — Guidewire ClaimCenter
Parse the FNOL narrative, fraud-check against historical patterns, look up policy coverage, run reserve estimation, decide on coverage and notify the customer. Supervisor gate at coverage decisions above AED 25,000.
47% auto-close within SLA · UAE insurer ref
Workflow 02
Supplier-onboarding KYC — UAE PASS + Nafath
Pull entity records, run worldwide sanctions screening, reconcile two slightly different supplier rows, request missing documents, and assemble a clean audit pack for compliance review.
14 days → 3 days · tier-1 bank ref
Workflow 03
P2P invoice matching — SAP S/4HANA + Coupa
Three-way match PO, goods-receipt and invoice; resolve currency and tax variance within tolerance; raise structured exception only on policy breach. Human-in-the-loop on price-tolerance breaches above 3%.
88% three-way match auto-closure · retail conglomerate ref
Workflow 04
Expense audit — Workday + Concur
Read receipts (Arabic and English), match them to expense lines, apply policy thresholds, and flag the genuine outliers rather than emailing finance every entry. Reviewer sees what was checked and why.
60% touchless approval
Workflow 05
Credit memo drafting — core banking + LOS
Assemble borrower financials, run covenant compliance, draft the memo with citations, and stop at the credit-committee gate. The agent never approves — it prepares the pack the committee actually wants to read.
70% memo cycle-time reduction
Workflow 06
Customer complaints triage — ServiceNow + CRM
Read the complaint (Arabic and English), classify against the regulator taxonomy, route to the right team with a recommended response, and escalate to ombudsman-track when sentiment + regulator-keyword thresholds breach.
Complaint AHT −38%
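As an illustration of Workflow 03, the three-way match with a 3% price-tolerance gate can be sketched as below. The function name, the argument layout and the quantity rules are assumptions; only the 3% threshold comes from the workflow description, and currency and tax variance handling is left out.

```python
from decimal import Decimal

PRICE_TOLERANCE = Decimal("0.03")  # 3% price tolerance from Workflow 03


def three_way_match(po_qty, po_unit_price, received_qty, inv_qty, inv_unit_price):
    """Compare PO, goods receipt and invoice; return the routing outcome."""
    # Assumed rule: invoiced quantity must equal what was received
    # and must not exceed what was ordered.
    if inv_qty != received_qty or inv_qty > po_qty:
        return "exception"
    variance = abs(inv_unit_price - po_unit_price) / po_unit_price
    if variance > PRICE_TOLERANCE:
        return "human_review"   # price-tolerance breach goes to a human
    return "auto_close"
```

`Decimal` is used rather than `float` so that a variance sitting exactly on the tolerance is compared without binary rounding noise.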

From discovery to autonomy

The 4-week shadow run is where trust is built — before the agent commits any action.

The agent runs alongside the human processor on real cases. Disagreements are reviewed daily, calibration happens in the open, autonomy is earned one case class at a time.

Week 0
Exception-Closure Audit (free)
30-day on-site discovery. We cluster the last 90 days of RPA exceptions, score each cluster against closure feasibility, and deliver a quantified estimate of which exceptions an agent can close before any contract is signed.
No commercial commitment
Weeks 1–2
Job design + scope
Operations, risk and IT sponsors co-author the job: inputs, systems read and written, decision points, policies that must hold. Capability list is explicit; step, cost and time budgets are agreed.
Signed job brief
Weeks 3–6
Build under the supervisor pattern
LangGraph for stateful agent graphs, CrewAI for multi-role teams where appropriate, Temporal for durable execution. Every agent action checked against a policy graph before commit. PII redacted in traces.
Replayable traces from day one
Weeks 7–10
4-week shadow run
Agent runs alongside human processors on real cases without committing. Side-by-side disagreement report is produced daily. Calibration happens before the agent earns autonomy on any case class.
Daily disagreement report
Weeks 11–12
Graduated rollout
Auto-close authority extended one case class at a time. Human-in-the-loop gates retained on monetary, regulatory or customer-impact thresholds. Governance review attests each scope extension.
Risk + ops + IT sign-off
Day 90+
Run-phase ownership
Customer's operations team owns prompt management and case-class extension. Brocode stays on for retraining, model swaps and edge-case calibration. Audit-grade exports available on request to internal audit and the regulator.
Customer-owned playbook
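The daily disagreement report from the shadow run reduces to a per-case-class comparison of agent and human decisions. A minimal sketch, assuming each shadow case is recorded as a (case class, agent decision, human decision) triple; the record layout and function name are hypothetical.

```python
from collections import Counter


def disagreement_report(pairs):
    """pairs: iterable of (case_class, agent_decision, human_decision).

    Returns the share of cases per class where agent and human disagreed.
    """
    total, differed = Counter(), Counter()
    for case_class, agent, human in pairs:
        total[case_class] += 1
        if agent != human:
            differed[case_class] += 1
    return {c: differed[c] / total[c] for c in total}
```

Reporting per case class rather than overall is what lets autonomy be extended one class at a time.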

How we compare

UiPath / Automation Anywhere, Pega, AutoGPT-style frameworks, and Big-3 consultancies.

The differences that survive a procurement-committee read. The Exception-Closure Audit makes them concrete on your data, not ours.

Capability	Brocode	UiPath / AA / Blue Prism	Pega	AutoGPT / Operator demos	Big-3 consultancy
Closes RPA exceptions vs requires structured input. Eleven-engagement regional average; figure verified in the Exception-Closure Audit Playbook.	53% of exceptions auto-closed	Deterministic only	Deterministic only	Demo-grade	Strategy slides
Supervisor-Agent policy graph. Every action checked against a declared policy graph before commit.
Replayable execution trace (plan + tool calls + outcomes). Exportable in PDF and JSON for regulator inspection.	Langfuse + Arize Phoenix with PII redaction	Logs only	Logs only	Partial	N/A
Human-in-the-loop on monetary / regulatory thresholds. Mandatory gates declared in the job brief, not optional.
Co-existence with existing RPA estate (no rip-and-replace). Agents orchestrate alongside UiPath / AA / Blue Prism.	UiPath Services Partner: we close exceptions, we do not replace bots	N/A	N/A	N/A	Replacement narrative
Named regional production references. All three reference calls available at the second meeting.	UAE insurer · tier-1 bank · retail conglomerate	Region-resident	Region-resident	Demo only	Strategy slides
Regulator-ready audit trail (CBUAE / SAMA / IA). Trace + policy attestation + model card per release.
Fixed-fee delivery commitment. No 18-month strategy programme.	90-day fixed scope with three live workflows	Per-bot licence	Per-bot licence	Time-and-materials	26-week assessment

Three objections worth airing

Hallucination, audit trail, and co-existence with the RPA estate.

Objection 01

“RPA already failed to handle our exceptions reliably. Why will an LLM-based agent be different — won't it just hallucinate a decision?”

RPA failed because rule-driven bots cannot read a free-text claim narrative or compare two slightly different supplier records. Agents pass that judgement layer through an LLM under explicit constraints: a policy-graph supervisor, mandatory tool calls into systems of record, and a deterministic policy check before any commit. The UAE insurer reference auto-closes 47% of FNOL cases with a false-decision rate of 0.7%; the same workload on rule-driven RPA reached 14% deflection with no audit trail anyone could defend.

Objection 02

“Our regulator (CBUAE / Insurance Authority / SAMA / SCA) will not accept a black-box decision on a claim. How do we explain a denial to a customer or a regulator?”

Every agent action emits a replayable trace: the plan, the tool calls, the policy checks, the inputs read, the outputs written, and the supervisor verdict. The trace exports in PDF and JSON. The customer-facing denial cites the policy clause and the evidence; the regulator-facing pack adds the model card, the policy graph and the audit log. We routinely walk through the trace with second-line risk before launch — it is engineered to survive that meeting, not just the demo.
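A small sketch of the JSON half of that export, assuming a trace is a list of step records with a free-text `input` field. The field names and the two redaction patterns are illustrative assumptions; real PII redaction would cover far more than emails and international phone numbers.

```python
import json
import re

# Deliberately narrow patterns, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+\d[\d\s-]{7,}\d")


def redact(text):
    """Mask emails and +-prefixed phone numbers in free text."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def export_trace(case_id, steps):
    """Serialise a trace to JSON with the input of every step redacted."""
    safe_steps = [{**step, "input": redact(step["input"])} for step in steps]
    return json.dumps({"case": case_id, "steps": safe_steps}, indent=2)
```

Redacting at export time keeps the stored case record intact while the copy that leaves the perimeter carries no raw contact details.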

Objection 03

“We already have UiPath / Automation Anywhere licences and a CoE. We are not throwing that away — how do you co-exist?”

We are a UiPath Services Partner. The agent orchestrates alongside existing bots; the deterministic part of a claim or invoice still runs on UiPath, and the agent picks up where the bot drops the exception. Your CoE keeps the platform investment and inherits a new layer of judgement-capable workflows it previously could not deliver. The retail-conglomerate reference is exactly this co-existence pattern on SAP S/4HANA + UiPath + Coupa.

Free download

The RPA Exception-Closure Audit Playbook

A 34-page PDF and a downloadable Exception-Closure Scoring Sheet (Excel) for an operations team to apply to its own backlog. The methodology behind the 53% closure figure across eleven regional engagements.

Audit methodology — 30 days, on-site, no commercial commitment
Top 8 exception classes by cost across regional engagements
Agentic resolution patterns per class
The Supervisor-Agent pattern (LangGraph + Temporal)
Side-by-side scoring: UiPath, AutoGPT-style, Brocode LangGraph stack on 1,000 cases
Sample Exception-Closure SLA (the fee-back model)

Frequently asked

Hallucination, audit, deprecation, ROI methodology.

Eight questions risk, operations and IT raise in the first session.

Three structural mitigations. The agent must call a tool against a system of record for every fact it claims (no free-floating reasoning on customer or policy data). The supervisor checks every proposed action against a declared policy graph before commit. The human-in-the-loop gate is mandatory on monetary, regulatory or customer-impact thresholds — the job brief makes these explicit, not implicit. We measure false-decision rate as a first-class metric and report it weekly.
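For the weekly false-decision figure, the arithmetic is simple enough to sketch. This assumes each reviewed decision is recorded as a date plus a flag saying whether the reviewer judged it correct; the record shape and function name are hypothetical, and grouping by ISO week is one reasonable choice among several.

```python
from collections import defaultdict
from datetime import date


def weekly_false_decision_rate(decisions):
    """decisions: iterable of (date, was_correct).

    Returns {(iso_year, iso_week): false_decisions / total_decisions}.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [total, wrong]
    for day, was_correct in decisions:
        iso = day.isocalendar()
        key = (iso[0], iso[1])
        counts[key][0] += 1
        if not was_correct:
            counts[key][1] += 1
    return {key: wrong / total for key, (total, wrong) in sorted(counts.items())}
```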

Exception-Closure Audit

A free 30-day on-site discovery. No commercial commitment.

We take your last 90 days of RPA exceptions, cluster them by closure pattern, and produce a quantified estimate of which an agent can close — before any contract is signed.

What you receive

· An on-site exception-cluster map with cost per class
· A quantified closure estimate per cluster
· A draft job brief for the top-three workflows
· A 90-day fixed-fee proposal anchored to your baseline
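The cluster map and closure estimate can be pictured with a deliberately simple sketch. Here "clustering" is reduced to grouping by an exception reason code, and the feasibility weights are inputs the caller supplies; both are simplifying assumptions for illustration and not the audit methodology itself.

```python
from collections import defaultdict


def rank_clusters(exceptions, feasibility):
    """Rank exception clusters by the cost an agent could plausibly address.

    exceptions: iterable of (reason_code, handling_cost_aed)
    feasibility: reason_code -> closure feasibility between 0 and 1
    Returns a list of (reason_code, total_cost, addressable_cost), largest first.
    """
    cost = defaultdict(float)
    for reason, handling_cost in exceptions:
        cost[reason] += handling_cost
    scored = [
        (reason, total, total * feasibility.get(reason, 0.0))
        for reason, total in cost.items()
    ]
    return sorted(scored, key=lambda row: row[2], reverse=True)
```

A cluster with no feasibility score is treated as not closable, so it sinks to the bottom of the ranking instead of being silently dropped.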


Related capabilities and stories

Generative AI Development

Document Intelligence

MLOps & AI Infrastructure

Responsible AI & Governance

Banking & Financial Services