The $800M question nobody is asking

Salesforce just reported Agentforce at roughly $800M in annual recurring revenue, growing 169% year over year. That is not pilot money. That is production money, at scale, across thousands of enterprise deployments.

The same week, UC Berkeley researchers published results showing they could score 100% on multiple leading AI agent benchmarks — without solving a single underlying task. A separate team gamed 890 benchmark tasks with a single-character change. Meanwhile, ClawBench, a set of 153 real production website tasks, found the top-performing agent passes only a third of them.

So here is the picture: enterprises are spending at an $800M run rate with a single vendor on agentic systems, while the benchmarks used to evaluate agents can be gamed to perfection without doing actual work.

That gap — between deployment velocity and trust infrastructure — is the story of enterprise AI right now. And almost nobody is treating it as the priority it deserves to be.

What trust infrastructure actually means

When I say trust infrastructure, I do not mean "responsible AI" slide decks or ethics committees that meet quarterly. I mean the operational layer that allows an enterprise to put an agent into a production workflow and answer, at any point, a set of very specific questions:

What did the agent do?
Why did it do it?
What data did it use to make that decision?
Who authorised it to act in that scope?
Can we reverse what it did?
Can we prove all of this to a regulator?

Today, in the majority of enterprise deployments I see, most of these questions do not have clean answers. The agent works. The demo impresses. The pilot gets approved. But the decision ledger — the structured record of what happened, why, on whose authority, and with what evidence — does not exist in a form that would survive a compliance review.
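To make the idea of a decision ledger concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any vendor's product: the names DecisionRecord, DecisionLedger, and every field are hypothetical, and hash chaining is simply one common way to make a record tamper-evident. Each field maps to one of the questions above.

```python
from dataclasses import asdict, dataclass, replace
from datetime import datetime, timezone
import hashlib
import json

GENESIS_HASH = "0" * 64  # stand-in "previous hash" for the first record


@dataclass(frozen=True)
class DecisionRecord:
    """One entry in a hypothetical decision ledger (all field names are illustrative)."""
    agent_id: str        # which agent acted
    action: str          # what the agent did
    rationale: str       # why it did it
    data_sources: tuple  # what data it used
    authorised_by: str   # who authorised it to act in that scope
    reversible: bool     # whether the action can be reversed
    timestamp: str       # when it happened (UTC, ISO 8601)
    prev_hash: str       # digest of the previous record, forming a chain

    def digest(self) -> str:
        # Hash a canonical JSON form so the same record always yields the same digest.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class DecisionLedger:
    """Append-only list of records, each linked to the one before it."""

    def __init__(self):
        self._records = []

    def append(self, *, agent_id, action, rationale, data_sources,
               authorised_by, reversible):
        prev_hash = self._records[-1].digest() if self._records else GENESIS_HASH
        record = DecisionRecord(
            agent_id=agent_id,
            action=action,
            rationale=rationale,
            data_sources=tuple(data_sources),
            authorised_by=authorised_by,
            reversible=reversible,
            timestamp=datetime.now(timezone.utc).isoformat(),
            prev_hash=prev_hash,
        )
        self._records.append(record)
        return record

    def verify(self) -> bool:
        # Walk the chain: every record must point at the digest of its predecessor.
        expected = GENESIS_HASH
        for record in self._records:
            if record.prev_hash != expected:
                return False
            expected = record.digest()
        return True


if __name__ == "__main__":
    ledger = DecisionLedger()
    ledger.append(agent_id="permit-agent", action="approve_permit",
                  rationale="all required documents present",
                  data_sources=["application_form", "employer_registry"],
                  authorised_by="ministry-policy-v1", reversible=True)
    ledger.append(agent_id="permit-agent", action="notify_applicant",
                  rationale="permit approved",
                  data_sources=["application_form"],
                  authorised_by="ministry-policy-v1", reversible=False)
    print(ledger.verify())  # chain is intact

    # Simulate someone rewriting history on the first record.
    ledger._records[0] = replace(ledger._records[0], authorised_by="someone-else")
    print(ledger.verify())  # the second record no longer matches its predecessor
```

A chain like this only shows that an earlier record was altered after later ones were written. It does not protect the most recent record on its own, and it says nothing about whether the rationale is truthful. A real deployment would also need the latest digest anchored somewhere outside the agent's control, plus access controls and retention rules.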

This is not a theoretical concern. It is a procurement blocker. The enterprises that have governance-ready agents will close deals that the ones running ungoverned pilots cannot.

Two jurisdictions, two approaches

The contrast between what happened in the UAE and the EU in the same two weeks makes the point clearly.

On May 20, more than 400 UAE ministers and federal officials gathered in Abu Dhabi for a national agentic AI retreat. Four live government agents were unveiled, handling tax audits, work permits, customer support, and cabinet operations. A formal governance framework, signed by the Cabinet, defines each ministry's implementation responsibilities. Training is under way for 80,000 government workers, with agentic AI proficiency tied directly to promotions and performance reviews. The target is 50% of federal services run by agents within two years.

This is not a roadmap. It is a running system with a governance layer designed in from the start. I have been watching this unfold from Dubai, and the pace is real.

The EU took a different path. The AI Act Omnibus, agreed in early May, extended the high-risk compliance deadline to December 2027. Some are calling it breathing room. I think it is the opposite. The governance requirements did not move. GDPR enforcement continues. Banking, insurance, and healthcare regulators are already asking the hard questions. Companies that use the extension to slow down their governance work will find December 2027 arrives the same way every compliance deadline does — faster than planned, with less time to fix things than expected.

Two models: build first, govern in motion. Or regulate first, extend deadlines.

The enterprises I work with are watching both and trying to figure out which playbook actually de-risks their deployment.

The industry is starting to admit the gap

The most telling signal this week was not a product launch. It was a programme.

ServiceNow and Accenture announced a Forward Deployed Engineering programme — embedding engineers directly inside enterprise customers to move agentic AI from pilot into production. Jensen Huang called ServiceNow "the operating system of enterprise AI agents." ServiceNow itself repositioned its AI Control Tower as a security operating system for agents.

When two of the largest enterprise technology companies in the world launch a programme specifically to help customers cross the pilot-to-production gap, they are telling you something important: the software alone is not sufficient. The trust layer — governance, audit, identity, human oversight — has to be built alongside, not after.

IBM published a research paper the same week titled "Governance by Construction," arguing that governance should be embedded into agent architecture from the start, not bolted on after deployment. The phrase "by construction" is deliberate. It makes the same point the first Caden post made about day-zero governance: if you add it later, it is always patchwork.

What this means if you are making decisions right now

If you are leading AI strategy, procurement, or risk inside an enterprise, the question is no longer whether to deploy agents. The market has moved past that. Salesforce's $800M says so. The UAE's live government agents say so. Your competitors' roadmaps say so.

The question is whether your deployment has a trust layer that can answer the six questions I listed above. If it does not — if your agents are in production without a decision ledger, without clear authority boundaries, without an audit trail that a regulator would accept — you are carrying risk that scales with every workflow you automate.

The enterprises that build governance into the architecture from day zero will be the ones that scale. The ones that bolt it on later will spend 2027 and 2028 retrofitting systems that were never designed to be auditable.

I keep coming back to the same conclusion I wrote about in the first post: this is not a technology problem. It is an operating-model problem. The technology works. The trust infrastructure does not exist yet. Building it is the harder, slower, less glamorous work. It is also the work that determines which deployments survive.

Caden Tech is a practice for enterprises moving past AI pilots toward measured operating-model transformation. Based in the UAE, serving clients globally.

To discuss whether your organisation is ready for an agentic operating-model engagement, book a discovery call or write to [email protected].
