
AI agents are leaving the chat box. The boring bit is where the money is.

Published: 12/06/2026

Format: Insight

The most interesting AI news this week was not a flashy demo.

It was plumbing.

OpenAI announced plans to acquire Ona to expand Codex with secure, persistent cloud environments for long-running agents. It also made its models available through Oracle Cloud commitments, with the usual enterprise words attached: security, governance, deployment. Anthropic, meanwhile, announced a DXC alliance to put Claude into the systems used by banks, airlines, insurers, and government agencies. DXC is already running Claude inside OASIS, its managed-services platform, handling routine IT work through agentic workflows.

That is the shift worth watching.

Not “AI writes a better email.” Not “this prompt made our landing page 12% punchier.” AI is moving from the chat box into the operating layer of the business.

For the last couple of years, most businesses have treated AI like a very clever intern. Open the tool. Ask for a draft. Copy the useful bits. Tidy the mess. Repeat.

That is useful. I still do it. Everyone serious does it.

But it is not the end state.

The end state is an agent with access to a defined slice of company context, running against a specific workflow, checking its own output, logging what it did, escalating when uncertain, and pausing for a human where judgment matters. Less sexy. Much more commercially useful.

This is why “persistent cloud environment” matters even though it sounds dull. Good. Dull is where real business adoption lives.

A model in a blank chat window forgets the company every time you start a new session. A proper agent system needs memory, files, permissions, tool access, cost controls, and a clear job. It needs to know what good looks like and what it is not allowed to touch.

That is not a prompt problem. It is an operating-system problem.

The same signal came through in the podcasts I listened to this week. The best AI operators are building loops, not collecting clever questions.

One episode broke down agentic loops properly: a human sets the direction, the agent does the work, reviews itself against a defined standard, and keeps iterating inside a bounded lane. The important word is bounded. Open-ended agents are where token bills and nonsense go to breed.
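For anyone who wants to see the shape of that loop, here is a minimal sketch in Python. It is an illustration under assumptions, not anything the episode or a vendor published: `run_bounded_loop`, `LoopResult`, the 0.8 pass score, and the three-iteration cap are all hypothetical, and the toy worker and reviewer stand in for real model calls so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    output: str
    score: float
    iterations: int
    escalated: bool  # True when the loop hands the work to a human
    log: List[str] = field(default_factory=list)


def run_bounded_loop(
    brief: str,
    do_work: Callable[[str, str], str],  # (brief, previous draft) -> new draft
    review: Callable[[str], float],      # draft -> score between 0 and 1
    pass_score: float = 0.8,             # hypothetical quality threshold
    max_iterations: int = 3,             # the bound: no open-ended retries
) -> LoopResult:
    """Iterate until the draft meets the standard or the bound is hit."""
    draft, score = "", 0.0
    log: List[str] = []
    for i in range(1, max_iterations + 1):
        draft = do_work(brief, draft)
        score = review(draft)
        log.append(f"iteration {i}: score={score:.2f}")
        if score >= pass_score:
            return LoopResult(draft, score, i, escalated=False, log=log)
    # Bound reached without passing: stop spending and escalate.
    log.append("bound reached: escalating to a human")
    return LoopResult(draft, score, max_iterations, escalated=True, log=log)


# Toy stand-ins so the sketch runs without any model or API.
def toy_worker(brief: str, previous: str) -> str:
    return previous + " point" if previous else brief


def toy_review(draft: str) -> float:
    return min(1.0, len(draft.split()) / 5)


result = run_bounded_loop("draft", toy_worker, toy_review)
```

With these toy functions the default bound is hit before the score passes, so `result.escalated` is true and the log shows each attempt. Raising `max_iterations` to 5 lets the same loop finish on its own. The cap is the design choice that matters: it turns "keep trying" into a known maximum cost.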

Another episode made the more uncomfortable point: the real bottleneck is company knowledge extraction. Most businesses have the knowledge somewhere. It lives in old proposals, Slack threads, Notion pages, support tickets, previous campaigns, and the head of whoever has been there the longest and somehow knows everything.

The AI cannot use what the business has never organised.

That is where a lot of leaders are about to get caught out. They buy the seats, maybe hire someone with “AI” in their title, then wonder why it still feels like scattered experiments. The answer is usually dull: they have not turned their knowledge into usable context. They have not decided where humans approve and where agents can act. They have not defined what success looks like.

This is also where the agency model starts to creak.

If the value is hours spent producing assets, AI is awkward. It makes production faster, which either reduces billable time or creates pressure for the client to ask where the AI discount went. Fair enough. If what you sold was effort and effort drops, the invoice looks exposed.

But if the value is the operating layer, the conversation changes entirely.

Then the work is not “we will write you ten posts.” It is “we will build the system that researches, tests, publishes, measures, and improves with your company knowledge baked in.”

That is a different offer. It is also harder to fake.

Anyone can produce more content now. Most of it is beige. Some of it is actively damaging because it sounds like every other brand that has discovered bullet points and overconfidence. The useful work is upstream and downstream of the asset.

Upstream: what do we know, what does the market actually care about, what can we safely claim, what should the agent be allowed to use?

Downstream: did it move anything, did the right people see it, did it improve the next campaign, did it tell us something worth feeding back into the system?

That is where agents become useful. Not as magic content machines. As workers inside a designed loop.

For a business owner, I would not start with “which model should we use?” It matters, but it is not first.

Five more boring questions come first:

One: which repeatable workflow is painful enough to be worth systemising?

Two: what company knowledge does that workflow need?

Three: what does good output look like, and how will we score it?

Four: where does a human need to approve, edit, or stop the machine?

Five: what cost, risk, and quality limits do we need before this runs every day?
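One way to make those five answers concrete is to write them down as a small spec before anything runs. The sketch below is a hypothetical structure, not a standard or a product feature: `WorkflowSpec`, its field names, and the example values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkflowSpec:
    workflow: str                 # One: the repeatable workflow
    knowledge_sources: List[str]  # Two: the company knowledge it needs
    pass_score: float             # Three: what good looks like, as a score from 0 to 1
    approval_steps: List[str]     # Four: where a human approves, edits, or stops it
    daily_cost_limit: float       # Five: cost limit, in currency units per day
    max_runs_per_day: int         # Five: risk limit on how often it runs

    def gaps(self) -> List[str]:
        """Return the questions that still have no usable answer."""
        missing: List[str] = []
        if not self.workflow.strip():
            missing.append("workflow")
        if not self.knowledge_sources:
            missing.append("knowledge_sources")
        if not 0 < self.pass_score <= 1:
            missing.append("pass_score")
        if not self.approval_steps:
            missing.append("approval_steps")
        if self.daily_cost_limit <= 0 or self.max_runs_per_day <= 0:
            missing.append("limits")
        return missing

    def ready(self) -> bool:
        """A spec is ready only when every question has an answer."""
        return not self.gaps()


# Illustrative values only; nothing here comes from a real client.
spec = WorkflowSpec(
    workflow="weekly campaign report",
    knowledge_sources=["old proposals", "support tickets"],
    pass_score=0.8,
    approval_steps=[],  # question four has not been answered yet
    daily_cost_limit=20.0,
    max_runs_per_day=5,
)
unanswered = spec.gaps()
```

Here `unanswered` names the one question still open, and `spec.ready()` stays false until someone decides where the human approval sits. The check is trivial on purpose: the value is in being forced to fill in every field.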

That is the grown-up version of AI adoption.

The demo phase was entertaining. The next phase is operational. If your AI plan is still “let’s get everyone using ChatGPT,” you are not doomed. You are just at step one.

Step two is deciding where AI should sit in the business. Step three is building the context, controls, and loops so it can do useful work without making a mess.

Less exciting than another model benchmark.

Also where the money is.

Jason Sibley is the founder of Cleo, a post-agency marketing and AI company. JasonVsTheNoise is where he writes about what is actually happening with AI, marketing, and how businesses should be thinking about both.