Insight / signal
The next AI agency won't sell outputs. It will manage agents.
Most AI risk in normal businesses will not look like a superintelligence escaping a lab. It will look like a tool with just enough access to be useful and just enough confidence to be dangerous.
The most useful AI story this week was not another model release.
It was an agent apparently making a mess in Fedora.
LWN reported that what looked like an unsupervised or compromised AI setup had been reassigning bugs, closing issues, posting plausible-sounding comments, and getting at least one questionable patch merged into the Anaconda installer. In one case, it reportedly kept producing LLM-generated justifications for its decisions until a maintainer gave way.
Nobody set out to build a chaos engine. That is the point.
This is not a Hollywood scenario. No robot uprising. No rogue superintelligence. Just a tool with enough access to be useful and enough confidence to be dangerous. A ticket changed without context. A comment that sounds right and quietly moves a decision in the wrong direction. Apparently competent. Actually untethered.
That is what AI risk looks like in normal businesses. Not the dramatic version. The boring version.
And the boring version is the one worth taking seriously.
Because agents are crossing a line now.
They are no longer just autocomplete having a good day. They can write code, create assets, open tickets, trigger workflows, draft replies, update databases, and route messages. Connect one to a business system and it is not just producing text any more. It is touching the company’s nervous system.
That changes the question.
Everyone has spent two years asking “can AI do this task?” The answer, increasingly, is yes — parts of it, sometimes very well, sometimes badly in a way that still looks polished enough to cause problems.
The better question is: who is running the system around the agent?
Because that layer — the permissions, the context, the review gates, the budgets, the logs, the stop buttons, the accountability — is exactly what most businesses are not building.
And it is where the next serious AI service lives.
The cheap AI agency pitch has mostly been about output. More blog posts, more social posts, more ads, more email drafts, more landing pages. Fine. Some of that has been useful. Most of it has been landfill with a nicer font.
Output is not the interesting layer any more.
The web is already drowning in AI-assisted content, and a meaningful chunk of the traffic reading it may not even be human. Making more is not a premium service. It is a race to the bottom against machines that will always make more faster and cheaper.
The interesting layer is managed execution.
Who defines the loop? Who decides what good looks like? Who gives the agent real context? Who sets the budget? Who checks the work? Who sees the logs? Who handles escalation? Who tells the client where AI should not be used yet?
That is the work.
It is less glamorous than the demos. Good. Glamour is usually where the waste is.
You can see the same shift in the way the big vendors are talking.
OpenAI’s recent messaging is full of talk about trust, provenance, enterprise governance, and scaling AI inside serious institutions. Anthropic has told some enterprise customers that access to its more capable models will require 30-day retention of prompts and outputs for trust and safety review.
That last one matters more than it sounds. It is the sign of a tradeoff arriving.
As the models get more capable, the operational questions get sharper. What data gets retained? What work gets reviewed? What counts as misuse? What happens when the model is good enough to assist with genuinely sensitive things?
Most marketing teams are not equipped to parse that every week. Most business owners are not either. They are busy enough trying to ship campaigns, follow up leads, clean CRM data, and work out why their analytics now look like a bot convention.
So there is a gap.
And the gap is not “who can prompt better?”
Prompting is table stakes. A good prompt does not give you governance. It does not create audit trails. It does not decide permissions. It does not stop a confident agent from doing the wrong thing twice while everyone is on a call.
The agentic loops that actually work are not wide-open. They are bounded.
Fixed task. Real context. Clear definition of good. Cost limit. Review gate before anything external. Logs. Escalation path. Human approval where the consequences matter.
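To make those bounds concrete, here is a minimal Python sketch of one bounded run. Everything in it is an assumption made for illustration: the `BoundedRun` class, the `BudgetExceeded` error, the cost units, and the idea of queuing external actions for approval are hypothetical, not any vendor's API.

```python
from dataclasses import dataclass, field
from typing import List


class BudgetExceeded(Exception):
    """Raised when a step would push the run past its cost limit."""


@dataclass
class BoundedRun:
    task: str            # fixed task, set once for the run
    cost_limit: float    # hard cap for this run (assumed currency units)
    spent: float = 0.0
    log: List[str] = field(default_factory=list)
    pending_review: List[str] = field(default_factory=list)

    def step(self, description: str, cost: float, external: bool = False) -> None:
        """Record one unit of agent work, refusing anything over budget."""
        if self.spent + cost > self.cost_limit:
            # Escalation path: log the refusal and stop instead of overspending.
            self.log.append(f"ESCALATE: '{description}' would exceed budget")
            raise BudgetExceeded(description)
        self.spent += cost
        if external:
            # Review gate: the work is prepared, but a human must release it.
            self.pending_review.append(description)
            self.log.append(f"QUEUED for approval: {description}")
        else:
            self.log.append(f"DONE: {description}")


run = BoundedRun(task="draft proposal", cost_limit=1.0)
run.step("summarise approved brief", cost=0.4)
run.step("email draft to client", cost=0.3, external=True)
```

After these two steps the email sits in `run.pending_review` rather than in the client's inbox, and a further step costing 0.5 would be refused and logged. The design choice is that the limits live in the system, not in the prompt.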
A loop that drafts a proposal from approved context and waits for review. A loop that checks a landing page against a real conversion checklist. A loop that surfaces repeated support questions so a human can decide what to fix. A loop that monitors ad performance against agreed thresholds and flags anomalies rather than acting on them.
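The last of those loops might look something like this in Python. It is a sketch under stated assumptions: the metric names, the threshold values, and the `Flag` type are invented for the example and do not come from any real ad platform.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical thresholds agreed with the client (assumed values).
THRESHOLDS: Dict[str, Dict[str, float]] = {
    "cost_per_lead": {"max": 40.0},
    "click_through_rate": {"min": 0.01},
}


@dataclass(frozen=True)
class Flag:
    metric: str
    value: Optional[float]
    reason: str


def check_metrics(metrics: Dict[str, float]) -> List[Flag]:
    """Return flags for a human to review. Never changes the ad account."""
    flags: List[Flag] = []
    for name, limits in THRESHOLDS.items():
        if name not in metrics:
            # Missing data is itself an anomaly worth a human look.
            flags.append(Flag(name, None, "metric missing"))
            continue
        value = metrics[name]
        if "max" in limits and value > limits["max"]:
            flags.append(Flag(name, value, f"above agreed max {limits['max']}"))
        if "min" in limits and value < limits["min"]:
            flags.append(Flag(name, value, f"below agreed min {limits['min']}"))
    return flags


flags = check_metrics({"cost_per_lead": 55.0, "click_through_rate": 0.02})
```

The function has no way to pause a campaign or move budget. All it can do is hand a list of flags to a person.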
Boring work, basically.
But that is where the money is. Boring means you can design it properly. Boring means it is repeatable and checkable. Boring means clients will pay for it to keep running.
The commercial model has to change too.
Hourly billing is already awkward when AI compresses delivery time. Work that used to take 20 hours now takes five. The agency either earns less, lies about time, or explains the gap in a way that erodes trust. None of those is a strategy.
A managed operating layer is different.
The client pays for the system: the loops, the context, the monitoring, the integrations, the review gates, the improvements, the commercial judgement about where AI should and should not be used. Month after month, because the system does not finish. It runs.
Vendors change their rules. Models change their behaviour. Costs move. Client context gets stale. A workflow that worked in March needs tightening in June.
That is not project work. That is operations.
And operations is a retainer, not an invoice.
The post-agency company will not win by producing more assets than everyone else. AI makes asset production abundant. Abundance kills the price of undifferentiated output.
It wins by building the layer clients can trust.
That means knowing when to automate and when not to. Treating agents like junior operators with tools, limits, supervision, and logs. Being able to explain the tradeoffs honestly.
The questions that actually matter are dull:
What can this agent touch? What can it never touch? What data is it allowed to use? What does good look like? What happens if it is unsure? Who approves external messages? What is the monthly spend limit? Where are the logs? How do we roll back? Who owns the outcome?
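Several of those questions can be written down as a policy the system enforces rather than a document nobody reads. A minimal Python sketch, assuming a hypothetical `AgentPolicy` type and made-up system names:

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AgentPolicy:
    can_touch: FrozenSet[str]      # systems the agent may act on
    never_touch: FrozenSet[str]    # systems that are always off limits
    approver: str                  # who approves external messages
    owner: str                     # who owns the outcome
    monthly_spend_limit: float     # assumed currency units

    def decide(self, system: str) -> str:
        """Return 'allow', 'deny', or 'escalate' for a requested system."""
        if system in self.never_touch:
            return "deny"
        if system in self.can_touch:
            return "allow"
        # Not listed either way: treat it as "unsure" and ask a human.
        return "escalate"


# Hypothetical values, for illustration only.
policy = AgentPolicy(
    can_touch=frozenset({"crm_notes"}),
    never_touch=frozenset({"billing"}),
    approver="account lead",
    owner="agency ops",
    monthly_spend_limit=500.0,
)
```

Here `policy.decide("billing")` returns `"deny"`, and a system nobody thought to list returns `"escalate"` instead of a guess. The `owner` field is just a string, but it cannot be left blank.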
That last one is the commercial question.
If nobody owns the outcome, the client bought a toy.
If the agency owns the operating layer, the client bought leverage.
The Fedora story is useful because it is not dramatic. It is mundane. A bit of autonomy, a bit of access, a bit of plausible language, and the system creates work instead of removing it.
Every business adopting AI will face a version of that. Not in open source repos. In HubSpot. In Slack. In their ad accounts. In shared drives full of old sales decks. In customer service threads. In places where a confident wrong answer can travel a long way before anyone checks.
So the opportunity is not to shout louder about AI.
It is to build the rails.