Insight / signal
Prompting is giving way to agent management
For the last two years, most AI advice has been obsessed with prompts.
Write better prompts. Build a prompt library. Add more context. Use this magic structure. Pretend there is a secret incantation that separates the amateurs from the enlightened.
There was some truth in it. Better instructions do help. But that phase was never going to be the main event.
The more interesting shift is happening now. AI tools are moving from answering questions to owning bits of work.
That sounds subtle. It is not.
A chatbot answers. An agent goes away and does something. It researches. It checks files. It writes code. It reviews a page. It builds a small app. It watches a task and comes back with a result. Sometimes it runs in the background while you do something else.
That is when AI stops being a writing toy and becomes an operations problem.
And operations problems are where most businesses get found out.
The useful signal this week is not one model release or another breathless demo. It is a cluster of product changes all pointing in the same direction.
Hermes has async subagents now. The old pain was obvious. You could hand a job to a child agent, but the main session froze while it worked. Fine for a quick lookup. Grim if the job takes ten minutes and you are stuck watching the spinner like a lemon. The new pattern is better. Delegate the work, get a task ID back, keep going, check status, steer it if needed, collect the result later. That is much closer to how real work actually happens.
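To make the shape of that pattern concrete, here is a minimal sketch in Python using the standard asyncio library. It is not Hermes' API or anyone else's. The Delegator class, its method names and the fake_research job are all hypothetical, invented purely to show the sequence: delegate, get a task ID, check status, collect later.

```python
import asyncio
import itertools


class Delegator:
    """Hypothetical helper: hands jobs to background tasks and tracks them by ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._tasks = {}

    def delegate(self, job):
        # Schedule the coroutine and return straight away with a task ID.
        task_id = f"task-{next(self._ids)}"
        self._tasks[task_id] = asyncio.create_task(job)
        return task_id

    def status(self, task_id):
        task = self._tasks[task_id]
        if not task.done():
            return "running"
        if task.cancelled():
            return "cancelled"
        return "failed" if task.exception() else "finished"

    def cancel(self, task_id):
        # The crudest form of steering: stop a job that has gone wrong.
        self._tasks[task_id].cancel()

    async def collect(self, task_id):
        return await self._tasks[task_id]


async def fake_research(topic, seconds):
    # Stand-in for a real subagent call (assumption: real work would go here).
    await asyncio.sleep(seconds)
    return f"notes on {topic}"


async def main():
    d = Delegator()
    task_id = d.delegate(fake_research("competitor pricing", 0.05))
    before = d.status(task_id)          # the main session is not blocked
    result = await d.collect(task_id)   # pick the result up when it suits you
    after = d.status(task_id)
    return task_id, before, result, after


outcome = asyncio.run(main())
```

The point of the sketch is the return value of delegate. You get an ID back, not an answer, and everything else (status, cancel, collect) hangs off that ID.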
Claude Code has been moving the same way. Subagents, background sessions, agent views, nested delegation, better handling when jobs get stuck. Underneath the feature names, the product is becoming less like a single chat thread and more like a control room for work in progress.
OpenAI is doing its version inside ChatGPT Business, Enterprise and Edu. Workspace agents are generally available. They can own workflows across tools. Builders can set safeguards. Admins can see activity and usage in the console.
None of that is sexy in the Twitter-demo sense. It is more useful than sexy.
Because once AI workers can do meaningful background work, the business question changes. It is no longer “what prompt should I use?” It becomes a longer, more honest list. What work should this agent own? What does it need to know before it starts? What tools is it allowed to touch? What does progress look like? When should it stop and ask a human? Who checks the result? What did it cost? Can we run the same workflow again next week without rebuilding the whole thing from scratch?
That is management.
Not management as in corporate theatre and meetings about meetings. Management as in making work legible, bounded and repeatable.
The best mental model I have found is boring. Treat agents like junior staff with weird superpowers.
They read fast. They draft fast. They do not get tired. They will happily do tedious jobs a human avoids. They also misunderstand vague instructions, miss context, overproduce, make confident mistakes, and sometimes wander off into a hedge.
That combination should feel familiar to anyone who has ever managed people.
You would not hire a junior marketer on Monday, give them admin access to every client account on Tuesday, and say “go and improve growth.” At least, I hope you would not. You would give them a brief. You would define the output. You would show examples. You would restrict access. You would set a review point before anything went live. You would expect some rough edges and build trust gradually.
AI agents need the same treatment, just faster and more explicit.
This is where a lot of businesses are going to make a mess. They will skip the operating layer because the demo looks easy. They will wire an agent into five tools, give it a vague growth brief, and act surprised when it produces a mountain of plausible activity with no commercial judgement behind it.
Activity is not leverage. Sometimes it is just faster waste.
There is another piece here that people avoid because it is dull. Cost.
Autonomous work still costs money. Sometimes the cost is obvious because the platform bills per token or credit. Sometimes it hides inside a subscription until the provider changes the rules.
Anthropic’s June agent billing changes are a good warning shot. Programmatic Claude Agent SDK usage, headless runs, CI jobs and third-party agent usage are being treated as a separate kind of spend. The simple test: if a session runs without a human watching each turn, treat it as autonomous usage and watch the meter.
That matters more than it sounds. A human can glance at a bad brief and ask “what do you actually want?” An agent may cheerfully spend thousands or millions of tokens trying to satisfy a vague instruction.
Bad management becomes a bill.
This is why agent management needs budgets, not just prompts. Per-task caps. Model routing. Cheaper models for basic scanning. Stronger models for judgement-heavy synthesis. Caching where it helps. Clear stop conditions. Logs a normal person can read.
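A rough sketch of what a per-task cap with model routing might look like, in Python. Everything named here is an assumption for illustration: the "small" and "large" tiers, the prices, the task kinds and the TaskBudget class are made up, not taken from any provider's pricing or SDK.

```python
from dataclasses import dataclass, field

# Hypothetical model tiers and prices per 1,000 tokens, for illustration only.
PRICE_PER_1K = {"small": 0.001, "large": 0.015}


def route_model(task_kind):
    # Cheaper model for basic scanning, stronger model for judgement-heavy work.
    return "large" if task_kind in {"synthesis", "review"} else "small"


class BudgetExceeded(Exception):
    pass


@dataclass
class TaskBudget:
    max_cost: float                      # per-task cap, in whatever currency you bill in
    spent: float = 0.0
    log: list = field(default_factory=list)

    def charge(self, model, tokens):
        cost = PRICE_PER_1K[model] * tokens / 1000
        if self.spent + cost > self.max_cost:
            # Clear stop condition: refuse the call rather than overspend.
            self.log.append(f"STOPPED: {model} call of {tokens} tokens would exceed the cap")
            raise BudgetExceeded(model)
        self.spent += cost
        self.log.append(f"{model}: {tokens} tokens, cost {cost:.3f}")
        return cost


budget = TaskBudget(max_cost=0.05)
scan_cost = budget.charge(route_model("scan"), 20_000)   # cheap model, fits the cap
try:
    budget.charge(route_model("synthesis"), 4_000)       # would push spend past the cap
    stopped = False
except BudgetExceeded:
    stopped = True
```

The log is deliberately plain strings. If the person paying the bill cannot read it, it is not doing its job.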
If that sounds less exciting than “ten agents working while you sleep”, good. It is supposed to. The sleep-working fantasy is exactly where the waste creeps in.
This shift hits marketing harder than most, because marketing is full of messy, repeated, semi-structured work.
Research a market. Find the objections. Audit a website. Pull customer language from reviews. Draft angles. Turn a long note into posts. Check search visibility. Build a landing page variant. Prepare sales follow-up. Update the content calendar. Watch competitors. Summarise what changed this week.
None of that is a single magic prompt. It is a workflow.
The old agency model sold outputs. Posts, pages, decks, campaigns, reports. AI makes outputs cheaper, which is exactly why output-only agencies are going to feel the squeeze.
The better model is an operating layer. A practical system that takes a commercial goal and moves work through research, positioning, production, publishing, follow-up, measurement and learning.
Humans still matter in that model. More, not less. The human job just moves up a level. Taste. Direction. Commercial judgement. Saying no. Spotting when a technically correct answer is strategically useless. Knowing when a client needs a clearer offer, not more content. Knowing when the agent has produced twenty options and only one has a pulse.
That is the post-agency opportunity. Not “we use AI, therefore cheaper blog posts.” More like: we build the marketing operating system that makes the whole commercial loop faster and more accountable.
If I were explaining this to a business owner, I would not start with model names. I would start with the control points.
The first is task ownership. Every agent needs a job small enough to inspect. “Improve marketing” is useless. “Audit these five landing pages against the current offer and produce ten specific fixes, with evidence” is workable.
The second is context. Agents do not magically know your business. They need the offer, the audience, the tone, the proof, the banned claims, the current assets, examples and constraints. If that context is scattered across Slack, Google Drive and one person’s head, the agent will guess. Guessing is where slop is born.
The third is permissions. Read access and write access are different. Drafting and publishing are different. Suggesting a CRM change and making a CRM change are different. Most agents should start as read-and-draft workers, not autonomous operators with keys to the kingdom.
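In code, the read-and-draft starting point can be as small as this. It is a sketch under my own assumptions: the Action names and the request function are hypothetical, and a real system would enforce this at the tool or API level rather than in a helper function.

```python
from enum import Enum


class Action(Enum):
    READ = "read"
    DRAFT = "draft"
    PUBLISH = "publish"
    CRM_WRITE = "crm_write"


# Default grant for a new agent: read and draft only.
READ_AND_DRAFT = frozenset({Action.READ, Action.DRAFT})


def request(action, granted=READ_AND_DRAFT):
    """Return 'allowed', or 'needs_human' when the action is outside the grant."""
    return "allowed" if action in granted else "needs_human"


draft_decision = request(Action.DRAFT)
publish_decision = request(Action.PUBLISH)
```

Anything outside the grant is not an error. It is a handoff to a person, which is the behaviour you want from a new hire too.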
The fourth is progress visibility. If an agent is running in the background, someone needs to know whether it is running, blocked, finished, or quietly chewing budget. Status is not a nice-to-have. It is the difference between delegation and wishful thinking.
The fifth is review. Not every output needs a board meeting. But important outputs need checks. Does it match the brief? Are the claims supported? Did it use the right source? Is it safe to send to a client? Would you put your name on it?
The sixth is cost control. No serious system should run without budgets. Token budgets, time budgets, model choices, retry limits. The boring stuff that keeps an experiment from becoming a weird invoice.
The seventh is reuse. If a workflow worked once, it should be easier to run again. That is where the value compounds. Not in a single clever prompt, but in a repeatable process that gets a little better every week.
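One way to make the seven control points hard to skip is to put them in the brief itself and refuse to run anything with gaps. The sketch below is illustrative only. The AgentBrief fields, the example values and the five-word test for a vague task are my assumptions, and that word count is a deliberately crude stand-in for a human reading the brief.

```python
from dataclasses import dataclass, field


@dataclass
class AgentBrief:
    task: str                                          # 1. task ownership
    context_docs: list = field(default_factory=list)   # 2. context
    permissions: tuple = ("read", "draft")             # 3. permissions
    status_channel: str = ""                           # 4. progress visibility
    reviewer: str = ""                                 # 5. review
    max_tokens: int = 0                                # 6. cost control
    template_name: str = ""                            # 7. reuse


def gaps(brief):
    """List the control points this brief leaves undefined."""
    checks = [
        (len(brief.task.split()) >= 5, "task"),   # crude vagueness heuristic
        (bool(brief.context_docs), "context"),
        (bool(brief.permissions), "permissions"),
        (bool(brief.status_channel), "status"),
        (bool(brief.reviewer), "review"),
        (brief.max_tokens > 0, "budget"),
        (bool(brief.template_name), "reuse"),
    ]
    return [name for ok, name in checks if not ok]


vague = AgentBrief(task="Improve marketing")
workable = AgentBrief(
    task="Audit these five landing pages against the current offer",
    context_docs=["offer.md", "tone-guide.md"],
    status_channel="#agent-status",
    reviewer="account lead",
    max_tokens=200_000,
    template_name="landing-page-audit",
)

vague_gaps = gaps(vague)
workable_gaps = gaps(workable)
```

The vague brief fails on six of the seven points and only passes permissions because the default is already read-and-draft. The workable one comes back with an empty list, which is the signal that it is safe to hand over.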
This is where the post-agency model earns its keep.
Most businesses will not build this properly by themselves. They will either dabble with chatbots or buy some bloated enterprise suite that promises governance and still leaves the real work undefined.
The gap is the practical middle. A business does not need a 90-page AI transformation deck. It needs someone to look at its actual commercial system and say the plain things. This work can be automated safely. This work needs a human checkpoint. This work is too vague and needs better inputs. This workflow is costing too much for the value it creates. This agent can draft, but not publish. This report is useless unless it feeds a sales action.
Find the work. Bound the work. Build the loop. Check the output. Improve the loop. Less theatre, more plumbing. That is the kind of system we build at Cleo, because a campaign expires and a system compounds.
The uncomfortable bit is that this makes AI less magical.
Good.
Magic is hard to sell responsibly. Systems are easier to sell, easier to improve, and easier to defend when a client asks “how does this actually work?”
The next wave of AI advantage will not belong to the people with the longest prompt library. It will belong to the people who can manage delegated machine work without losing judgement, trust or money.
Prompts still matter. But management compounds.
And right now, most companies have no idea how to manage an AI worker. That is the opening.
Jason Sibley is the founder of Cleo, a post-agency marketing and AI company. JasonVsTheNoise is where he writes about what is actually happening with AI, marketing, and how businesses should be thinking about both.