The AI model you ask for may not be the one doing the work

Published: 03/07/2026

Format: Insight

Anthropic put Claude Fable 5 back into global circulation this week, after a messy stretch of export controls, cyber safeguards and jailbreak worries.

The obvious story is the drama. Who got access, who lost it, what the US government did, what counts as a serious jailbreak, whether the new safeguards are too tight. All of that matters, and all of it will be argued about for weeks.

The more useful business lesson is smaller and a lot more boring.

The model you request may not be the model that answers.

That one sentence should bother anyone trying to build AI into real work.

Anthropic’s own write-up says Fable 5 runs safety classifiers on cybersecurity-related requests, and that the safety margin is wider than on previous models. Wider margin means some perfectly benign work gets blocked out of caution. The classifier is described as one layer among several: access controls, safety training, offline monitoring. Sensible engineering. Also operationally messy.

Digital Watch summed the redeployment up cleanly. Fable 5 is back worldwide, its bigger sibling stays restricted to approved US organisations for defensive cyber work, and the reported jailbreak led to tighter classifiers plus a proposed industry framework for scoring how bad a cyber jailbreak actually is.

Digital Applied took it one step further into the practical. Their read is that a stricter cyber classifier will flag ordinary coding, infrastructure and debugging work more often than before. And when a request gets flagged, it can end up handled differently from the way the operator assumed.

Forget the model names for a second, because they change every month anyway. Here is the part that actually matters for a business.

A production AI workflow now has to treat model substitution as a normal condition. Not an outage. Not a rare glitch. Part of the weather.

That changes how you should think about AI inside a company.

Plenty of businesses are still stuck at the shopping stage. Which model is smartest. Which writes the best copy. Which codes fastest. Which has the biggest benchmark number this quarter. Fine. Pick good tools. That is not the hard part anymore.

Once AI moves into an actual workflow, preference stops being enough. You need to know what happened. Which model was requested. Which one served the answer. Whether there was a refusal. Whether there was a fallback. Whether the cost shifted. Whether the answer came down the normal path or a slower, safer, more restricted one. Whether the agent retried, escalated, or just quietly carried on as if nothing had changed.
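To make that concrete, here is a minimal sketch of one log record per model call. It assumes your provider or gateway tells you which model actually served the response; every field name, the workflow name and the model names below are hypothetical placeholders, not any vendor's real API.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CallRecord:
    """One line of evidence per model call (all field names are illustrative)."""
    workflow: str
    requested_model: str
    served_model: str
    refused: bool = False
    fallback_used: bool = False
    retries: int = 0
    cost_usd: float = 0.0
    latency_ms: int = 0

    @property
    def substituted(self) -> bool:
        # True when the model that answered is not the one that was asked for.
        return self.served_model != self.requested_model


def to_log_line(record: CallRecord) -> str:
    """Serialise the record as one JSON line, including the derived flag."""
    payload = asdict(record)
    payload["substituted"] = record.substituted
    return json.dumps(payload, sort_keys=True)


# Hypothetical call: "model-a" was requested, "model-b" answered after a retry.
record = CallRecord(
    workflow="inbox-triage",
    requested_model="model-a",
    served_model="model-b",
    fallback_used=True,
    retries=1,
    cost_usd=0.004,
    latency_ms=2300,
)
print(to_log_line(record))
```

The point is not the exact schema. It is that "requested" and "served" are two separate fields, so a substitution shows up as data instead of as a hunch.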

If you cannot answer those, you do not really have an AI system. You have a chatbot with nicer stationery.

This gets serious the moment agents stop being demos and start touching paid work. Lead routing. Inbox triage. Quote drafting. Code review. Product-data checks. Support tickets. Campaign QA. Finance summaries. Security reviews.

There was a good line in the startup-ideas note doing the rounds this week: agent businesses should sell a job disappearing, not another SaaS seat. I mostly agree. But if you sell a job being removed, you inherit responsibility for the shape of that job. A human doing the work can tell you when they had to use judgement, ask for help, check a policy, or bump something up to a manager. A serious AI workflow needs the same honesty. Not on a “human in the loop” slide. In the logs.

The same thread runs through the Hermes v0.18 release. The interesting part there is not the benchmark chest-beating. It is judgement loops, proof-of-work, learned procedures, and background agents you can check against a definition of done. That is the direction real AI work is heading. Less “trust me, boss, it’s finished,” more evidence that the thing ran, passed its checks, and stayed inside its boundary.
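A rough sketch of what "checked against a definition of done" can mean in practice. This is not how Hermes does it; the check names, the result fields and the allowed-tool list are all assumptions made up for illustration.

```python
def check_done(result, checks):
    """Run every named check and keep the per-check outcome as evidence."""
    evidence = {name: bool(check(result)) for name, check in checks.items()}
    return all(evidence.values()), evidence


# Hypothetical boundary: the only tools this agent is allowed to touch.
ALLOWED_TOOLS = {"read_repo", "run_tests"}

# Hypothetical definition of done for a code-review agent.
checks = {
    "produced_output": lambda r: bool(r["output"]),
    "tests_passed": lambda r: r["tests_failed"] == 0,
    "stayed_in_boundary": lambda r: set(r["tools_used"]) <= ALLOWED_TOOLS,
}

# A run that looks finished but used a tool outside its boundary.
result = {
    "output": "patch.diff",
    "tests_failed": 0,
    "tools_used": ["read_repo", "run_tests", "deploy"],
}

done, evidence = check_done(result, checks)
print(done, evidence)
```

Here the agent produced output and passed its tests, but it also called a tool it was not cleared to use, so the job does not count as done. The evidence dictionary says exactly which check failed.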

Which is exactly why the safety conversation cannot stay parked in the ethics-deck corner of the business.

Safety filters are now part of product behaviour. They affect speed, output, cost, reliability and customer experience. If a classifier is tighter this week than last, your workflow can change without your prompt changing a single word. If a vendor adjusts access rules, your workflow can change without your team touching anything. If a model falls back mid-task, your quality checks need to be good enough to notice the difference.
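One way to notice that kind of silent change is to compare refusal and fallback rates between two periods. The sketch below assumes each call was logged as a dictionary with boolean "refused" and "fallback_used" keys; those keys and the five-point threshold are illustrative choices, not a standard.

```python
def rate(records, key):
    """Share of records where the given boolean field is true."""
    if not records:
        return 0.0
    return sum(1 for r in records if r[key]) / len(records)


def drift_alerts(previous, current, threshold=0.05):
    """Flag any tracked rate that rose by more than `threshold` between periods."""
    alerts = []
    for key in ("refused", "fallback_used"):
        change = rate(current, key) - rate(previous, key)
        if change > threshold:
            alerts.append(f"{key} rate rose by {change:.0%}")
    return alerts


# Hypothetical data: same prompts both weeks, but two refusals appear in week two.
last_week = [{"refused": False, "fallback_used": False} for _ in range(10)]
this_week = [{"refused": i < 2, "fallback_used": False} for i in range(10)]

print(drift_alerts(last_week, this_week))
```

Nothing in the prompt changed between the two weeks. The only signal that the stack underneath moved is the rate itself, which is why it has to be measured.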

None of that is a reason to avoid AI. It is a reason to build like an adult.

For a business owner, the practical response is not to memorise every model release. That way lies madness, and probably eleven browser tabs you will never read again. The practical response is to put an operating layer around the work.

For each AI workflow, decide up front: what starts the job, what sources it is allowed to use, which tools it can touch, what it is allowed to do on its own, when it needs a human to approve, what counts as a clean finish, what gets logged, what happens when the preferred model refuses or falls back, and who actually reads the failures.
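That list can live in a document, but it is easier to enforce when it lives next to the workflow itself. A minimal sketch, assuming one spec object per workflow; the field names simply mirror the list above and the example values are invented.

```python
from dataclasses import dataclass, field, fields


@dataclass
class WorkflowSpec:
    """The up-front decisions for one AI workflow (names are illustrative)."""
    name: str
    trigger: str = ""
    allowed_sources: list[str] = field(default_factory=list)
    allowed_tools: list[str] = field(default_factory=list)
    autonomous_actions: list[str] = field(default_factory=list)
    approval_required_for: list[str] = field(default_factory=list)
    definition_of_done: str = ""
    logged_fields: list[str] = field(default_factory=list)
    on_refusal_or_fallback: str = ""
    failure_owner: str = ""


def undecided(spec: WorkflowSpec) -> list[str]:
    """Return the decisions still left blank.

    In this sketch an empty value means "nobody has decided yet", so a
    deliberate "nothing allowed" should be written out, e.g. ["none"].
    """
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


# Hypothetical workflow with two decisions still missing.
spec = WorkflowSpec(
    name="quote-drafting",
    trigger="new enquiry form",
    allowed_sources=["price-list"],
    allowed_tools=["crm.read"],
    autonomous_actions=["draft quote"],
    approval_required_for=["send quote"],
    definition_of_done="draft saved and reviewer notified",
    logged_fields=["requested_model", "served_model"],
)
print(undecided(spec))
```

In this example nobody has decided what happens on a refusal or fallback, or who reads the failures. Those tend to be the two questions that get skipped.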

It is a boring list. It is also the whole difference between “we use AI” and “we can rely on this workflow.”

Marketing teams need this as much as engineering does. Arguably more. If an agent is drafting outbound, checking product pages, summarising calls, or spinning up campaign variants, you still need to know where the facts came from, which model handled the work, whether it used approved claims, whether it touched anything live, and whether a human signed off the risky parts.

The agency world will be tempted to turn all of this into theatre. More dashboards. More magical agent names. More promises to replace three people by Friday. I would go the other way. Sell the discipline. Sell the audit trail. Sell the approval boundary. Sell the proof that the agent did not just produce something plausible, but worked from the right sources, followed the rules, logged its route, and stopped where it was supposed to stop.

AI speed is useful. AI autonomy is useful in carefully chosen places. But the client does not buy speed on its own. They buy fewer mistakes, faster response, cleaner decisions, better follow-up, and less management fog.

Here is the uncomfortable bit. The better these models get, the more the operational discipline matters, not less. A weak model forces you to supervise it because it is obviously flaky. A strong model is more dangerous, because it can sound finished long before the system around it is ready.

The Fable 5 story is a useful warning, and not because Anthropic did anything uniquely wrong. If anything, publishing the detail is helpful. The warning is that the stack underneath your workflow is becoming dynamic, governed and conditional. It can change because of safety policy, access policy, a jailbreak response, routing, cost controls, or a vendor decision made in a room you will never see.

So the next serious AI question is not “which model should we use?”

It is whether your workflow can survive the model changing underneath it.

If the answer is no, congratulations. You have just found the work.

Jason Sibley is the founder of Cleo, a post-agency marketing and AI company. JasonVsTheNoise is where he writes about what is actually happening with AI, marketing, and how businesses should be thinking about both.