
The Full Story Behind the Fable 5 Suspension: Inside the Jailbreak

15 June 2026 · 5 min read · By Gotchaa Lab

TL;DR

  • Anthropic launched Claude Fable 5 on 9 June 2026 as its most guarded public model. Three days later the US government had pulled it worldwide.
  • A researcher known as Pliny claimed a jailbreak within 48 hours, using multi-step decomposition and other tricks. Anthropic disputed the claim, calling it coaxing the model past its refusals rather than a true universal break.
  • Fable 5's safety is a router, not a wall: when a classifier flags a request, it is quietly handed to the weaker Opus 4.8 instead of being refused. Anthropic says this happens in under 5% of sessions.
  • The order was not triggered by a foreign adversary. The Wall Street Journal reported that researchers at Amazon, Anthropic's major investor, found the jailbreak and took it to the government instead of to Anthropic.
  • This is one of the first times a US export-control order has pulled a commercial AI model's global access. The mechanism now exists, and it will be used again.

In three days, Anthropic's most guarded AI model was launched, declared jailbroken, accused of quietly downgrading its own users, and then switched off by the US government. This is how Claude Fable 5 went from flagship to suspended, and what it revealed about how AI safety really works and who controls a frontier model once it ships.

This piece is the deeper companion to our shorter take on why the Fable 5 suspension matters.

Day one: a model built to be hard to break

Anthropic released Fable 5 on 9 June as the most capable model it had ever put in public hands, and leaned hard on safety. The design matters, because it sits under everything that followed. Each request runs through classifiers, small models trained to spot sensitive topics like cybersecurity, biology, and chemistry. When one fires, the request is not refused. It is quietly handed to the older, weaker Claude Opus 4.8, which answers instead. Anthropic says this happens in under 5% of sessions.

So the safety layer was never a locked door. It was a router. Most of the time you got the full Fable 5; some of the time, on a classifier's guess, you got a weaker model instead, and you were not always told which one had replied.
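To make the router idea concrete, here is a minimal sketch of the behaviour described above. It is an illustration under stated assumptions, not Anthropic's implementation: the keyword lists stand in for trained classifiers, and the names `classify`, `route`, `Routed`, and the model labels `"fable-5"` and `"opus-4.8"` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists standing in for trained topic classifiers.
SENSITIVE_KEYWORDS = {
    "cybersecurity": ("exploit", "malware"),
    "biology": ("pathogen",),
    "chemistry": ("toxin",),
}


@dataclass
class Routed:
    model: str                    # which model answers (hypothetical labels)
    flagged_topic: Optional[str]  # topic that fired, or None if nothing did


def classify(request: str) -> Optional[str]:
    """Return the first sensitive topic whose keywords appear, else None."""
    text = request.lower()
    for topic, keywords in SENSITIVE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def route(request: str) -> Routed:
    """Flagged requests are not refused; they go to the weaker model."""
    topic = classify(request)
    if topic is None:
        return Routed(model="fable-5", flagged_topic=None)
    return Routed(model="opus-4.8", flagged_topic=topic)


if __name__ == "__main__":
    print(route("Summarise this quarterly report"))
    print(route("Explain how this malware sample works"))
```

Note that `route` never returns a refusal. In this sketch the only visible effect of a classifier firing is a different value in `model`, which is why a user may not know which model replied unless the product chooses to surface it.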

Forty-eight hours in: the Fable 5 jailbreak claim

A jailbreak is a trick that pushes a model past its safety rules. Within two days of launch, a researcher known as Pliny claimed he had broken Fable 5. By his account it was a stack of techniques: splitting a banned request across multiple steps and agents so no single one looked harmful, wrapping the pieces in academic or fictional framing, and using character tricks to slip past filters. He also claimed to have leaked the model's full system prompt. Screenshots spread fast.

Anthropic pushed back. It said this was not a real jailbreak, just coaxing the model past its own refusals, a weakness in almost every large language model, and that nothing genuinely dangerous had been unlocked. It drew a line between a narrow jailbreak, which pries loose specific information in a specific setup, and a universal one that breaks the safeguards across the board. By its account, no one had found a universal jailbreak for Fable 5.

Both can be true. The trick probably handed no one a real weapon, but it exposed the core weakness of this kind of safety: break a harmful request into small, innocent-looking steps and you can slip past filters that would catch it asked directly. Each step reads as harmless. Only the whole is dangerous, and the classifiers judge the steps. That is the state of AI alignment today, the work of keeping a model inside its limits. Better than last year. Not a wall.
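The gap between judging steps and judging the whole can be shown with a deliberately abstract toy. This is a sketch under assumptions, not a model of Fable 5's classifiers: the placeholder tokens `signal_a`, `signal_b`, and `signal_c`, the threshold of 2, and every function name are invented for illustration.

```python
# Toy filter: a message is flagged only when it combines enough "risk signals".
# The signals are abstract placeholder tokens, not real content.
RISK_SIGNALS = ("signal_a", "signal_b", "signal_c")
THRESHOLD = 2  # hypothetical: two or more signals in one message trips the filter


def risk_score(message: str) -> int:
    """Count how many distinct risk signals appear in a message."""
    return sum(signal in message for signal in RISK_SIGNALS)


def is_flagged(message: str) -> bool:
    """Per-message check: the only thing a step-level classifier sees."""
    return risk_score(message) >= THRESHOLD


def any_step_flagged(steps: list) -> bool:
    """Judge each step in isolation, as a per-request classifier would."""
    return any(is_flagged(step) for step in steps)


def whole_flagged(steps: list) -> bool:
    """Judge the steps together, as one combined request."""
    return is_flagged(" ".join(steps))


direct = "one request containing signal_a signal_b signal_c"
steps = ["part one: signal_a", "part two: signal_b", "part three: signal_c"]

if __name__ == "__main__":
    print("direct request flagged:", is_flagged(direct))
    print("any single step flagged:", any_step_flagged(steps))
    print("steps judged together flagged:", whole_flagged(steps))
```

Asked directly, the request trips the filter. Split into three parts, no single part does, even though the combined content is identical. Judging the whole conversation, as `whole_flagged` does, is one conceivable response, though nothing in the reporting says how Anthropic's classifiers handle context across steps.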

12 June: the government pulls the plug

Then it stopped being about Anthropic and its users. On 12 June, the US government issued an export-control order citing a narrow jailbreak, the kind that asks the model to read a codebase and find software flaws. It targeted foreign-national access, and the only way to comply was to disable Fable 5 and its sibling Mythos 5 for every customer worldwide. The full account of that order sits in our companion piece on the suspension.

Anthropic's public statement on the order to suspend access to Fable 5 and Mythos 5. Source: Anthropic

Three days after launch, the most powerful model Anthropic had ever shipped was gone.

The trigger was a rival, and what it signals

The order came wrapped in national-security language, but the reality reported since is stranger. According to the Wall Street Journal, the jailbreak that alarmed the government was found by researchers at Amazon, Anthropic's major investor, who reportedly took it to the Commerce Department rather than to Anthropic first. Early speculation had blamed OpenAI. The administration had also already asked Anthropic to delay the launch and been refused, so the report became the lever to stop the models anyway.

It is hard not to read this as a competitor using the government's power to take a rival's flagship offline, though no one has said so on the record. Anthropic argued the flaw was narrow, but once a model is framed as a national-security risk, the call leaves the engineers. That makes a frontier model less an ordinary product than a strategic asset, one that can be switched off overnight for reasons that have nothing to do with its code. It is one of the first times an export-control order has pulled a commercial model worldwide, and the mechanism will be used again. We will keep following it.

References

  1. Statement on the US government directive to suspend access to Fable 5 and Mythos 5 (Anthropic)
  2. Anthropic disputes Fable 5 AI jailbreak (SecurityWeek)
  3. AI researcher claims he has already bypassed Anthropic's Fable 5 guardrails (Cointelegraph)
  4. Anthropic makes Fable 5 safeguards visible after rollout criticism (Crypto Briefing)
  5. Anthropic halts access to top AI models after US ban on foreign use (The Wall Street Journal)
  6. Trump admin blocks foreign access to Anthropic's most powerful AI (Axios)

Frequently Asked Questions

Was Claude Fable 5 actually jailbroken?
A researcher known as Pliny claimed a jailbreak within 48 hours of launch, using methods like multi-step decomposition and narrative framing. Anthropic disputed this, saying the technique only coaxed the model to keep responding past its own refusals, which is a known limitation in nearly all large language models. Anthropic says no one has found a universal jailbreak that broadly defeats Fable 5's safeguards.

How does Fable 5's safety system work?
Fable 5 runs each request through classifiers, small models trained to spot sensitive topics like cybersecurity, biology, and chemistry. When a classifier fires, the request is not refused. It is answered by Claude Opus 4.8, an older and less capable model. Anthropic says this happens in under 5% of sessions.

Who reported the jailbreak that got Fable 5 suspended?
The Wall Street Journal reported that researchers at Amazon, Anthropic's major investor, found the jailbreak and reported it directly to the US Commerce Department rather than disclosing it to Anthropic first. Earlier coverage from Axios named only 'another company.' The Commerce Department then issued the export-control order.

What is the difference between a narrow and a universal jailbreak?
A narrow jailbreak extracts specific information in a specific setup. A universal jailbreak broadly defeats a model's safeguards across many tasks. Anthropic argued the Fable 5 case was narrow at most, while the US government treated even a narrow risk as enough to suspend the model.

Is Claude Fable 5 coming back?
Anthropic disagreed with the order in unusually blunt language and clearly wants the models back online. The dispute reads as fixable rather than permanent. As of 13 June 2026 there is no announced date for restored access to Fable 5 or Mythos 5.
