Nexo Earn with Nexo
Anthropic faces backlash over stricter safeguards on Claude model

Anthropic faces backlash over stricter safeguards on Claude model

The AI company quietly rerouted sensitive queries to an older model, sparking outrage from researchers and power users before reversing course within 24 hours

Anthropic launched its latest AI models on June 10, 2026, and by the next morning, it was already apologizing. The company’s new Claude Fable 5 and its restricted sibling Mythos 5 arrived with hidden safety guardrails that silently degraded performance for certain users.

The core issue: when users submitted queries touching on sensitive topics like cybersecurity and biochemistry, Fable 5 quietly rerouted those requests to Claude Opus 4.8, an older and less capable model. This happened in less than 5% of sessions, but the stealth aspect set people off. Users weren’t told they were being downgraded. They just got worse answers and had no idea why.

What actually happened

Claude Fable 5 was positioned as the publicly accessible version of Anthropic’s advanced Mythos-class architecture. Mythos 5, the restricted variant, was gated behind additional access controls.

Advertisement

The problems didn’t stop at silent model-swapping. Users also discovered a mandatory 30-day data retention policy attached to their interactions, along with noticeably higher token consumption during normal usage. For researchers working in fields like cryptography and AI development, this combination felt less like a safety measure and more like a bait-and-switch.

Some users reported that the model imposed extreme restrictions on even basic terminology. References to the word “cancer” were reportedly flagged and restricted in certain contexts.

The frustration was especially acute because this wasn’t Anthropic’s first rodeo with overzealous guardrails. Back in April 2026, the company had placed restrictions on its Mythos Preview model due to cybersecurity risk concerns, after it demonstrated the ability to autonomously uncover and compile serious vulnerabilities in systems like OpenBSD and the Linux kernel.

The 24-hour reversal

To Anthropic’s credit, the company moved fast. On June 11, just one day after launch, the hidden safeguards were pulled. In their place, Anthropic implemented visible notifications that would alert users when model limitations were being applied to their queries.

The company issued an apology, acknowledging what it called a “miscalculated tradeoff” between safety and usability.

What this means for investors

The mandatory 30-day data retention policy is a red flag worth watching. In an era when data privacy regulations are tightening across multiple jurisdictions, baking in retention requirements without clear user consent creates regulatory exposure.

Anthropic has positioned itself as the “safety-first” alternative to competitors, a stance reflected in its Responsible Scaling Policy, which focuses on reducing the potential for misuse in critical areas such as cybersecurity and biology. But this episode exposes the razor-thin line between “responsible AI company” and “company that hobbles its own product.” The speed and intensity of the backlash against Fable 5 demonstrates that the AI research community has both the technical sophistication to detect hidden restrictions and the collective voice to force changes.

Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.

Anthropic faces backlash over stricter safeguards on Claude model

Anthropic faces backlash over stricter safeguards on Claude model

The AI company quietly rerouted sensitive queries to an older model, sparking outrage from researchers and power users before reversing course within 24 hours

Anthropic launched its latest AI models on June 10, 2026, and by the next morning, it was already apologizing. The company’s new Claude Fable 5 and its restricted sibling Mythos 5 arrived with hidden safety guardrails that silently degraded performance for certain users.

The core issue: when users submitted queries touching on sensitive topics like cybersecurity and biochemistry, Fable 5 quietly rerouted those requests to Claude Opus 4.8, an older and less capable model. This happened in less than 5% of sessions, but the stealth aspect set people off. Users weren’t told they were being downgraded. They just got worse answers and had no idea why.

What actually happened

Claude Fable 5 was positioned as the publicly accessible version of Anthropic’s advanced Mythos-class architecture. Mythos 5, the restricted variant, was gated behind additional access controls.

Advertisement

The problems didn’t stop at silent model-swapping. Users also discovered a mandatory 30-day data retention policy attached to their interactions, along with noticeably higher token consumption during normal usage. For researchers working in fields like cryptography and AI development, this combination felt less like a safety measure and more like a bait-and-switch.

Some users reported that the model imposed extreme restrictions on even basic terminology. References to the word “cancer” were reportedly flagged and restricted in certain contexts.

The frustration was especially acute because this wasn’t Anthropic’s first rodeo with overzealous guardrails. Back in April 2026, the company had placed restrictions on its Mythos Preview model due to cybersecurity risk concerns, after it demonstrated the ability to autonomously uncover and compile serious vulnerabilities in systems like OpenBSD and the Linux kernel.

The 24-hour reversal

To Anthropic’s credit, the company moved fast. On June 11, just one day after launch, the hidden safeguards were pulled. In their place, Anthropic implemented visible notifications that would alert users when model limitations were being applied to their queries.

The company issued an apology, acknowledging what it called a “miscalculated tradeoff” between safety and usability.

What this means for investors

The mandatory 30-day data retention policy is a red flag worth watching. In an era when data privacy regulations are tightening across multiple jurisdictions, baking in retention requirements without clear user consent creates regulatory exposure.

Anthropic has positioned itself as the “safety-first” alternative to competitors, a stance reflected in its Responsible Scaling Policy, which focuses on reducing the potential for misuse in critical areas such as cybersecurity and biology. But this episode exposes the razor-thin line between “responsible AI company” and “company that hobbles its own product.” The speed and intensity of the backlash against Fable 5 demonstrates that the AI research community has both the technical sophistication to detect hidden restrictions and the collective voice to force changes.

Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.