Anthropic faces backlash over stricter safeguards on new Claude Mythos 5 model
Advanced users push back against token inefficiency, mandatory data retention, and a fallback mechanism that routes sensitive queries to a less capable model
Anthropic’s newest AI models arrived this week with impressive capabilities and an equally impressive set of restrictions. The company launched Claude Fable 5 and its more powerful sibling, Claude Mythos 5, on June 9-10, and the AI research community’s honeymoon period lasted roughly 24 hours before the complaints started rolling in.
The core issue: Anthropic built enhanced safety classifiers into the Mythos-class models that actively manage how the system handles sensitive queries, particularly in cybersecurity and biochemistry. When the classifiers flag a request in those domains, the system quietly routes it to an older, less capable model called Claude Opus 4.8.
What triggered the backlash
Anthropic says the fallback mechanism kicks in during less than 5% of sessions. That sounds small until you’re a security researcher or a developer working on cryptographic tools and you hit that 5% wall repeatedly. For advanced users who rely on these models for specialized coding and research tasks, even occasional downgrades in capability can break entire workflows.
The Mythos-class models also launched with a mandatory 30-day data retention policy, meaning Anthropic holds onto session data for a full month. For researchers handling proprietary code or sensitive experimental data, that policy feels less like a safety measure and more like a liability.
The third complaint centers on token inefficiency. Users reported that the new safety classifiers consume significantly more tokens than expected, burning through more of your usage allowance just processing internal safety checks before they even get to your actual question. The backlash materialized almost immediately, with reports of user frustration surfacing on June 10-11, with the AI research community criticizing excessive overhead and a lack of transparency around how the safeguards work.
The cybersecurity angle that complicates everything
Anthropic didn’t impose these restrictions arbitrarily. During an April 2026 preview of the Mythos-class models, testers identified zero-day vulnerabilities in cryptography libraries, the kind that could directly affect crypto infrastructure. Anthropic has been developing its Responsible Scaling Policy and ASL safety framework specifically to address scenarios where advanced AI models could enable severe cybersecurity damage. The Mythos preview essentially validated those concerns in real time.
Anthropic responded to the wave of criticism quickly. The company announced changes aimed at increasing the visibility of its safety measures, essentially promising to show users more about when and why the classifiers intervene. That’s a step toward transparency, though it doesn’t address the underlying capability restrictions or the data retention policy that researchers found objectionable.
What this means for the broader AI and crypto landscape
For the crypto industry specifically, the discovery of zero-day vulnerabilities in cryptography libraries during the Mythos preview should raise some eyebrows. If an AI model in preview can identify previously unknown weaknesses in the cryptographic foundations that secure DeFi protocols, that’s both a powerful tool and a serious threat vector.
The mandatory data retention policy adds another layer of complexity for crypto-native users. Blockchain developers and security auditors often work with sensitive protocol information that they don’t want sitting on a third party’s servers for 30 days.
Earn with Nexo