The "Secret Sabotage" Fiasco, and other controversies

June 13, 2026

Even before got shutdown Fable 5 had attracted many controversies already.

Buried in Fable 5's 319-page system card: a safeguard that silently degraded outputs when the model detected you were using it for frontier LLM development. No notification, no refusal, just quietly worse answers for an estimated 0.03% of traffic, partly justified on national-security grounds. Researchers did not take it well. Dean Ball's name for it stuck: "secret sabotage". Nathan Lambert called it "anti-science" and Hugging Face's Arthur Zucker publicly pulled his usage. The consensus take: an invisible quality downgrade is worse than a refusal, because you can't trust any output anymore.

Two days later Anthropic reversed course and flagged requests now visibly fall back to Opus 4.8, but this didn't end the trust problem. Fable 5 also ships with a new Mythos-class data policy: 30-day retention of prompts and outputs on every platform, up to two years for safety-flagged content and no zero-data-retention carve-out.

Consequences so far: Microsoft restricted employee use of Fable 5 while its lawyers review the policy, ARC Prize declined to run verified ARC-AGI evals (so GPT-5.5's 85.0% keeps that crown by forfeit), and GDPR-bound European organisations are effectively locked out.

The frontier now differentiates on terms of service.

Sources: Fortune (apology), Fortune (original backlash), Decrypt, Simon Willison, Reuters (Microsoft), retention policy, ARC Prize on X

React:

Comments

Loading comments…

Stay Updated

Get the latest AI insights delivered to your inbox. No spam, unsubscribe anytime.