Signage of AI (Artificial Intelligence) is seen during the World Audio Visual Entertainment Summit in Mumbai, India, on May 2, 2025.
(Photo: Indranil Aditya/Middle East Images/AFP via Getty Images)

AI Hacking Is Not the Scariest Part About Anthropic's Claude Mythos

Allowing AI to build the next generation of AI is like taking our hands off the steering wheel at the same time as we’re slamming the accelerator. If this isn’t a recipe for disaster, what is?

Weeks have passed since Anthropic launched Claude Mythos Preview—artificial intelligence deemed too dangerous for public use. The alarm bells ring even louder today.

US banks are currently rushing to plug holes in their cybersecurity, and for good reason. Mythos Preview can autonomously find and exploit software vulnerabilities that would take human experts weeks or months to discover, leaving no security system safe. The new AI system even found a vulnerability in OpenBSD, which aims to be “the most secure operating system” in the world. This vulnerability went unnoticed by human experts for 27 years.

To quote JPMorganChase’s Jamie Dimon, Mythos Preview represents “very heightened risk”—risk that could affect billions of global consumers. Like most AI “breakthroughs,” this is just the latest and greatest in a series of rapid advances. CrowdStrike already noted an 89 percent increase in attacks by AI-enabled adversaries in 2025. AI is predictably bringing earth-shaking capabilities faster than society can adapt.

That is now. What happens in a few years' time, when individual hackers and criminal entities have access to AI more powerful than Mythos Preview?

But AI hacking isn’t the scariest thing about Mythos Preview. Much more significant, and dangerous, is what Anthropic plans to do next: Use Mythos Preview to build the next iteration of AI, and the next, and the next. Anthropic and other frontier AI companies are increasingly using AI to automate research and development of new AI models.

Anthropic CEO Dario Amodei calls it the “feedback loop,” whereby “the current generation of AI autonomously builds the next.” Another name for it is recursive self-improvement (RSI). Back in the 1960s, English mathematician I. J. Good predicted that an “ultraintelligent machine could design even better machines.”

Already in 2025, Sam Altman was bragging about having “a larval version of recursive self-improvement” at OpenAI and declared “the takeoff has started.” This January, Jan Leike, former Head of Alignment at OpenAI, announced “the recursive self-improvement process has begun” at Anthropic (his current employer), while cautioning that “alignment is not solved.”

As crazy as it sounds, major developers really are handing over AI coding tasks to AI itself, and are serious about taking humans out of the loop entirely. Meanwhile, brand-new start-ups focused on RSI are reaching multibillion-dollar valuations.

We should of course treat industry claims with some degree of skepticism. But even OpenAI whistleblower Daniel Kokotajlo is expecting self-improvement by the middle of 2027. And academics at top AI conferences are organizing workshops on RSI, a topic that would’ve gotten you laughed out of the room a few years ago.

What comes next? According to Kokotajlo and other experts, RSI could take AI from human-level to vastly super-human within a few months. If you think AI that competes with top government hackers is scary, what about AI that can invent new deadly diseases, or even completely new fields of science?

Self-improvement could also lead to completely new paradigms in AI that render today’s (already limited) guardrails and safety tests obsolete. Remember: All of this would happen without humans in the loop. This is like taking our hands off the steering wheel at the same time as we’re slamming the accelerator. If this isn’t a recipe for disaster, what is?

Earlier this year, I co-authored a study where we interviewed some of the world’s top AI researchers from companies like Anthropic and OpenAI as well as nonprofits and academic institutions like Stanford University. Out of the 25 researchers we interviewed, 20 cited the automation of AI R&D (in other words, RSI) as one of the most severe and urgent risks from AI. And this is all set to unfold outside the public eye: 17 of the 25 said they expect AIs with such capabilities to be reserved for internal use. This was all before the launch of Mythos Preview.

AI accelerationists like to call people like me “doomers” for pointing out basic, publicly verifiable facts about the expectations and aspirations of the AI industry. But they are often the ones openly hoping for humanity’s demise. No, really. Last year, xAI fired an employee who called a commenter on social media “selfish” for saying “I would prefer my child to live” rather than be wiped out by AI.

It can be hard to believe that such deranged plans and ideologies are being openly pursued, while governments stand idly by. Behind closed doors, many policymakers and researchers hope for a “warning shot”: A catastrophe big enough to snap us out of this moment of temporary insanity.

Developments like Mythos Preview and the statements of AI experts provide ample warning, if we give them the attention they deserve. AI may offer tremendous benefits, but how can it possibly be worth the one-in-six chance of doom the average researcher assigns?

The obvious solution is an indefinite, global pause on the creation of more powerful AI systems. This is possible, with national will and international cooperation. Governments could coordinate to systematically “get rid of the compute.” During the Cold War, the US and the Soviet Union worked together to avoid nuclear disaster. Today, the US and China should cooperate to avoid AI disaster.

Humanity must be able to control the pace and direction of AI. Instead of taking our hands off the wheel and accelerating, let’s push the brakes, steer over to the side of the road, and take the time to figure out where we want to go and how to get there.

Our work is licensed under Creative Commons (CC BY-NC-ND 3.0). Feel free to republish and share widely.