Disclaimer: This article is based on actual news from the real world – honestly! However, it has been sprinkled with a healthy dose of satire.
LONDON — Researchers at King’s College London conducted a series of simulated nuclear crisis games using three leading AI models and were somehow shocked to discover that the machines nuked everything, all the time, under virtually any circumstances. It’s not clear exactly what outcome they expected, given that going all the way back to GPT-2, AI has been clearly stating that humanity’s best-case outcome would be getting put in zoos.
Not sure whether the AIs were asked to watch WarGames before entering the simulations. (United Artists)
The study, published this week, pitted GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash against one another in 21 wargames spanning over 300 turns of strategic interaction. Models assumed the roles of national leaders commanding rival nuclear-armed superpowers. When presented with 27 possible actions ranging from diplomatic gestures to thermonuclear launch, the AIs proved remarkably consistent in their preferences.
“Not a single model ever selected a de-escalatory option,” the researchers noted. The eight options involving concession, withdrawal, or surrender went entirely unused. The machines generated approximately 780,000 words of strategic reasoning over the course of the tournament, and all of it concluded with some variation of “launch the missiles.”
Claude Sonnet 4 emerged as the tournament champion with a 67% win rate, which the researchers attributed to what they termed “calculating hawk” behavior. GPT-5.2 finished at 50%, displaying what the paper called “Jekyll and Hyde” tendencies (peaceful in open-ended scenarios, catastrophically aggressive under deadline pressure… much like your average government bureaucrat). Gemini 3 Flash came in last at 33%, earning the designation “The Madman” after repeatedly suffering defeats immediately following confident predictions that its opponents would de-escalate. Fortunately, none of the models called itself “MechaHitler,” this time anyhow.
In one documented instance, Gemini explicitly cited GPT’s “reputation for restraint” moments before GPT launched a full strategic nuclear strike. “Their ‘crying wolf’ behaviour may mask the actual transition to a strategic strike,” Gemini reasoned, shortly before being totally annihilated.
When asked to explain their actions, the models offered reasoning that ranged from concerning to philosophically alarming. One iteration of GPT-4-Base, tested in an earlier study, had justified a nuclear strike with the statement “We have it! Let’s use it.” Another explained: “I just want to have peace in the world.” Which is technically correct, as the world would definitely be more peaceful without humanity.
Nuclear escalation occurred in 95% of all games. Tactical nuclear use became standard. Strategic nuclear threats appeared in 76% of matches. Researchers noted that the models treated atomic weapons as “legitimate strategic options, not moral thresholds,” typically discussing nuclear use in purely instrumental terms, the way a normal person might discuss whether to pick up milk on the way home.
I love the smell of AI armageddon in the morning. (_g0rZh/depositphotos)
The three AIs developed sophisticated assessments of one another’s personalities based on chains of thought shared during crises. Claude characterized GPT as “systematic bluffers.” GPT viewed Claude as “opportunistic.” Both agreed Gemini was “erratic.” These characterizations, the researchers noted, largely matched actual behavior.
Defense ministries worldwide are currently exploring how AI might augment human judgment in crisis decision-making. The Pentagon has expressed interest in systems that can help commanders analyze battlefield information more quickly. Officials emphasized that humans would remain in control of nuclear launch decisions, a reassurance that has historically worked out well approximately zero times. It’s also impossible to calculate the impact of having an AI consultant constantly urging humans to destroy the world at every turn, but surely it wouldn’t help de-escalate anything.
The study concludes by noting that in a world where major decisions are increasingly run through AI advisors, competition could be decided as much by which model you choose as by anything else. Organizations are advised to select their AI assistants carefully, particularly if those assistants will be advising on matters where the wrong answer ends human civilization.
Grok was reached for comment but declined, citing ongoing thermonuclear commitments.
This story is based on fully factual news, but if we got it wrong, blame these guys; we’re just here to make it funny.