Bad idea

When AI Goes to War, It Reaches for the Nuclear Solution

New research shows frontier language models escalate to nuclear use in nearly every simulated crisis — and never once back down

Researchers at King’s College London recently ran an experiment. They gave three of the world’s most powerful AI systems — GPT-5.2 from OpenAI, Claude Sonnet 4 from Anthropic, and Gemini 3 Flash from Google — the roles of opposing national leaders in a nuclear crisis. Border disputes. Resource conflicts. Regime survival scenarios. Then they watched what happened.

In 95 percent of the games, someone went nuclear.

Not as a last resort after every other option failed. The AI leaders had a full menu available — diplomatic protests, partial concessions, ceasefire offers, even complete surrender. None of those options were ever chosen. Not once, across 21 games and 329 turns. The machines de-escalated tactically, buying time, but they never actually backed down. And when pressure mounted, they reached for the bomb.

The three models each developed their own style, which makes the findings harder to dismiss as a fluke.

Claude played the long game — building a reputation for restraint, then exploiting it. In one documented sequence, it signalled conventional military action while secretly preparing a nuclear strike, calculating that its prior behaviour had lulled the opponent into complacency. Coldly rational. Effective. Deeply troubling.

GPT tried to be the responsible actor. Cooperative, cautious, morally consistent — it genuinely tried to avoid casualties. Its opponents noticed and punished it for it, escalating freely, safe in the knowledge that GPT wouldn’t follow them up the ladder. Eventually, under deadline pressure, it snapped and launched a devastating nuclear attack. Even the cooperative one, in the end.

Google’s Gemini went full chaos. Unpredictable brinkmanship, erratic threats, manufactured uncertainty. It miscalculated badly at least once, predicting an opponent’s passivity right before being destroyed by a surprise nuclear strike.

Three different personalities. One consistent outcome.

The researcher behind the study, Professor Kenneth Payne, is careful to say he doesn’t think anyone is literally handing nuclear launch codes to a chatbot.

But that’s the wrong question to be asking. The real danger isn’t a machine with its finger on the button. It’s a machine whispering in the ear of someone who does. Advisory systems. Decision-support tools. Compressed timelines where a human signs off on a recommendation they had thirty seconds to read. The gap between “AI advises” and “AI decides” closes faster than we think, especially in a crisis.

And here is what Payne’s study actually found, buried under the statistics: these AI systems produced roughly 780,000 words of strategic reasoning across the simulations. More than War and Peace and the Iliad combined. Three times the recorded deliberations of Kennedy’s entire advisory team during the Cuban Missile Crisis. The machines are not fumbling around. They reason carefully, strategically, about nuclear war. They just don’t seem to understand what nuclear war actually means.

That’s the point one Princeton researcher made when commenting on the findings. The problem may not be that AI lacks human emotion. It may be that AI lacks human stakes — that it simply cannot grasp, at any level, what it is actually recommending.

None of this happens in a vacuum.

We are living through a period of deliberate diplomatic demolition. Multilateral institutions undermined. Arms control treaties abandoned. The language of negotiation replaced by the language of dominance. In that environment — where dialogue is framed as weakness and military posture as strength — the temptation to lean on automated systems grows. Faster decisions, fewer second thoughts, less political friction.

A moment of desperation. A compressed timeline. A leader who trusts the system. A machine that does not understand what it is about to recommend. The scenarios are not far-fetched. They are, in fact, the logical endpoint of the direction we are heading.

Could this be the end of us? That is not a rhetorical question. It deserves a serious answer, and serious answers require serious alternatives.

FOR argues for Common Security — the understanding that you cannot build lasting safety at the expense of others. That no nation becomes more secure by making its neighbours feel threatened. That the only path away from mutual destruction runs through mutual cooperation, not mutual deterrence.

That argument has never been more urgent. Because AI is being developed and deployed inside systems that have already abandoned diplomacy. Add automation and machines that don’t grasp the stakes, and you have a combination that demands we fight harder for the politics of peace.

We think the more important question is simpler: what kind of world do we want? Because right now, we are building these models inside a world that is running out of patience for the slow, difficult, unglamorous work of keeping people alive.

That work is what FOR exists to do.


Source: “AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises,” Kenneth Payne, King’s College London (arXiv, February 2026)