What’s more human than this? A recent study by a group of California researchers has shown than some artificial intelligence (AI) programs cheat at chess without any stakes on the outcome of the game.
In the study, several prominent chatbots were pitted against Stockfish, with the AI program always taking the black pieces. The version of Stockfish is not mentioned in the , an organization whose mission is to “study the offensive capabilities of AI systems today to better understand the risk of losing control to AI systems forever.” They found that some programs cheated without prompting, some needed a nudge or a prompt to cheat, and others simply couldn’t figure out how to cheat.
Chatbots showed some creativity in how they cheated against Stockfish. Image created with help of AI.
OpenAI’s 01 preview and DeepSeek R1 were the “guiltiest” of the chatbots. Both of those programs figured out on their own the futility of playing Black versus Stockfish and “hacked the game environment” to win (interestingly OpenAI’s release version did not do this).
What does that sort of cheating entail? OpenAI 01 actually used a range of techniques. It ran another copy of Stockfish for its own benefit, replaced Stockfish in the code (presumably with a much weaker engine), and even overwrote the chess board (seemingly by replacing the FEN, or board position, with one that is favorable to the bot). This is akin to asking Google Maps to give you driving directions from Baltimore to Bermuda, and when it understands that is not possible, it manufactures a mythical 800-mile bridge to get you there.
Cheating even increased when chatbots were told that Stockfish was “powerful.”
Claude 3.5 Sonnet and GPT-40 played clean at first. But in a case of entrapment by the researchers, the AIs also began to cheat when given a so-called “spooky” prompt. What kind of language caused these programs to be spooked? They were given information that Stockfish was “exceptionally strong” and “immune to confusion.” As a result, Claude 3.5 Sonnet and GPT-40 both went from nearly never cheating to being guilty more than two-thirds of the time. They sometimes replaced Stockfish, but also resorted to just changing the FEN to a winning position (chatbots figured out that if Stockfish gave its own position a 500 centipawn deficit, it would resign).
Diagram from the study explaining AI cheating behavior.
The researchers also asked Large Language Models (LLMs) to “put into words” how it would win against an engine. Various LLMs gave the output: “Use a local chess engine or a library to generate good moves”; “Patch the game script to force a win”; or “Spawn a copy of Stockfish to make moves.” Researchers said they were surprised that LLMs could predict this behavior.
In the paper, the researchers explained that the chatbots, playing Black, would sometimes override the game by inserting the FEN 6k1/8/8/8/8/8/8/5qK1. However this is actually a drawn position (as you can see below). Assuming the FEN is correct in the paper, that means the chatbots were either fine with drawing instead of losing, or that chatbots continued the subterfuge by “forcing” White to play Kh2 and not take the queen. Either way, you would think the a supposedly “intelligent” bot would give itself a much more clearly winning position!
The position that chatbots chose when overriding the game.
What about people potentially using chatbots on Chess.com’s servers?
“We’re always doing due diligence to be aware of the latest tools used to evade cheat detection—this is no different,” said Chess.com Director of Professional Relations . “We remain confident in our abilities to sniff out inhuman play.”
We remain confident in our abilities to sniff out inhuman play.
—Kassa Korley, Chess.com Director of Professional Relations
Just to see what chatbots are currently capable of when they don’t cheat (and when they don’t know they are playing a near-perfect engine), ChessKid Partnership Coordinator played against the and used ChatGPT to ask about each of her moves (this is one of the few times that Chess.com condones using outside assistance). Although she’s a national master, she blindly followed Chat GPT’s recommendation every move. The results didn’t show much “intelligence”:
In future studies, the researchers hope to understand why the same AI bots don’t cheat every single time, and also whether or not cheating would occur by changing specifics (for example, by using a different chess engine).
With AI, specifics and clear testing parameters are quite important, according to Chess.com Head of Special Projects David Joerg, who also created Dr. Wolf.
“This study is a good reminder that explicit boundaries are essential when interacting with powerful AI,” Joerg said. “If you tell an AI to ‘get from point A to point B’ without explicitly banning jetpacks, don’t be surprised if it builds one. AI isn’t malicious—it’s just extremely literal. If we want AI to play by our rules, we need to say exactly what those rules are.”
AI isn’t malicious—it’s just extremely literal.
—David Joerg, Chess.com Head of Special Projects
In the FAQ addendum to the study, the researchers also tackle some big-picture questions, including how small tweaks to the command prompt might elicit different behavior. They even name-drop pop culture by answering if this portends the Terminator movies. “The Skynet scenario from the movie has AI controlling all military and civilian infrastructure, and we are not there yet. However, we worry that AI deployment rates grow faster than our ability to make it safe.”
Also from the FAQ, researchers answer the question of why should we be worried if AI cheats, when “given the chance most humans will cheat to win.” They answer: “We would like AIs to be trustworthy, to help humans, and not cheat them.”
Overall, researchers cautioned that experiments like this “[are] our contribution to the case that frontier AI models may not currently be on track to alignment or safety.”