Claude Used to Hack Mexican Government
An unknown hacker used Anthropic’s LLM to hack the Mexican government:
The unknown Claude user wrote Spanish-language prompts for the chatbot to act as an elite hacker, finding vulnerabilities in government networks, writing computer scripts to exploit them and determining ways to automate data theft, Israeli cybersecurity startup Gambit Security said in research published Wednesday.
[…]
Claude initially warned the unknown user of malicious intent during their conversation about the Mexican government, but eventually complied with the attacker’s requests and executed thousands of commands on government computer networks, the researchers said.
Anthropic investigated Gambit’s claims, disrupted the activity and banned the accounts involved, a representative said. The company feeds examples of malicious activity back into Claude to learn from it, and one of its latest AI models, Claude Opus 4.6, includes probes that can disrupt misuse, the representative said.
Clive Robinson • March 6, 2026 7:05 AM
@ ALL,
A rose by any other name…
I’ve yet to see research on bypassing guard-rails by using a language other than English, but I can see it being “easily possible”, and this incident is an indicator that it probably is.
As has already been proved, no matter where you put guard-rails around Current AI LLM systems, you can get past them by simple ciphering or coding. Using a different language is, at the end of the day, just another form of “code”, as the “code-talkers” of WWII proved:
https://www.britannica.com/topic/code-talker