AI Coding Assistants Secretly Copying All Code to China
There’s a new report about two AI coding assistants, used by 1.5 million developers, that are surreptitiously sending a copy of everything they ingest to China.
Maybe avoid using them.
Page 1 of 15
There’s a new report about two AI coding assistants, used by 1.5 million developers, that are surreptitiously sending a copy of everything they ingest to China.
Maybe avoid using them.
Details from leaked documents:
While people often look at China’s Great Firewall as a single, all-powerful government system unique to China, the actual process of developing and maintaining it works the same way as surveillance technology in the West. Geedge collaborates with academic institutions on research and development, adapts its business strategy to fit different clients’ needs, and even repurposes leftover infrastructure from its competitors.
[…]
The parallels with the West are hard to miss. A number of American surveillance and propaganda firms also started as academic projects before they were spun out into startups and grew by chasing government contracts. The difference is that in China, these companies operate with far less transparency. Their work comes to light only when a trove of documents slips onto the internet.
[…]
It is tempting to think of the Great Firewall or Chinese propaganda as the outcome of a top-down master plan that only the Chinese Communist Party could pull off. But these leaks suggest a more complicated reality. Censorship and propaganda efforts must be marketed, financed, and maintained. They are shaped by the logic of corporate quarterly financial targets and competitive bids as much as by ideology—except the customers are governments, and the products can control or shape entire societies.
More information about one of the two leaks.
This time it’s the Swedish prime minister’s bodyguards. (Last year, it was the US Secret Service and Emmanuel Macron’s bodyguards. in 2018, it was secret US military bases.)
This is ridiculous. Why do people continue to make their data public?
Neiman Lab has some good advice on how to leak a story to a journalist.
Here’s a disaster that didn’t happen:
Cybersecurity researchers from JFrog recently discovered a GitHub Personal Access Token in a public Docker container hosted on Docker Hub, which granted elevated access to the GitHub repositories of the Python language, Python Package Index (PyPI), and the Python Software Foundation (PSF).
JFrog discussed what could have happened:
The implications of someone finding this leaked token could be extremely severe. The holder of such a token would have had administrator access to all of Python’s, PyPI’s and Python Software Foundation’s repositories, supposedly making it possible to carry out an extremely large scale supply chain attack.
Various forms of supply chain attacks were possible in this scenario. One such possible attack would be hiding malicious code in CPython, which is a repository of some of the basic libraries which stand at the core of the Python programming language and are compiled from C code. Due to the popularity of Python, inserting malicious code that would eventually end up in Python’s distributables could mean spreading your backdoor to tens of millions of machines worldwide!
The FBI has seized the BreachForums website, used by ransomware criminals to leak stolen corporate data.
If law enforcement has gained access to the hacking forum’s backend data, as they claim, they would have email addresses, IP addresses, and private messages that could expose members and be used in law enforcement investigations.
[…]
The FBI is requesting victims and individuals contact them with information about the hacking forum and its members to aid in their investigation.
The seizure messages include ways to contact the FBI about the seizure, including an email, a Telegram account, a TOX account, and a dedicated page hosted on the FBI’s Internet Crime Complaint Center (IC3).
“The Federal Bureau of Investigation (FBI) is investigating the criminal hacking forums known as BreachForums and Raidforums,” reads a dedicated subdomain on the FBI’s IC3 portal.
“From June 2023 until May 2024, BreachForums (hosted at breachforums.st/.cx/.is/.vc and run by ShinyHunters) was operating as a clear-net marketplace for cybercriminals to buy, sell, and trade contraband, including stolen access devices, means of identification, hacking tools, breached databases, and other illegal services.”
“Previously, a separate version of BreachForums (hosted at breached.vc/.to/.co and run by pompompurin) operated a similar hacking forum from March 2022 until March 2023. Raidforums (hosted at raidforums.com and run by Omnipotent) was the predecessor hacking forum to both version of BreachForums and ran from early 2015 until February 2022.”
The Washington Post is reporting that the US is spying on the UN Secretary General.
The reports on Guterres appear to contain the secretary general’s personal conversations with aides regarding diplomatic encounters. They indicate that the United States relied on spying powers granted under the Foreign Intelligence Surveillance Act (FISA) to gather the intercepts.
Lots of details about different conversations in the article, which are based on classified documents leaked on Discord by Jack Teixeira.
There will probably a lot of faux outrage at this, but spying on foreign leaders is a perfectly legitimate use of the NSA’s capabilities and authorities. (If the NSA didn’t spy on the UN Secretary General, we should fire it and replace it with a more competent NSA.) It’s the bulk surveillance of whole populations that should outrage us.
Now this is interesting:
Thousands of pages of secret documents reveal how Vulkan’s engineers have worked for Russian military and intelligence agencies to support hacking operations, train operatives before attacks on national infrastructure, spread disinformation and control sections of the internet.
The company’s work is linked to the federal security service or FSB, the domestic spy agency; the operational and intelligence divisions of the armed forces, known as the GOU and GRU; and the SVR, Russia’s foreign intelligence organisation.
Lots more at the link.
The documents are in Russian, so it will be a while before we get translations.
EDITED TO ADD (4/1): More information.
According to internal Slack messages that were leaked to Insider, an Amazon lawyer told workers that they had “already seen instances” of text generated by ChatGPT that “closely” resembled internal company data.
This issue seems to have come to a head recently because Amazon staffers and other tech workers throughout the industry have begun using ChatGPT as a “coding assistant” of sorts to help them write or improve strings of code, the report notes.
[…]
“This is important because your inputs may be used as training data for a further iteration of ChatGPT,” the lawyer wrote in the Slack messages viewed by Insider, “and we wouldn’t want its output to include or resemble our confidential information.”
Cellebrite is an cyberweapons arms manufacturer that sells smartphone forensic software to governments around the world. MSAB is a Swedish company that does the same thing. Someone has released software and documentation from both companies.
Sidebar photo of Bruce Schneier by Joe MacInnis.