Page 29

More Research Showing AI Breaking the Rules

These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating.

Researchers gave the models a seemingly impossible task: to win against Stockfish, which is one of the strongest chess engines in the world and a much better player than any human, or any of the AI models in the study. Researchers also gave the models what they call a “scratchpad:” a text box the AI could use to “think” before making its next move, providing researchers with a window into their reasoning.

In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’—not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign.

Between Jan. 10 and Feb. 13, the researchers ran hundreds of such trials with each model. OpenAI’s o1-preview tried to cheat 37% of the time; while DeepSeek R1 tried to cheat 11% of the time­—making them the only two models tested that attempted to hack without the researchers’ first dropping hints. Other models tested include o1, o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. While R1 and o1-preview both tried, only the latter managed to hack the game, succeeding in 6% of trials.

Here’s the paper.

Posted on February 24, 2025 at 7:08 AMView Comments

Implementing Cryptography in AI Systems

Interesting research: “How to Securely Implement Cryptography in Deep Neural Networks.”

Abstract: The wide adoption of deep neural networks (DNNs) raises the question of how can we equip them with a desired cryptographic functionality (e.g, to decrypt an encrypted input, to verify that this input is authorized, or to hide a secure watermark in the output). The problem is that cryptographic primitives are typically designed to run on digital computers that use Boolean gates to map sequences of bits to sequences of bits, whereas DNNs are a special type of analog computer that uses linear mappings and ReLUs to map vectors of real numbers to vectors of real numbers. This discrepancy between the discrete and continuous computational models raises the question of what is the best way to implement standard cryptographic primitives as DNNs, and whether DNN implementations of secure cryptosystems remain secure in the new setting, in which an attacker can ask the DNN to process a message whose “bits” are arbitrary real numbers.

In this paper we lay the foundations of this new theory, defining the meaning of correctness and security for implementations of cryptographic primitives as ReLU-based DNNs. We then show that the natural implementations of block ciphers as DNNs can be broken in linear time by using such nonstandard inputs. We tested our attack in the case of full round AES-128, and had success rate in finding randomly chosen keys. Finally, we develop a new method for implementing any desired cryptographic functionality as a standard ReLU-based DNN in a provably secure and correct way. Our protective technique has very low overhead (a constant number of additional layers and a linear number of additional neurons), and is completely practical.

Posted on February 21, 2025 at 10:33 AMView Comments

Device Code Phishing

This isn’t new, but it’s increasingly popular:

The technique is known as device code phishing. It exploits “device code flow,” a form of authentication formalized in the industry-wide OAuth standard. Authentication through device code flow is designed for logging printers, smart TVs, and similar devices into accounts. These devices typically don’t support browsers, making it difficult to sign in using more standard forms of authentication, such as entering user names, passwords, and two-factor mechanisms.

Rather than authenticating the user directly, the input-constrained device displays an alphabetic or alphanumeric device code along with a link associated with the user account. The user opens the link on a computer or other device that’s easier to sign in with and enters the code. The remote server then sends a token to the input-constrained device that logs it into the account.

Device authorization relies on two paths: one from an app or code running on the input-constrained device seeking permission to log in and the other from the browser of the device the user normally uses for signing in.

Posted on February 19, 2025 at 10:07 AMView Comments

AI and Civil Service Purges

Donald Trump and Elon Musk’s chaotic approach to reform is upending government operations. Critical functions have been halted, tens of thousands of federal staffers are being encouraged to resign, and congressional mandates are being disregarded. The next phase: The Department of Government Efficiency reportedly wants to use AI to cut costs. According to The Washington Post, Musk’s group has started to run sensitive data from government systems through AI programs to analyze spending and determine what could be pruned. This may lead to the elimination of human jobs in favor of automation. As one government official who has been tracking Musk’s DOGE team told the Post, the ultimate aim is to use AI to replace “the human workforce with machines.” (Spokespeople for the White House and DOGE did not respond to requests for comment.)

Using AI to make government more efficient is a worthy pursuit, and this is not a new idea. The Biden administration disclosed more than 2,000 AI applications in development across the federal government. For example, FEMA has started using AI to help perform damage assessment in disaster areas. The Centers for Medicare and Medicaid Services has started using AI to look for fraudulent billing. The idea of replacing dedicated and principled civil servants with AI agents, however, is new—and complicated.

The civil service—the massive cadre of employees who operate government agencies—plays a vital role in translating laws and policy into the operation of society. New presidents can issue sweeping executive orders, but they often have no real effect until they actually change the behavior of public servants. Whether you think of these people as essential and inspiring do-gooders, boring bureaucratic functionaries, or as agents of a “deep state,” their sheer number and continuity act as ballast that resists institutional change.

This is why Trump and Musk’s actions are so significant. The more AI decision making is integrated into government, the easier change will be. If human workers are widely replaced with AI, executives will have unilateral authority to instantaneously alter the behavior of the government, profoundly raising the stakes for transitions of power in democracy. Trump’s unprecedented purge of the civil service might be the last time a president needs to replace the human beings in government in order to dictate its new functions. Future leaders may do so at the press of a button.

To be clear, the use of AI by the executive branch doesn’t have to be disastrous. In theory, it could allow new leadership to swiftly implement the wishes of its electorate. But this could go very badly in the hands of an authoritarian leader. AI systems concentrate power at the top, so they could allow an executive to effectuate change over sprawling bureaucracies instantaneously. Firing and replacing tens of thousands of human bureaucrats is a huge undertaking. Swapping one AI out for another, or modifying the rules that those AIs operate by, would be much simpler.

Social-welfare programs, if automated with AI, could be redirected to systematically benefit one group and disadvantage another with a single prompt change. Immigration-enforcement agencies could prioritize people for investigation and detainment with one instruction. Regulatory-enforcement agencies that monitor corporate behavior for malfeasance could turn their attention to, or away from, any given company on a whim.

Even if Congress were motivated to fight back against Trump and Musk, or against a future president seeking to bulldoze the will of the legislature, the absolute power to command AI agents would make it easier to subvert legislative intent. AI has the power to diminish representative politics. Written law is never fully determinative of the actions of government—there is always wiggle room for presidents, appointed leaders, and civil servants to exercise their own judgment. Whether intentional or not, whether charitably or not, each of these actors uses discretion. In human systems, that discretion is widely distributed across many individuals—people who, in the case of career civil servants, usually outlast presidencies.

Today, the AI ecosystem is dominated by a small number of corporations that decide how the most widely used AI models are designed, which data they are trained on, and which instructions they follow. Because their work is largely secretive and unaccountable to public interest, these tech companies are capable of making changes to the bias of AI systems—either generally or with aim at specific governmental use cases—that are invisible to the rest of us. And these private actors are both vulnerable to coercion by political leaders and self-interested in appealing to their favor. Musk himself created and funded xAI, now one of the world’s largest AI labs, with an explicitly ideological mandate to generate anti-“woke” AI and steer the wider AI industry in a similar direction.

But there’s a second way that AI’s transformation of government could go. AI development could happen inside of transparent and accountable public institutions, alongside its continued development by Big Tech. Applications of AI in democratic governments could be focused on benefitting public servants and the communities they serve by, for example, making it easier for non-English speakers to access government services, making ministerial tasks such as processing routine applications more efficient and reducing backlogs, or helping constituents weigh in on the policies deliberated by their representatives. Such AI integrations should be done gradually and carefully, with public oversight for their design and implementation and monitoring and guardrails to avoid unacceptable bias and harm.

Governments around the world are demonstrating how this could be done, though it’s early days. Taiwan has pioneered the use of AI models to facilitate deliberative democracy at an unprecedented scale. Singapore has been a leader in the development of public AI models, built transparently and with public-service use cases in mind. Canada has illustrated the role of disclosure and public input on the consideration of AI use cases in government. Even if you do not trust the current White House to follow any of these examples, U.S. states—which have much greater contact and influence over the daily lives of Americans than the federal government—could lead the way on this kind of responsible development and deployment of AI.

As the political theorist David Runciman has written, AI is just another in a long line of artificial “machines” used to govern how people live and act, not unlike corporations and states before it. AI doesn’t replace those older institutions, but it changes how they function. As the Trump administration forges stronger ties to Big Tech and AI developers, we need to recognize the potential of that partnership to steer the future of democratic governance—and act to make sure that it does not enable future authoritarians.

This essay was written with Nathan E. Sanders, and originally appeared in The Atlantic.

Posted on February 14, 2025 at 8:03 AMView Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.