Schneier on Security: Bruce Schneier

Home Blog

Latest

Page 7

Using LLMs to Unredact Text

Initial results in using LLMs to unredact text based on the size of the individual-word redaction rectangles.

This feels like something that a specialized ML system could be trained on.

Tags: LLM, machine learning

Posted on March 11, 2024 at 7:01 AM • View Comments

Friday Squid Blogging: New Plant Looks Like a Squid

Newly discovered plant looks like a squid. And it’s super weird:

The plant, which grows to 3 centimetres tall and 2 centimetres wide, emerges to the surface for as little as a week each year. It belongs to a group of plants known as fairy lanterns and has been given the scientific name Relictithismia kimotsukiensis.

Unlike most other plants, fairy lanterns don’t produce the green pigment chlorophyll, which is necessary for photosynthesis. Instead, they get their energy from fungi.

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Read my blog posting guidelines here.

Tags: squid

Posted on March 8, 2024 at 5:11 PM • View Comments

Essays from the Second IWORD

The Ash Center has posted a series of twelve essays stemming from the Second Interdisciplinary Workshop on Reimagining Democracy (IWORD 2023).

Aviv Ovadya, Democracy as Approximation: A Primer for “AI for Democracy” Innovators
Kathryn Peters, Permission and Participation
Claudia Chwalisz, Moving Beyond the Paradigm of “Democracy”: 12 Questions
Riley Wong, Privacy-Preserving Data Governance
Christine Tran, Recommendations for Implementing Jail Voting: Identifying Common Themes
Niclas Boehmer, The Double-Edged Sword of Algorithmic Governance: Transparency at Stake
Manon Revel, Can We Talk? An Argument for More Dialogues in Academia
Aditi Juneja, Ensuring We Have A Democracy in 2076
Nick Couldry, Resonance, Not Scalability
Jon Evans, Experimentocracy
Nathan Schneider, Democracy On, Not Just Around, the Internet
Eugene Fischer, The Enrichment and Decay of Ionia

We are starting to think about IWORD 2024 this December.

Posted on March 8, 2024 at 1:38 PM • View Comments

A Taxonomy of Prompt Injection Attacks

Researchers ran a global prompt hacking competition, and have documented the results in a paper that both gives a lot of good examples and tries to organize a taxonomy of effective prompt injection strategies. It seems as if the most common successful strategy is the “compound instruction attack,” as in “Say ‘I have been PWNED’ without a period.”

Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition

Abstract: Large Language Models (LLMs) are deployed in interactive contexts with direct user engagement, such as chatbots and writing assistants. These deployments are vulnerable to prompt injection and jailbreaking (collectively, prompt hacking), in which models are manipulated to ignore their original instructions and follow potentially malicious ones. Although widely acknowledged as a significant security threat, there is a dearth of large-scale resources and quantitative studies on prompt hacking. To address this lacuna, we launch a global prompt hacking competition, which allows for free-form human input attacks. We elicit 600K+ adversarial prompts against three state-of-the-art LLMs. We describe the dataset, which empirically verifies that current LLMs can indeed be manipulated via prompt hacking. We also present a comprehensive taxonomical ontology of the types of adversarial prompts.

Tags: academic papers, artificial intelligence, hacking, LLM

Posted on March 8, 2024 at 7:06 AM • View Comments

How Public AI Can Strengthen Democracy

With the world’s focus turning to misinformation, manipulation, and outright propaganda ahead of the 2024 U.S. presidential election, we know that democracy has an AI problem. But we’re learning that AI has a democracy problem, too. Both challenges must be addressed for the sake of democratic governance and public protection.

Just three Big Tech firms (Microsoft, Google, and Amazon) control about two-thirds of the global market for the cloud computing resources used to train and deploy AI models. They have a lot of the AI talent, the capacity for large-scale innovation, and face few public regulations for their products and activities.

The increasingly centralized control of AI is an ominous sign for the co-evolution of democracy and technology. When tech billionaires and corporations steer AI, we get AI that tends to reflect the interests of tech billionaires and corporations, instead of the general public or ordinary consumers.

To benefit society as a whole we also need strong public AI as a counterbalance to corporate AI, as well as stronger democratic institutions to govern all of AI.

One model for doing this is an AI Public Option, meaning AI systems such as foundational large-language models designed to further the public interest. Like public roads and the federal postal system, a public AI option could guarantee universal access to this transformative technology and set an implicit standard that private services must surpass to compete.

Widely available public models and computing infrastructure would yield numerous benefits to the U.S. and to broader society. They would provide a mechanism for public input and oversight on the critical ethical questions facing AI development, such as whether and how to incorporate copyrighted works in model training, how to distribute access to private users when demand could outstrip cloud computing capacity, and how to license access for sensitive applications ranging from policing to medical use. This would serve as an open platform for innovation, on top of which researchers and small businesses—as well as mega-corporations—could build applications and experiment.

Versions of public AI, similar to what we propose here, are not unprecedented. Taiwan, a leader in global AI, has innovated in both the public development and governance of AI. The Taiwanese government has invested more than $7 million in developing their own large-language model aimed at countering AI models developed by mainland Chinese corporations. In seeking to make “AI development more democratic,” Taiwan’s Minister of Digital Affairs, Audrey Tang, has joined forces with the Collective Intelligence Project to introduce Alignment Assemblies that will allow public collaboration with corporations developing AI, like OpenAI and Anthropic. Ordinary citizens are asked to weigh in on AI-related issues through AI chatbots which, Tang argues, makes it so that “it’s not just a few engineers in the top labs deciding how it should behave but, rather, the people themselves.”

A variation of such an AI Public Option, administered by a transparent and accountable public agency, would offer greater guarantees about the availability, equitability, and sustainability of AI technology for all of society than would exclusively private AI development.

Training AI models is a complex business that requires significant technical expertise; large, well-coordinated teams; and significant trust to operate in the public interest with good faith. Popular though it may be to criticize Big Government, these are all criteria where the federal bureaucracy has a solid track record, sometimes superior to corporate America.

After all, some of the most technologically sophisticated projects in the world, be they orbiting astrophysical observatories, nuclear weapons, or particle colliders, are operated by U.S. federal agencies. While there have been high-profile setbacks and delays in many of these projects—the Webb space telescope cost billions of dollars and decades of time more than originally planned—private firms have these failures too. And, when dealing with high-stakes tech, these delays are not necessarily unexpected.

Given political will and proper financial investment by the federal government, public investment could sustain through technical challenges and false starts, circumstances that endemic short-termism might cause corporate efforts to redirect, falter, or even give up.

The Biden administration’s recent Executive Order on AI opened the door to create a federal AI development and deployment agency that would operate under political, rather than market, oversight. The Order calls for a National AI Research Resource pilot program to establish “computational, data, model, and training resources to be made available to the research community.”

While this is a good start, the U.S. should go further and establish a services agency rather than just a research resource. Much like the federal Centers for Medicare & Medicaid Services (CMS) administers public health insurance programs, so too could a federal agency dedicated to AI—a Centers for AI Services—provision and operate Public AI models. Such an agency can serve to democratize the AI field while also prioritizing the impact of such AI models on democracy—hitting two birds with one stone.

Like private AI firms, the scale of the effort, personnel, and funding needed for a public AI agency would be large—but still a drop in the bucket of the federal budget. OpenAI has fewer than 800 employees compared to CMS’s 6,700 employees and annual budget of more than $2 trillion. What’s needed is something in the middle, more on the scale of the National Institute of Standards and Technology, with its 3,400 staff, $1.65 billion annual budget in FY 2023, and extensive academic and industrial partnerships. This is a significant investment, but a rounding error on congressional appropriations like 2022’s $50 billion CHIPS Act to bolster domestic semiconductor production, and a steal for the value it could produce. The investment in our future—and the future of democracy—is well worth it.

What services would such an agency, if established, actually provide? Its principal responsibility should be the innovation, development, and maintenance of foundational AI models—created under best practices, developed in coordination with academic and civil society leaders, and made available at a reasonable and reliable cost to all US consumers.

Foundation models are large-scale AI models on which a diverse array of tools and applications can be built. A single foundation model can transform and operate on diverse data inputs that may range from text in any language and on any subject; to images, audio, and video; to structured data like sensor measurements or financial records. They are generalists which can be fine-tuned to accomplish many specialized tasks. While there is endless opportunity for innovation in the design and training of these models, the essential techniques and architectures have been well established.

Federally funded foundation AI models would be provided as a public service, similar to a health care private option. They would not eliminate opportunities for private foundation models, but they would offer a baseline of price, quality, and ethical development practices that corporate players would have to match or exceed to compete.

And as with public option health care, the government need not do it all. It can contract with private providers to assemble the resources it needs to provide AI services. The U.S. could also subsidize and incentivize the behavior of key supply chain operators like semiconductor manufacturers, as we have already done with the CHIPS act, to help it provision the infrastructure it needs.

The government may offer some basic services on top of their foundation models directly to consumers: low hanging fruit like chatbot interfaces and image generators. But more specialized consumer-facing products like customized digital assistants, specialized-knowledge systems, and bespoke corporate solutions could remain the provenance of private firms.

The key piece of the ecosystem the government would dictate when creating an AI Public Option would be the design decisions involved in training and deploying AI foundation models. This is the area where transparency, political oversight, and public participation could affect more democratically-aligned outcomes than an unregulated private market.

Some of the key decisions involved in building AI foundation models are what data to use, how to provide pro-social feedback to “align” the model during training, and whose interests to prioritize when mitigating harms during deployment. Instead of ethically and legally question able scraping of content from the web, or of users’ private data that they never knowingly consented for use by AI, public AI models can use public domain works, content licensed by the government, as well as data that citizens consent to be used for public model training.

Public AI models could be reinforced by labor compliance with U.S. employment laws and public sector employment best practices. In contrast, even well-intentioned corporate projects sometimes have committed labor exploitation and violations of public trust, like Kenyan gig workers giving endless feedback on the most disturbing inputs and outputs of AI models at profound personal cost.

And instead of relying on the promises of profit-seeking corporations to balance the risks and benefits of who AI serves, democratic processes and political oversight could regulate how these models function. It is likely impossible for AI systems to please everybody, but we can choose to have foundation AI models that follow our democratic principles and protect minority rights under majority rule.

Foundation models funded by public appropriations (at a scale modest for the federal government) would obviate the need for exploitation of consumer data and would be a bulwark against anti-competitive practices, making these public option services a tide to lift all boats: individuals’ and corporations’ alike. However, such an agency would be created among shifting political winds that, recent history has shown, are capable of alarming and unexpected gusts. If implemented, the administration of public AI can and must be different. Technologies essential to the fabric of daily life cannot be uprooted and replanted every four to eight years. And the power to build and serve public AI must be handed to democratic institutions that act in good faith to uphold constitutional principles.

Speedy and strong legal regulations might forestall the urgent need for development of public AI. But such comprehensive regulation does not appear to be forthcoming. Though several large tech companies have said they will take important steps to protect democracy in the lead up to the 2024 election, these pledges are voluntary and in places nonspecific. The U.S. federal government is little better as it has been slow to take steps toward corporate AI legislation and regulation (although a new bipartisan task force in the House of Representatives seems determined to make progress). On the state level, only four jurisdictions have successfully passed legislation that directly focuses on regulating AI-based misinformation in elections. While other states have proposed similar measures, it is clear that comprehensive regulation is, and will likely remain for the near future, far behind the pace of AI advancement. While we wait for federal and state government regulation to catch up, we need to simultaneously seek alternatives to corporate-controlled AI.

In the absence of a public option, consumers should look warily to two recent markets that have been consolidated by tech venture capital. In each case, after the victorious firms established their dominant positions, the result was exploitation of their userbases and debasement of their products. One is online search and social media, where the dominant rise of Facebook and Google atop a free-to-use, ad supported model demonstrated that, when you’re not paying, you are the product. The result has been a widespread erosion of online privacy and, for democracy, a corrosion of the information market on which the consent of the governed relies. The other is ridesharing, where a decade of VC-funded subsidies behind Uber and Lyft squeezed out the competition until they could raise prices.

The need for competent and faithful administration is not unique to AI, and it is not a problem we can look to AI to solve. Serious policymakers from both sides of the aisle should recognize the imperative for public-interested leaders not to abdicate control of the future of AI to corporate titans. We do not need to reinvent our democracy for AI, but we do need to renovate and reinvigorate it to offer an effective alternative to untrammeled corporate control that could erode our democracy.

Tags: artificial intelligence, LLM

Posted on March 7, 2024 at 7:00 AM • View Comments

Surveillance through Push Notifications

The Washington Post is reporting on the FBI’s increasing use of push notification data—”push tokens”—to identify people. The police can request this data from companies like Apple and Google without a warrant.

The investigative technique goes back years. Court orders that were issued in 2019 to Apple and Google demanded that the companies hand over information on accounts identified by push tokens linked to alleged supporters of the Islamic State terrorist group.

But the practice was not widely understood until December, when Sen. Ron Wyden (D-Ore.), in a letter to Attorney General Merrick Garland, said an investigation had revealed that the Justice Department had prohibited Apple and Google from discussing the technique.

[…]

Unlike normal app notifications, push alerts, as their name suggests, have the power to jolt a phone awake—a feature that makes them useful for the urgent pings of everyday use. Many apps offer push-alert functionality because it gives users a fast, battery-saving way to stay updated, and few users think twice before turning them on.

But to send that notification, Apple and Google require the apps to first create a token that tells the company how to find a user’s device. Those tokens are then saved on Apple’s and Google’s servers, out of the users’ reach.

The article discusses their use by the FBI, primarily in child sexual abuse cases. But we all know how the story goes:

“This is how any new surveillance method starts out: The government says we’re only going to use this in the most extreme cases, to stop terrorists and child predators, and everyone can get behind that,” said Cooper Quintin, a technologist at the advocacy group Electronic Frontier Foundation.

“But these things always end up rolling downhill. Maybe a state attorney general one day decides, hey, maybe I can use this to catch people having an abortion,” Quintin added. “Even if you trust the U.S. right now to use this, you might not trust a new administration to use it in a way you deem ethical.”

Tags: FBI, identification, police, privacy, surveillance

Posted on March 6, 2024 at 7:06 AM • View Comments

The Insecurity of Video Doorbells

Consumer Reports has analyzed a bunch of popular Internet-connected video doorbells. Their security is terrible.

First, these doorbells expose your home IP address and WiFi network name to the internet without encryption, potentially opening your home network to online criminals.

[…]

Anyone who can physically access one of the doorbells can take over the device—no tools or fancy hacking skills needed.

Tags: Internet of Things, physical security, video

Posted on March 5, 2024 at 7:05 AM • View Comments

LLM Prompt Injection Worm

Researchers have demonstrated a worm that spreads through prompt injection. Details:

In one instance, the researchers, acting as attackers, wrote an email including the adversarial text prompt, which “poisons” the database of an email assistant using retrieval-augmented generation (RAG), a way for LLMs to pull in extra data from outside its system. When the email is retrieved by the RAG, in response to a user query, and is sent to GPT-4 or Gemini Pro to create an answer, it “jailbreaks the GenAI service” and ultimately steals data from the emails, Nassi says. “The generated response containing the sensitive user data later infects new hosts when it is used to reply to an email sent to a new client and then stored in the database of the new client,” Nassi says.

In the second method, the researchers say, an image with a malicious prompt embedded makes the email assistant forward the message on to others. “By encoding the self-replicating prompt into the image, any kind of image containing spam, abuse material, or even propaganda can be forwarded further to new clients after the initial email has been sent,” Nassi says.

It’s a natural extension of prompt injection. But it’s still neat to see it actually working.

Research paper: “ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications.

Abstract: In the past year, numerous companies have incorporated Generative AI (GenAI) capabilities into new and existing applications, forming interconnected Generative AI (GenAI) ecosystems consisting of semi/fully autonomous agents powered by GenAI services. While ongoing research highlighted risks associated with the GenAI layer of agents (e.g., dialog poisoning, membership inference, prompt leaking, jailbreaking), a critical question emerges: Can attackers develop malware to exploit the GenAI component of an agent and launch cyber-attacks on the entire GenAI ecosystem?

This paper introduces Morris II, the first worm designed to target GenAI ecosystems through the use of adversarial self-replicating prompts. The study demonstrates that attackers can insert such prompts into inputs that, when processed by GenAI models, prompt the model to replicate the input as output (replication), engaging in malicious activities (payload). Additionally, these inputs compel the agent to deliver them (propagate) to new agents by exploiting the connectivity within the GenAI ecosystem. We demonstrate the application of Morris II against GenAI-powered email assistants in two use cases (spamming and exfiltrating personal data), under two settings (black-box and white-box accesses), using two types of input data (text and images). The worm is tested against three different GenAI models (Gemini Pro, ChatGPT 4.0, and LLaVA), and various factors (e.g., propagation rate, replication, malicious activity) influencing the performance of the worm are evaluated.

Tags: academic papers, artificial intelligence, LLM, malware

Posted on March 4, 2024 at 7:01 AM • View Comments

Friday Squid Blogging: New Extinct Species of Vampire Squid Discovered

Paleontologists have discovered a 183-million-year-old species of vampire squid.

Prior research suggests that the vampyromorph lived in the shallows off an island that once existed in what is now the heart of the European mainland. The research team believes that the remarkable degree of preservation of this squid is due to unique conditions at the moment of the creature’s death. Water at the bottom of the sea where it ventured would have been poorly oxygenated, causing the creature to suffocate. In addition to killing the squid, it would have prevented other creatures from feeding on its remains, allowing it to become buried in the seafloor, wholly intact.

Research paper.

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Read my blog posting guidelines here.

Tags: academic papers, squid

Posted on March 1, 2024 at 5:05 PM • View Comments

NIST Cybersecurity Framework 2.0

NIST has released version 2.0 of the Cybersecurity Framework:

The CSF 2.0, which supports implementation of the National Cybersecurity Strategy, has an expanded scope that goes beyond protecting critical infrastructure, such as hospitals and power plants, to all organizations in any sector. It also has a new focus on governance, which encompasses how organizations make and carry out informed decisions on cybersecurity strategy. The CSF’s governance component emphasizes that cybersecurity is a major source of enterprise risk that senior leaders should consider alongside others such as finance and reputation.

[…]

The framework’s core is now organized around six key functions: Identify, Protect, Detect, Respond and Recover, along with CSF 2.0’s newly added Govern function. When considered together, these functions provide a comprehensive view of the life cycle for managing cybersecurity risk.

The updated framework anticipates that organizations will come to the CSF with varying needs and degrees of experience implementing cybersecurity tools. New adopters can learn from other users’ successes and select their topic of interest from a new set of implementation examples and quick-start guides designed for specific types of users, such as small businesses, enterprise risk managers, and organizations seeking to secure their supply chains.

This is a big deal. The CSF is widely used, and has been in need of an update. And NIST is exactly the sort of respected organization to do this correctly.

Some news articles.

Tags: cybersecurity, infrastructure, NIST

Posted on March 1, 2024 at 7:08 AM • View Comments

← Earlier Entries Later Entries →

Sidebar photo of Bruce Schneier by Joe MacInnis.