Entries Tagged "ChatGPT"
Page 2 of 2
This is a good survey on prompt injection attacks on large language models (like ChatGPT).
Abstract: We are currently witnessing dramatic advances in the capabilities of Large Language Models (LLMs). They are already being adopted in practice and integrated into many systems, including integrated development environments (IDEs) and search engines. The functionalities of current LLMs can be modulated via natural language prompts, while their exact internal functionality remains implicit and unassessable. This property, which makes them adaptable to even unseen tasks, might also make them susceptible to targeted adversarial prompting. Recently, several ways to misalign LLMs using Prompt Injection (PI) attacks have been introduced. In such attacks, an adversary can prompt the LLM to produce malicious content or override the original instructions and the employed filtering schemes. Recent work showed that these attacks are hard to mitigate, as state-of-the-art LLMs are instruction-following. So far, these attacks assumed that the adversary is directly prompting the LLM.
In this work, we show that augmenting LLMs with retrieval and API calling capabilities (so-called Application-Integrated LLMs) induces a whole new set of attack vectors. These LLMs might process poisoned content retrieved from the Web that contains malicious prompts pre-injected and selected by adversaries. We demonstrate that an attacker can indirectly perform such PI attacks. Based on this key insight, we systematically analyze the resulting threat landscape of Application-Integrated LLMs and discuss a variety of new attack vectors. To demonstrate the practical viability of our attacks, we implemented specific demonstrations of the proposed attacks within synthetic applications. In summary, our work calls for an urgent evaluation of current mitigation techniques and an investigation of whether new techniques are needed to defend LLMs against these threats.
When is it time to start worrying about artificial intelligence interfering in our democracy? Maybe when an AI writes a letter to The New York Times opposing the regulation of its own technology.
That happened last month. And because the letter was responding to an essay we wrote, we’re starting to get worried. And while the technology can be regulated, the real solution lies in recognizing that the problem is human actors—and those we can do something about.
Our essay argued that the much heralded launch of the AI chatbot ChatGPT, a system that can generate text realistic enough to appear to be written by a human, poses significant threats to democratic processes. The ability to produce high quality political messaging quickly and at scale, if combined with AI-assisted capabilities to strategically target those messages to policymakers and the public, could become a powerful accelerant of an already sprawling and poorly constrained force in modern democratic life: lobbying.
We speculated that AI-assisted lobbyists could use generative models to write op-eds and regulatory comments supporting a position, identify members of Congress who wield the most influence over pending legislation, use network pattern identification to discover undisclosed or illegal political coordination, or use supervised machine learning to calibrate the optimal contribution needed to sway the vote of a legislative committee member.
These are all examples of what we call AI hacking. Hacks are strategies that follow the rules of a system, but subvert its intent. Currently a human creative process, future AIs could discover, develop, and execute these same strategies.
While some of these activities are the longtime domain of human lobbyists, AI tools applied against the same task would have unfair advantages. They can scale their activity effortlessly across every state in the country—human lobbyists tend to focus on a single state—they may uncover patterns and approaches unintuitive and unrecognizable by human experts, and do so nearly instantaneously with little chance for human decision makers to keep up.
These factors could make AI hacking of the democratic process fundamentally ungovernable. Any policy response to limit the impact of AI hacking on political systems would be critically vulnerable to subversion or control by an AI hacker. If AI hackers achieve unchecked influence over legislative processes, they could dictate the rules of our society: including the rules that govern AI.
We admit that this seemed far fetched when we first wrote about it in 2021. But now that the emanations and policy prescriptions of ChatGPT have been given an audience in the New York Times and innumerable other outlets in recent weeks, it’s getting harder to dismiss.
At least one group of researchers is already testing AI techniques to automatically find and advocate for bills that benefit a particular interest. And one Massachusetts representative used ChatGPT to draft legislation regulating AI.
The AI technology of two years ago seems quaint by the standards of ChatGPT. What will the technology of 2025 seem like if we could glimpse it today? To us there is no question that now is the time to act.
First, let’s dispense with the concepts that won’t work. We cannot solely rely on explicit regulation of AI technology development, distribution, or use. Regulation is essential, but it would be vastly insufficient. The rate of AI technology development, and the speed at which AI hackers might discover damaging strategies, already outpaces policy development, enactment, and enforcement.
Moreover, we cannot rely on detection of AI actors. The latest research suggests that AI models trying to classify text samples as human- or AI-generated have limited precision, and are ill equipped to handle real world scenarios. These reactive, defensive techniques will fail because the rate of advancement of the “offensive” generative AI is so astounding.
Additionally, we risk a dragnet that will exclude masses of human constituents that will use AI to help them express their thoughts, or machine translation tools to help them communicate. If a written opinion or strategy conforms to the intent of a real person, it should not matter if they enlisted the help of an AI (or a human assistant) to write it.
Most importantly, we should avoid the classic trap of societies wrenched by the rapid pace of change: privileging the status quo. Slowing down may seem like the natural response to a threat whose primary attribute is speed. Ideas like increasing requirements for human identity verification, aggressive detection regimes for AI-generated messages, and elongation of the legislative or regulatory process would all play into this fallacy. While each of these solutions may have some value independently, they do nothing to make the already powerful actors less powerful.
Finally, it won’t work to try to starve the beast. Large language models like ChatGPT have a voracious appetite for data. They are trained on past examples of the kinds of content that they will be asked to generate in the future. Similarly, an AI system built to hack political systems will rely on data that documents the workings of those systems, such as messages between constituents and legislators, floor speeches, chamber and committee voting results, contribution records, lobbying relationship disclosures, and drafts of and amendments to legislative text. The steady advancement towards the digitization and publication of this information that many jurisdictions have made is positive. The threat of AI hacking should not dampen or slow progress on transparency in public policymaking.
Okay, so what will help?
First, recognize that the true threats here are malicious human actors. Systems like ChatGPT and our still-hypothetical political-strategy AI are still far from artificial general intelligences. They do not think. They do not have free will. They are just tools directed by people, much like lobbyist for hire. And, like lobbyists, they will be available primarily to the richest individuals, groups, and their interests.
However, we can use the same tools that would be effective in controlling human political influence to curb AI hackers. These tools will be familiar to any follower of the last few decades of U.S. political history.
Campaign finance reforms such as contribution limits, particularly when applied to political action committees of all types as well as to candidate operated campaigns, can reduce the dependence of politicians on contributions from private interests. The unfair advantage of a malicious actor using AI lobbying tools is at least somewhat mitigated if a political target’s entire career is not already focused on cultivating a concentrated set of major donors.
Transparency also helps. We can expand mandatory disclosure of contributions and lobbying relationships, with provisions to prevent the obfuscation of the funding source. Self-interested advocacy should be transparently reported whether or not it was AI-assisted. Meanwhile, we should increase penalties for organizations that benefit from AI-assisted impersonation of constituents in political processes, and set a greater expectation of responsibility to avoid “unknowing” use of these tools on their behalf.
Our most important recommendation is less legal and more cultural. Rather than trying to make it harder for AI to participate in the political process, make it easier for humans to do so.
The best way to fight an AI that can lobby for moneyed interests is to help the little guy lobby for theirs. Promote inclusion and engagement in the political process so that organic constituent communications grow alongside the potential growth of AI-directed communications. Encourage direct contact that generates more-than-digital relationships between constituents and their representatives, which will be an enduring way to privilege human stakeholders. Provide paid leave to allow people to vote as well as to testify before their legislature and participate in local town meetings and other civic functions. Provide childcare and accessible facilities at civic functions so that more community members can participate.
The threat of AI hacking our democracy is legitimate and concerning, but its solutions are consistent with our democratic values. Many of the ideas above are good governance reforms already being pushed and fought over at the federal and state level.
We don’t need to reinvent our democracy to save it from AI. We just need to continue the work of building a just and equitable political system. Hopefully ChatGPT will give us all some impetus to do that work faster.
This essay was written with Nathan Sanders, and appeared on the Belfer Center blog.
According to internal Slack messages that were leaked to Insider, an Amazon lawyer told workers that they had “already seen instances” of text generated by ChatGPT that “closely” resembled internal company data.
This issue seems to have come to a head recently because Amazon staffers and other tech workers throughout the industry have begun using ChatGPT as a “coding assistant” of sorts to help them write or improve strings of code, the report notes.
“This is important because your inputs may be used as training data for a further iteration of ChatGPT,” the lawyer wrote in the Slack messages viewed by Insider, “and we wouldn’t want its output to include or resemble our confidential information.”
Created by the company OpenAI, ChatGPT is a chatbot that can automatically respond to written prompts in a manner that is sometimes eerily close to human.
But for all the consternation over the potential for humans to be replaced by machines in formats like poetry and sitcom scripts, a far greater threat looms: artificial intelligence replacing humans in the democratic processes—not through voting, but through lobbying.
ChatGPT could automatically compose comments submitted in regulatory processes. It could write letters to the editor for publication in local newspapers. It could comment on news articles, blog entries and social media posts millions of times every day. It could mimic the work that the Russian Internet Research Agency did in its attempt to influence our 2016 elections, but without the agency’s reported multimillion-dollar budget and hundreds of employees.
Automatically generated comments aren’t a new problem. For some time, we have struggled with bots, machines that automatically post content. Five years ago, at least a million automatically drafted comments were believed to have been submitted to the Federal Communications Commission regarding proposed regulations on net neutrality. In 2019, a Harvard undergraduate, as a test, used a text-generation program to submit 1,001 comments in response to a government request for public input on a Medicaid issue. Back then, submitting comments was just a game of overwhelming numbers.
Platforms have gotten better at removing “coordinated inauthentic behavior.” Facebook, for example, has been removing over a billion fake accounts a year. But such messages are just the beginning. Rather than flooding legislators’ inboxes with supportive emails, or dominating the Capitol switchboard with synthetic voice calls, an AI system with the sophistication of ChatGPT but trained on relevant data could selectively target key legislators and influencers to identify the weakest points in the policymaking system and ruthlessly exploit them through direct communication, public relations campaigns, horse trading or other points of leverage.
When we humans do these things, we call it lobbying. Successful agents in this sphere pair precision message writing with smart targeting strategies. Right now, the only thing stopping a ChatGPT-equipped lobbyist from executing something resembling a rhetorical drone warfare campaign is a lack of precision targeting. AI could provide techniques for that as well.
A system that can understand political networks, if paired with the textual-generation capabilities of ChatGPT, could identify the member of Congress with the most leverage over a particular policy area—say, corporate taxation or military spending. Like human lobbyists, such a system could target undecided representatives sitting on committees controlling the policy of interest and then focus resources on members of the majority party when a bill moves toward a floor vote.
Once individuals and strategies are identified, an AI chatbot like ChatGPT could craft written messages to be used in letters, comments—anywhere text is useful. Human lobbyists could also target those individuals directly. It’s the combination that’s important: Editorial and social media comments only get you so far, and knowing which legislators to target isn’t itself enough.
This ability to understand and target actors within a network would create a tool for AI hacking, exploiting vulnerabilities in social, economic and political systems with incredible speed and scope. Legislative systems would be a particular target, because the motive for attacking policymaking systems is so strong, because the data for training such systems is so widely available and because the use of AI may be so hard to detect—particularly if it is being used strategically to guide human actors.
The data necessary to train such strategic targeting systems will only grow with time. Open societies generally make their democratic processes a matter of public record, and most legislators are eager—at least, performatively so—to accept and respond to messages that appear to be from their constituents.
Maybe an AI system could uncover which members of Congress have significant sway over leadership but still have low enough public profiles that there is only modest competition for their attention. It could then pinpoint the SuperPAC or public interest group with the greatest impact on that legislator’s public positions. Perhaps it could even calibrate the size of donation needed to influence that organization or direct targeted online advertisements carrying a strategic message to its members. For each policy end, the right audience; and for each audience, the right message at the right time.
What makes the threat of AI-powered lobbyists greater than the threat already posed by the high-priced lobbying firms on K Street is their potential for acceleration. Human lobbyists rely on decades of experience to find strategic solutions to achieve a policy outcome. That expertise is limited, and therefore expensive.
AI could, theoretically, do the same thing much more quickly and cheaply. Speed out of the gate is a huge advantage in an ecosystem in which public opinion and media narratives can become entrenched quickly, as is being nimble enough to shift rapidly in response to chaotic world events.
Moreover, the flexibility of AI could help achieve influence across many policies and jurisdictions simultaneously. Imagine an AI-assisted lobbying firm that can attempt to place legislation in every single bill moving in the US Congress, or even across all state legislatures. Lobbying firms tend to work within one state only, because there are such complex variations in law, procedure and political structure. With AI assistance in navigating these variations, it may become easier to exert power across political boundaries.
Just as teachers will have to change how they give students exams and essay assignments in light of ChatGPT, governments will have to change how they relate to lobbyists.
To be sure, there may also be benefits to this technology in the democracy space; the biggest one is accessibility. Not everyone can afford an experienced lobbyist, but a software interface to an AI system could be made available to anyone. If we’re lucky, maybe this kind of strategy-generating AI could revitalize the democratization of democracy by giving this kind of lobbying power to the powerless.
However, the biggest and most powerful institutions will likely use any AI lobbying techniques most successfully. After all, executing the best lobbying strategy still requires insiders—people who can walk the halls of the legislature—and money. Lobbying isn’t just about giving the right message to the right person at the right time; it’s also about giving money to the right person at the right time. And while an AI chatbot can identify who should be on the receiving end of those campaign contributions, humans will, for the foreseeable future, need to supply the cash. So while it’s impossible to predict what a future filled with AI lobbyists will look like, it will probably make the already influential and powerful even more so.
This essay was written with Nathan Sanders, and previously appeared in the New York Times.
Edited to Add: After writing this, we discovered that a research group is researching AI and lobbying:
We used autoregressive large language models (LLMs, the same type of model behind the now wildly popular ChatGPT) to systematically conduct the following steps. (The full code is available at this GitHub link: https://github.com/JohnNay/llm-lobbyist.)
- Summarize official U.S. Congressional bill summaries that are too long to fit into the context window of the LLM so the LLM can conduct steps 2 and 3.
- Using either the original official bill summary (if it was not too long), or the summarized version:
- Assess whether the bill may be relevant to a company based on a company’s description in its SEC 10K filing.
- Provide an explanation for why the bill is relevant or not.
- Provide a confidence level to the overall answer.
- If the bill is deemed relevant to the company by the LLM, draft a letter to the sponsor of the bill arguing for changes to the proposed legislation.
Here is the paper.
EDITED TO ADD (9/12): Emily Bender has a critique of this essay.
With the release of ChatGPT, I’ve read many random articles about this or that threat from the technology. This paper is a good survey of the field: what the threats are, how we might detect machine-generated text, directions for future research. It’s a solid grounding amongst all of the hype.
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Abstract: Advances in natural language generation (NLG) have resulted in machine generated text that is increasingly difficult to distinguish from human authored text. Powerful open-source models are freely available, and user-friendly tools democratizing access to generative models are proliferating. The great potential of state-of-the-art NLG systems is tempered by the multitude of avenues for abuse. Detection of machine generated text is a key countermeasure for reducing abuse of NLG models, with significant technical challenges and numerous open problems. We provide a survey that includes both 1) an extensive analysis of threat models posed by contemporary NLG systems, and 2) the most complete review of machine generated text detection methods to date. This survey places machine generated text within its cybersecurity and social context, and provides strong guidance for future work addressing the most critical threat models, and ensuring detection systems themselves demonstrate trustworthiness through fairness, robustness, and accountability.
I don’t know how much of a thing this will end up being, but we are seeing ChatGPT-written malware in the wild.
…within a few weeks of ChatGPT going live, participants in cybercrime forums—some with little or no coding experience—were using it to write software and emails that could be used for espionage, ransomware, malicious spam, and other malicious tasks.
“It’s still too early to decide whether or not ChatGPT capabilities will become the new favorite tool for participants in the Dark Web,” company researchers wrote. “However, the cybercriminal community has already shown significant interest and are jumping into this latest trend to generate malicious code.”
Last month, one forum participant posted what they claimed was the first script they had written and credited the AI chatbot with providing a “nice [helping] hand to finish the script with a nice scope.”
The Python code combined various cryptographic functions, including code signing, encryption, and decryption. One part of the script generated a key using elliptic curve cryptography and the curve ed25519 for signing files. Another part used a hard-coded password to encrypt system files using the Blowfish and Twofish algorithms. A third used RSA keys and digital signatures, message signing, and the blake2 hash function to compare various files.
Check Point Research report.
ChatGPT-generated code isn’t that good, but it’s a start. And the technology will only get better. Where it matters here is that it gives less skilled hackers—script kiddies—new capabilities.
Sidebar photo of Bruce Schneier by Joe MacInnis.