Using AI for Political Polling

Public polling is a critical function of modern political campaigns and movements, but it isn’t what it once was. Recent US election cycles have produced copious postmortems explaining both the successes and the flaws of public polling. There are two main reasons polling fails.

First, nonresponse has skyrocketed. It’s radically harder to reach people than it used to be. Few people fill out surveys that come in the mail anymore. Few people answer their phone when a stranger calls. Pew Research reported that 36% of the people they called in 1997 would talk to them, but only 6% by 2018. Pollsters worldwide have faced similar challenges.

Second, people don’t always tell pollsters what they really think. Some hide their true thoughts because they are embarrassed about them. Others behave as partisans, telling the pollster what they think their party wants them to say—or what they know the other party doesn’t want to hear.

Despite these frailties, obsessive interest in polling nonetheless consumes our politics. Headlines are more likely to tout the latest changes in polling numbers than the policy issues at stake in the campaign. This is a tragedy for a democracy. We should treat elections like choices that have consequences for our lives and well-being, not contests to decide who gets which cushy job.

Polling Machines?

AI could change polling. It offers the ability to instantaneously survey and summarize the expressed opinions of individuals and groups across the web, understand trends by demographic, and extrapolate to new circumstances and policy issues on par with human experts. The politicians of the (near) future won’t anxiously pester their pollsters for the results of a survey fielded last week: they’ll just ask a chatbot what people think. This will supercharge our access to real-time, granular information about public opinion, but it might also exacerbate concerns about the quality of that information.

We know it sounds impossible, but stick with us.

Large language models, the AI foundations behind tools like ChatGPT, are built on top of huge corpora of data culled from the Internet. These models are trained to recapitulate what millions of real people have written in response to endless topics, contexts, and scenarios. For a decade or more, campaigns have trawled social media, looking for hints and glimmers of how people are reacting to the latest political news. This makes asking questions of an AI chatbot similar in spirit to doing analytics on social media, except that these models are generative: you can ask them new questions that no one has ever posted about before, you can generate more data from populations too small to measure robustly, and you can immediately ask clarifying questions of your simulated constituents to better understand their reasoning.

Researchers and firms are already using LLMs to simulate polling results. Current techniques are based on the ideas of AI agents. An AI agent is an instance of an AI model that has been conditioned to behave in a certain way. For example, it may be primed to respond as if it is a person with certain demographic characteristics and can access news articles from certain outlets. Researchers have set up populations of thousands of AI agents that respond as if they are individual members of a survey population, like humans on a panel who are called periodically to answer questions.
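To make this concrete, here is a minimal sketch in Python of how such persona conditioning works: each agent is just a demographic profile rendered into a system prompt and replayed for every survey question. The `query_llm` function is a placeholder standing in for any real chat-model API; the `Persona` fields and prompt wording are our own illustrative assumptions, not a description of any specific research system.

```python
# Sketch: a synthetic survey panel of persona-conditioned AI agents.
# query_llm is a placeholder for a real chat-model API call.

from dataclasses import dataclass

@dataclass
class Persona:
    age: int
    gender: str
    state: str
    party: str

    def system_prompt(self) -> str:
        # The persona is "conditioning": the model answers in character.
        return (
            f"Answer as a {self.age}-year-old {self.gender} "
            f"{self.party} voter from {self.state}. "
            "Reply with a single word: Support or Oppose."
        )

def query_llm(system: str, question: str) -> str:
    # Placeholder: a real implementation would send `system` as the
    # system prompt and `question` as the user turn to a chat model.
    return "Support"

def poll_panel(panel: list[Persona], question: str) -> dict[str, int]:
    """'Call' every agent -- unlike humans, all of them answer."""
    tally: dict[str, int] = {}
    for persona in panel:
        answer = query_llm(persona.system_prompt(), question)
        tally[answer] = tally.get(answer, 0) + 1
    return tally

panel = [
    Persona(67, "male", "Illinois", "Republican"),
    Persona(34, "female", "California", "Democratic"),
]
print(poll_panel(panel, "Do you support policy A-1?"))
```

Narrowing the query to, say, married male voters of retirement age in rural Illinois is then just a matter of constructing more specific `Persona` objects.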

The big difference between humans and AI agents is that the AI agents always pick up the phone, so to speak, no matter how many times you contact them. A political candidate or strategist can ask an AI agent whether voters will support them if they take position A versus B, or tweaks of those options, like policy A-1 versus A-2. They can ask that question of male voters versus female voters. They can further limit the query to married male voters of retirement age in rural districts of Illinois without college degrees who lost a job during the last recession; the AI will integrate as much context as you ask.

What’s so powerful about this system is that it can generalize to new scenarios and survey topics, and spit out a plausible answer, even if its accuracy is not guaranteed. In many cases, it will anticipate those responses at least as well as a human political expert. And if the results don’t make sense, the human can immediately prompt the AI with a dozen follow-up questions.

Making AI agents better polling subjects

When we ran our own experiments with this kind of AI use case, using the earliest versions of the model behind ChatGPT (GPT-3.5), we found that it did a fairly good job of replicating human survey responses. The ChatGPT agents tended to match the responses of their human counterparts fairly well across a variety of survey questions, such as support for abortion and approval of the US Supreme Court. The AI polling results had average responses, and distributions across demographic properties such as age and gender, similar to those of real human survey panels.
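One simple way to quantify how closely an AI panel tracks a human panel is to compare their response-share distributions, for example with total variation distance. The sketch below illustrates the idea; the numbers are invented for illustration, not taken from our experiments.

```python
# Sketch: comparing an AI panel's response distribution to a human
# panel's, using total variation distance (0 = identical, 1 = disjoint).

def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    options = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in options)

# Illustrative (made-up) response shares for one survey question.
human = {"approve": 0.42, "disapprove": 0.49, "unsure": 0.09}
ai    = {"approve": 0.45, "disapprove": 0.47, "unsure": 0.08}

print(round(total_variation(human, ai), 3))  # 0.03
```

Repeating this per demographic slice (age bracket, gender, and so on) gives a direct check of whether the synthetic panel mirrors the real one where it matters.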

Our major systemic failure happened on a question about US intervention in the Ukraine war. In our experiments, the AI agents conditioned to be liberal were predominantly opposed to US intervention in Ukraine and likened it to the Iraq war. Conservative AI agents gave hawkish responses supportive of US intervention. This is pretty much what most political experts would have expected of the political equilibrium in US foreign policy at the start of the decade, but it was exactly wrong in the politics of today.

This mistake has everything to do with timing. The humans were asked the question after Russia’s full-scale invasion in 2022, whereas the AI model was trained using data that only covered events through September 2021. The AI got it wrong because it didn’t know how the politics had changed. The model lacked sufficient context on crucially relevant recent events.

We believe AI agents can overcome these shortcomings. While AI models are dependent on the data they are trained with, and all the limitations inherent in that, what makes AI agents special is that they can automatically source and incorporate new data at the time they are asked a question. AI models can update the context in which they generate opinions by learning from the same sources that humans do. Each AI agent in a simulated panel can be exposed to the same social and media news sources as humans from that same demographic before they respond to a survey question. This works because AI agents can follow multi-step processes, such as reading a question, querying a defined database of information (such as Google, or the New York Times, or Fox News, or Reddit), and then answering a question.
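The multi-step process above can be sketched as a "retrieve, then respond" loop. In this sketch, `fetch_recent_news` and `query_llm` are placeholders for a real search/news API and a real chat-model call; the prompt layout is our own illustrative assumption.

```python
# Sketch: a multi-step agent loop -- read the question, query the
# demographic's likely news sources, then answer with that context.
# fetch_recent_news and query_llm are placeholders for real APIs.

def fetch_recent_news(source: str, topic: str) -> list[str]:
    # Placeholder: a real agent would hit a news API or search index.
    return [f"[{source}] recent headline about {topic}"]

def query_llm(prompt: str) -> str:
    # Placeholder for a chat-model call.
    return "Oppose"

def answer_with_context(sources: list[str], topic: str, question: str) -> str:
    # Step 1: gather the coverage this persona's echo chamber would see.
    articles: list[str] = []
    for source in sources:
        articles.extend(fetch_recent_news(source, topic))
    # Step 2: prepend that context to the survey question, so stale
    # training data is supplemented with what has happened since.
    prompt = (
        "Recent coverage:\n" + "\n".join(articles)
        + "\nQuestion: " + question
    )
    return query_llm(prompt)

answer = answer_with_context(
    ["Fox News", "Reddit"], "Ukraine", "Do you support US intervention?"
)
```

Had a loop like this been in place in our Ukraine experiment, the agents would have answered after reading post-invasion coverage rather than from 2021-era training data alone.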

In this way, AI polling tools can simulate exposing their synthetic survey panel to whatever news is most relevant to a topic and likely to emerge in each AI agent’s own echo chamber. And they can query for other relevant contextual information, such as demographic trends and historical data. Like human pollsters, they can try to refine their expectations on the basis of factors like how expensive homes are in a respondent’s neighborhood, or how many people in that district turned out to vote last cycle.

Likely use cases for AI polling

AI polling will be irresistible to campaigns, and to the media. But research is already revealing when and where this tool will fail. AI polling will always have limitations in accuracy, but that makes it similar to, not different from, traditional polling. Today’s pollsters are challenged to reach sample sizes large enough to measure statistically significant differences between similar populations, and the issues of nonresponse and inauthentic response can make them systematically wrong. Yet for all those shortcomings, both traditional and AI-based polls will still be useful. For all the hand-wringing and consternation over the accuracy of US political polling, national issue surveys still tend to be accurate to within a few percentage points. If you’re running for a town council seat or in a neck-and-neck national election, or just trying to make the right policy decision within a local government, you might care a lot about those small and localized differences. But if you’re looking to track directional changes over time, or differences between demographic groups, or to uncover insights about who responds best to what message, then these imperfect signals are sufficient to help campaigns and policymakers.

Where AI will work best is as an augmentation of more traditional human polls. Over time, AI tools will get better at anticipating human responses, and also at knowing when they will be most wrong or uncertain. They will recognize which issues and human communities are in the most flux, where the model’s training data is liable to steer it in the wrong direction. In those cases, AI models can send up a white flag and indicate that they need to engage human respondents to calibrate to real people’s perspectives. The AI agents can even be programmed to automate this. They can use existing survey tools—with all their limitations and latency—to query for authentic human responses when they need them.

This kind of human-AI polling chimera lands us, funnily enough, not too distant from where survey research is today. Decades of social science research have led to substantial innovations in statistical methodologies for analyzing survey data. Current polling methods already rely on substantial statistical modeling to project the properties of a general population from sparse survey samples. Today, humans fill out the surveys and computers fill in the gaps. In the future, it will be the opposite: AI will fill out the surveys and, when the AI isn’t sure what box to check, humans will fill in the gaps. So if you’re not comfortable with the idea that political leaders will turn to a machine to get intelligence about which candidates and policies you want, then you should have about as many misgivings about the present as you will about the future.

And while the AI results could improve quickly, they probably won’t be seen as credible for some time. Directly asking people what they think feels more reliable than asking a computer what people think. We expect these AI-assisted polls will be initially used internally by campaigns, with news organizations relying on more traditional techniques. It will take a major election where AI is right and humans are wrong to change that.

This essay was written with Aaron Berger, Eric Gong, and Nathan Sanders, and previously appeared on the Harvard Kennedy School Ash Center’s website.

Posted on June 12, 2024 at 7:02 AM

Comments

GregW June 12, 2024 7:22 AM

It’s thought provoking in a way but there seems to be no explicit recognition of the garbage-in-garbage-out problem from the essay’s authors, much less a strategy or hope to deal with it.

How good can AI polling be when the same people who commission private polls for politicians, and know exactly how they work, are also incented to push false data on the internet to manipulate not just people’s opinions but now their own or their opponent’s polls? There is yet another arms race here.

LLM AI polling will only be as effective as its ability to detect bots. If AI passes the Turing test I’m not sure how we ensure that.

pedant June 12, 2024 9:15 AM

@moderator

It seems there may be a missing emphasis closing tag somewhere in the top level of the site.

Rj June 12, 2024 10:37 AM

It is incredibly ironic that an article proposing to use LLMs to take the place of polling was preceded yesterday by an article on LLMs becoming deliberately deceptive. So why should we trust an LLM to tell us what people think? This is absurd!

Giulio June 12, 2024 11:16 AM

“ It will take a major election where AI is right and humans are wrong to change that.”

This can’t be right. A broken clock is accurate twice a day, extrapolating accuracy from a single favorable observation, etc.

Not really anonymous June 12, 2024 11:19 AM

I don’t want politicians who do polling to decide what their positions are. If they do that, I can’t trust them to keep their positions after being elected. I want politicians who actually have opinions that they will mostly keep over time.

mark June 12, 2024 12:29 PM

There are multiple problems with polling, and AI won’t help.

For one, I don’t know about you, but I’ve never been called on my cellphone for a poll. Landline, yes. Cell, no.

For another, the polls are overwhelmingly biased, based on who is paying for the poll. (“Check which of these is your top priority” is a standard one.)

Finally, if your opinion is not in the top third of the bell curve, they have no idea how to record it, and drop you from the results.

Jimbo June 12, 2024 12:39 PM

Polling should be stopped. Its real purpose is to influence voters to either jump on the bandwagon (go with the projected winner) or not even vote (if the polls go the wrong way).

What Price common sense? June 12, 2024 2:19 PM

@Giulio
@ALL

“A broken clock is accurate twice a day, extrapolating accuracy from a single favorable observation”

Can I scream please?

For some reason this broken metaphor comes up ‘oh so often’ even here.

The reason it’s broken is that the Earth’s rotation about its axis, and its orbit around the Sun, are not exactly stable in the short term, with the Moon pulling the Earth’s axis in a small ellipse. The Earth’s axis is also tilted with respect to its orbital plane around the Sun. So as a consequence the time at which the sun crosses the meridian (high noon, not civil noon) can vary by around 15 minutes either way, and nearly 20 minutes in some places, but not in an easily regular way (it’s more than a two-body problem, with the Sun and Earth getting shifted by the major planets etc.). But over the longer term the Earth is only “fairly predictable,” as can be seen by “leap second adjustments” (though we are due a negative leap second, which could be fun if people have not written software to allow for it).

Have a look at the graph of the “Equation of Time”

https://www.timeanddate.com/astronomy/equation-of-time.html

Fun question for ICTsec people how many “independent clocks” should you allow for on a computer and why?

lurker June 12, 2024 4:03 PM

@Rj

Maybe not ironic, if this blog is written by an LLM that has no way to know or care whether it maintains a consistent storyline.

@Bruce

“First, nonresponse has skyrocketed.”
Luckily we don’t even need to plead the Fifth to reject unwanted, intrusive polling.

“Second, people don’t always tell pollsters what they really think”
Why should they? Truth and politics are not synonyms.

“AI can offer the ability to instantaneously survey and summarize the expressed opinions of individuals and groups across the web.”
Woah, stop right there. Since when did the web become the place to find expressed opinions? Opinions are still expressed lots of other places besides the web. And this is the problem with polling, it’s an inexact science, always has been, and always will be. Until maybe polling is done via Mr. Musk’s brain implant, when it is compulsory for us all to have one.

LASSEN June 12, 2024 5:36 PM

Nope to AI Polling.

“Polling” validity rests entirely on the science of ‘Statistical Sampling’, which most people do not understand — including the author of this AI Polling article.

Valid sampling REQUIRES an accurate RANDOM SAMPLE of the actual ‘population’ of interest.
(widgets coming off an assembly line, or all potential voters in a future election)

One can NOT accurately use a non-random or “AI Modeled” sample to ‘generalize’ the sample data to the larger ‘population’ of actual interest !

What Price common sense? June 12, 2024 6:56 PM

@LASSEN

“validity rests entirely on the science of ‘Statistical Sampling’”

You forgot to mention that with

“One can NOT accurately use a non-random”

There are two types of non-random in basic polling.

  1. Due to poll agency selecting those to be polled.
  2. Due to Self-selection by those willing to be polled.

They kind of form a downward spiral.

Let’s say you have a brown party and a yellow party.

Those of the brown view are found to be more willing to be polled as they volunteer their details to the agency.

Thus the agency wishing to get higher returns tends to call people in traditional brown voter areas because they have bigger lists of names there.

Thus you get quite a brown bias, even if they are the minority compared to the number of yellow voters.

You get to see this sort of self selection bias pop up all over the place.

In medical research, for instance, say you use blood from blood donors to test for societal markers/trends. In many places blood donors are unpaid and give up an hour or two of their personal time.

This means that, for whatever reason, they have a “charitable bias” which probably reflects back on their lifestyle and socio-economic status. Thus they are likely to be both physically fitter and have healthier lifestyles with regard to food/nutrition, especially micronutrition such as vitamins and minerals. Which in turn tends to give higher-quality blood than the general population, thus distorting any test findings.

ratwithahat June 12, 2024 8:27 PM

Does this remind anyone of the AI-enabled mass-spying posts earlier this year? Wonder how Bruce feels about contributing to the surveillance state now. (joke)

@What Price

Isn’t there also a kind of self-selection bias on the web? As we all know, the loudest opinions don’t represent what people actually think. People who are mad about policies post more, thus causing a bias in favor of naysayers, even if the people who like a policy are in the majority.

I suppose this makes AI and human polling similar enough — I would say the major benefits are increased convenience and lower labor costs.

@GregW

On your point about politicians & their campaigns deciding “to push false data on the internet to manipulate not just peoples’ opinions but now their or their opponent’s polls,” I don’t think there’s that much of a distinction.

Manipulating people’s opinions inherently alters polls, but I guess this was in reference to fake profiles mimicking the targets of the polls? Which were already prominent prior (ex: Russia & US election). Just not sure how this is anything new.

echo June 12, 2024 10:10 PM

The run of non-technical topics (or technology intruding on soft sciences and jurisprudence) these past few months has been a disaster, including this one. I can’t endorse any of them.

Roedor con sombrero June 12, 2024 11:04 PM

@ ratwithahat,

Isn’t there also a kind of self-selection bias on the web? As we all know, the loudest opinions don’t represent what people actually think.

People who are mad about policies post more, thus causing a bias in favor of naysayers, even if the people who like a policy are in the majority.

Remember “mad about” applies even more so to the shills / mouthpieces.

But also how prescient of you…

The run of non-technical topics (or technology intruding on soft sciences and jurisprudence) these past few months has been a disaster, including this one. I can’t endorse any of them.

Ratón en casa June 13, 2024 12:14 AM

@ echo,
@ ALL,

You say,

I can’t endorse any of them.

Nobody asked you to, nor would they with common sense prevailing.

In fact I suspect most would rather you kept your airy-fairy arm-wavery inconsequential and frequently wrong opinions to yourself,

La inteligencia de la habitación aumenta en tu ausencia

Winter June 13, 2024 10:18 AM

@Man Hiding Behind a Thousand Names
Re: ranting, moaning, and insulting other commenters

Two wrongs do not make a right. Nothing another commenter did, in your eyes, justifies any wrongdoing from your side.

Ratón de casa June 13, 2024 1:02 PM

@Winter / @echo

“Two wrongs do not make a right.”

And one wrong is not right either, something you really should consider with your incessant instigating, stalking and bullying ways.

“Nothing another commenter did, in your eyes, justifies any wrongdoing from your side.”

But for you any opportunity no matter how improbable is an excuse for your instigating, stalking and bullying ways.

So why do you do it?

You are most certainly not privileged nor exceptional or actually gifted in any way.

So why do you think you should be allowed to behave the way you do?

Further, why should others not reciprocate in their own defence against your attacks?

Your idiotic gumption with lack of reasoning or even common sense definitely shows that at best you are narcissistic and, well, lacking credibility, empathy or even basic social skills.

But don’t let me stop you demonstrating this to all who can read your words now and later in the archives.

cmeier June 13, 2024 9:39 PM

LLMs take a massive series of tokens, devoid of context, run them through a barrel full of equations, and modify the parameters of the equations until the set of parameters is best able to guess the value of the next token. But there is no way to test those individual parameters to see if a given parameter of a given equation in a given node in the system has any statistically meaningful effect on the outcome of the predicted next token.

And yet the author thinks that throwing more tokens at an LLM will allow pollsters to make statistically meaningful statements about the beliefs of the populations that generated those tokens? Do I have that right?

Lurker2 June 14, 2024 12:12 AM

What will be the energy cost for this?
Worries about the climate impact arise, if I think of an emerging AI agent world with services available to the wide public + usage by the public.

Winter June 14, 2024 1:17 AM

@cmeier

LLMs take a massive series of tokens, devoid of context,

I do not understand what you mean. The whole point of LLMs is that they do use tokens in context.

is best able to guess the value of the next token.

That is the close test. A very powerful test of human grammar and semantic comprehension used in psycholinguistics. It is not the trivial thing people take it for.

What Price common sense? June 14, 2024 2:00 AM

@Winter

“That is the close test. A very powerful test of human grammar and semantic comprehension used in psycholinguistics. It is not the trivial thing people take it for.”

Do you mean “close test” or “cloze test”? They sound the same but are not.

The “close test” is a difference function. The “cloze test” is a “masked word” test.

That is, the “close test” gives you a measure of the difference between two vectors: that of the input vector and that of the comparator vector.

It cannot tell you if the input vector is right or wrong, just how unlike your choice of comparator vector it is.

If the choice of comparator vector is defective then the difference measure is as defective if not worse.

There is a well known saying of,

“You have to compare apples with apples”

Oversimplifying, many LLMs in effect

“Compare apples with pineapples”

Or

“Compare apples with pommes de terre”

Which gives defective results.

Winter June 14, 2024 2:44 AM

Do you mean “close test” or “cloze test” they sound the same vut are not.

Cloze test. Autocorrect hit again.

Originally, LLMs were trained on a masked word test, but that was inefficient to train. A “predict next word” version of the cloze test worked much better.

