Building an Online Lie Detector

There’s an interesting project to detect false rumors on the Internet.

The EU-funded project aims to classify online rumours into four types: speculation—such as whether interest rates might rise; controversy—as over the MMR vaccine; misinformation, where something untrue is spread unwittingly; and disinformation, where it’s done with malicious intent.

The system will also automatically categorise sources to assess their authority, such as news outlets, individual journalists, experts, potential eye witnesses, members of the public or automated ‘bots’. It will also look for a history and background, to help spot where Twitter accounts have been created purely to spread false information.

It will search for sources that corroborate or deny the information, and plot how the conversations on social networks evolve, using all of this information to assess whether it is true or false. The results will be displayed to the user in a visual dashboard, to enable them to easily see whether a rumour is taking hold.

I have no idea how well it will work, or even whether it will work, but I like research in this direction. Of the three primary Internet mechanisms for social control, surveillance and censorship have received a lot more attention than propaganda. Anything that can potentially detect propaganda is a good thing.

Three news articles.

Posted on February 21, 2014 at 8:34 AM • 28 Comments

Comments

s00pern00b February 21, 2014 9:54 AM

Detecting propaganda is trivial: when the other side tells the truth, that’s propaganda. When our side tells the truth or lies, that is the truth. Oh, and yeah, this system will of course only be used to detect evil propaganda from the other side of the fence.

Roxanne February 21, 2014 10:52 AM

We were taught to detect propaganda (and news slant) by analyzing the adjectives, the descriptors. You take a sentence like, “The dog bit the girl,” turn it into, “The nice, fluffy dog nevertheless bit the mean, whiny girl,” or “That mean, vicious dog bit that sweet little girl.” Pick one.

If you want the news, strip out the adjectives and the adverbs. If you want the slant, leave them in. You don’t really need a computer program to do this.
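The stripping step described above can be sketched in a few lines of Python. This is only a toy illustration: `MODIFIERS` is a hypothetical hand-picked word list covering the two example sentences, and `strip_modifiers` is a made-up helper name. A real tool would need a part-of-speech tagger rather than a fixed list.

```python
import re

# Hypothetical, hand-picked list of adjectives/adverbs from the example
# sentences only; an assumption for illustration, not a general lexicon.
MODIFIERS = {
    "nice", "fluffy", "nevertheless", "mean",
    "whiny", "vicious", "sweet", "little",
}

def strip_modifiers(sentence: str) -> str:
    """Split into words (dropping punctuation), remove listed modifiers, rejoin."""
    words = re.findall(r"[A-Za-z']+", sentence)
    kept = [w for w in words if w.lower() not in MODIFIERS]
    return " ".join(kept) + "."

print(strip_modifiers("The nice, fluffy dog nevertheless bit the mean, whiny girl."))
print(strip_modifiers("That mean, vicious dog bit that sweet little girl."))
```

Both slanted versions reduce to the same bare statement that a dog bit a girl, which is the point of the exercise.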

dragonfrog February 21, 2014 10:54 AM

Interesting categorization of the MMR vaccine foofarraw. Raises an interesting point:

At what point does a successful campaign of disinformation produce enough useful idiots that the whole business is no longer distinguishable as disinformation and its subsequent well-meaning propagation as misinformation, but a “controversy”?

Please, please, please let’s not have this digress into vaccine arguing. Please mentally take it as given that I and all subsequent commenters share your opinion on who are the dis/mis-informers and who is expressing the truth of the matter, and on no account make your specific opinion explicitly known.

NobodySpecial February 21, 2014 11:02 AM

@dragonfrog – deep data could usefully provide facts, e.g. that the study is being funded by a drug company that makes a competing vaccine, or that the financial journalist reporting on a certain stock has just bought/sold a large amount of shares.

Normally these things only show up in Private Eye much later; it’s hard to do deep investigative journalism on a 24-hour news channel.

dragonfrog February 21, 2014 11:02 AM

Roxanne – Looking for the adjectives helps, but is by no means sufficient.

Simply leaving out certain facts is an important technique.

Putting in unsourced or uncorroborated allegations in the part of the article where the background facts should go is another.

False balance – seeking out a nutbar who disagrees with the expert on every topic, no matter how well established its truth – seems to be a recent development.

Mentioning irrelevant facts with which many readers will make associations of judgment is another.

e.g. – one person in a story “has a display of crystals on her bookshelf” – suggests this is a flaky hippy. Don’t mention or bother to find out that she is a geologist specializing in these crystalline formations, and you’re set. No adjectives deployed.

Stefan February 21, 2014 11:05 AM

@Roxanne, not all languages have adjectives (or only a very limited set), but I’m sure that there too, speculation, controversy, untruth and disinformation exist. The problem is a far more interesting one when looked at more closely and across different languages.

NoWay February 21, 2014 11:07 AM

What an evil idea. Obviously the EU is trying to legislate censorship of facts and opinions that its corporate masters don’t like. Now I can really see why the EU was a bad idea.

Mirar February 21, 2014 11:10 AM

Interesting.

I wonder if the EU can also create a project to mark all information that the legislators (EU Commission and EU Parliament) receive in the same way?

They are highly susceptible to lobbying and propaganda, and few of them use the internet.

The Sage February 21, 2014 12:31 PM

Knowing the EU’s general modus operandi, including funding lobbying groups to lobby for the things it already wants to do, I can see no good coming out of this.

paul February 21, 2014 12:32 PM

Although a project like this is really cool, if it’s officially sanctioned it might be hard to trust its outputs.

vas pup February 21, 2014 12:37 PM

@Roxanne. Good point. Adjectives are intended to put you in an emotional state, switching you out of logical thinking and analysis. More adjectives, less truth. But not everything is OK with logical matters either, due to logical fallacies (I’ll call them misleading bifurcations in thinking; you can easily search for a list of them on the web). E.g. “We don’t want to change anything because we have done it this way for decades” is offered as proof that it is right, but if you do something wrong 1000 times, that still does not make it right. I’d say that adjectives work OK on the general population, and subtle fallacies on intellectuals (primarily those in the humanities, having no knowledge of mathematical logic).

Anura February 21, 2014 12:57 PM

@paul

If it’s closed source, that’s true, but if the whole project is completely open from research to methodology to source code, then it’s a lot more trustworthy. That’s not to say that you can’t manipulate something that’s open, but it’s a lot more difficult.

Clive Robinson February 21, 2014 1:13 PM

The system cannot work reliably, and this has been known in statecraft for as long as historians can find evidence.

As noted above by @NobodySpecial, you need what is in effect “Perfect Knowledge”, much of which does not come out until well past the time it is of use. Usually a government has secrecy laws and 30/100-year rules, with many unfortunate accidents before those time limits arise. Occasionally, however, somebody “whistle blows” in a way that cannot be ignored, and briefly some knowledge escapes “from under the carpet” and sufficient noise is made such that enough of the population knows it’s been duped.

Commercial organisations have duties of confidentiality, little reason to keep the majority of “knowledge” about their activities and few if any laws of disclosure to make them release what little they do keep.

It is this “commercial confidentiality” that is one of the primary drivers for “governmental outsource”, another being that of employing “imperfect knowledge” as an excuse to hide behind if and when somebody whistle blows.

There is a saying about “you are what you eat” which applies just as well to the ‘news you consume’, which also has its own enigmatic saw of “you are the sum of your experiences”. Thus anyone who controls the media can, if they so wish, control your life experiences, a problem that is sufficiently well known historically that there are laws limiting the quantity and type of media outlets any one recognisable group or entity controls.

And thereby hangs another problem of how to recognise a group: they don’t have to work for the same organisation or group of organisations, they just have to be categorised and then fed selective information.

This is what PR / image consultants and all sorts of firms specialise in and as we know from the writings of amongst others George Orwell, Niccolo Machiavelli, Joseph Goebbels and Sefton Delmer it can sway mankind to be almost anything a good “puppet master” requires.

But as I’ve noted before, you can lie whilst telling only the truth, by amongst other things bending people’s perception.

One such trick is the order you tell “truths” in, because humans have a limited capacity to take in and comprehend information. One of the skills people get taught about “report writing” is that “they only have time to read the introduction and conclusions”…

Thus I feel certain that if anyone tries to automate the process of “defrocking image builders”, the image builders will “black box” test it and use the appropriate pre-distortion to get through the system.

As was once observed, “There is a reason that religion and marketing are the two biggest industries in the world, ‘we want to believe’”.

Amanda Hugnkiss February 21, 2014 1:15 PM

“Propaganda is not a matter for average minds, but rather a matter for practitioners. It is not supposed to be lovely or theoretically correct. I do not care if I give wonderful, aesthetically elegant speeches, or speak so that women cry. The point of a political speech is to persuade people of what we think right. I speak differently in the provinces than I do in Berlin, and when I speak in Bayreuth, I say different things than I say in the Pharus Hall. That is a matter of practice, not of theory… Propaganda should be popular, not intellectually pleasing. It is not the task of propaganda to discover intellectual truths. Those are found in other circumstances, I find them when thinking at my desk, but not in the meeting hall.”

Ben February 21, 2014 2:04 PM

These days it seems every state wants to be a mini-NSA, after being humiliated publicly by Snowden. What’s unclear is who the beneficiaries of the tool will be. “This will allow journalists, governments, emergency services, health agencies and the private sector to respond more effectively to claims on social media”. An open source technological gift to the world or a new set of technologies for state oppression, regulation and control?

In the article, the 2011 London riots are mentioned several times; they were traumatic for the powers-that-be. If only social media had been shut down. Since this would look too bad, isn’t there another way to automatically detect and suppress revolt? Log everything along the way? Respond with your own propaganda? After all, the objective of the tool is to “respond more effectively”. It’s a vital investment for the years ahead, with social unrest propagating throughout Europe, and economic and political conditions that will yet worsen.

Dic Donohue February 21, 2014 2:27 PM

Acceptance test: type in “Dzhokhar Tsarnaev perpetrated the Boston Marathon bombings.” Run for cover when the box starts smoking.

vas pup February 21, 2014 3:39 PM

@Ben: “Since this would look too bad, isn’t there another way to automatically detect and suppress revolt?” It is important for security reasons to automatically detect the moment when a peaceful protest demonstration is turning (without provocation by law enforcement) into a mob/riot with aggressive/violent actions towards innocent people and destruction of property. Take a look around the globe and you will see fresh examples in Europe, Asia and Latin America. As I recall, there was some research in which a kind of vortex of people was clearly observed from a helicopter in a separate part of a peaceful gathering, as an ‘aura’ of a violent outbreak.

Braaains February 21, 2014 4:59 PM

@Ben: “Since this would look too bad, isn’t there another way to automatically detect and suppress revolt?”

Revolt, or any sort of undesired social change, can be suppressed by identifying thought leaders and then “switching them off” psychologically through the use of zersetzung.

This approach is somewhat analogous to the neutron bomb. With the neutron bomb, people are killed but the buildings are left standing. With zersetzung, targets are left alive, but their undesired behavior is eliminated, invisibly and often even without the knowledge of the target.

Perhaps the zombie as pop culture icon is symptomatic of the widespread use of zersetzung. We can feel a resonance there that we can’t quite put our fingers on.

Alan February 21, 2014 11:57 PM

I can see there are some commenters here worried that this tool will diminish their credibility…

Regardless of whether this is successful or not, it will be interesting to see what characteristics their algorithm correlates with lying. Does spelling correlate with truthfulness? Are exclamation points warning signs? Are there some topics so full of disinformation that any story about them is more likely to be a lie than the truth?

Giorgio Ganis February 22, 2014 5:28 AM

Interesting project. I tend to be skeptical that a system with such a broad scope can be built, though it may be useful to detect deception in situations where information is self-contradictory or contradicts other established facts. Often, whether a story is true or not depends entirely on verification of facts on the ground. In these cases, the problem is the same one has when deciding whether one or more sources are telling the truth or not based on their written statements (and the accuracy of text analysis methods for detecting lies is poor).
Even if such a system could be built and used to assign a truth probability score to stories based on social network dynamics etc, who would trust it? In principle, governments could always find ways of manipulating the score to make inconvenient stories look like propaganda, and vice versa. Or they could simply set up their own veracity scoring agency (pretending to be independent).

ED February 22, 2014 1:35 PM

Shouldn’t we start by criminalizing lies in media and in politics? Or by educating new generations to be truthful and honest? Otherwise this is just another mass control indicator.

p.s. a similar project exists: the EMM (European Media Monitoring), a news-normalizing Internet spider which has already found practical applications.

skeptical February 23, 2014 10:48 AM

“speculation — such as whether interest rates might rise”

Do we really need an algorithm to identify speculation? How would a computer know what is speculative without access to whatever statistical data is required to distinguish between speculation and legitimate forecasting?

“controversy — as over the MMR vaccine”

Do we need an algorithm to identify controversy? Do we care at all about why it’s controversial and whether or not there’s any good reason for it? Assuming the idea is to identify controversy at an early stage, I can see it being used more to better manipulate the public than to better address the facts behind a controversy.

“misinformation, where something untrue is spread unwittingly”

Such as, say, information suggesting a link between the MMR vaccine and autism? Unless that’s meant to be merely “controversial”?

“and disinformation, where it’s done with malicious intent”

First you would need to know the source of the information, which is often not publicly available and difficult to deduce. Then you would need to know the source’s “secret agenda”, for otherwise how would you distinguish between “malicious intent” and self deception?

Skeptical February 23, 2014 3:48 PM

I may need to come up with a more unusual pseudonym!

@skeptical: From the program description, it would typically be used for running down a statement propagating through social networks. If the source is an anonymous commenter saying something like “I heard…” then the statement could quickly be classified as a rumor.

If however the source is found to be someone’s Facebook page with telling video/pictures, or a reliable journalist, etc., then the statement can be shifted into the likely fact realm.

This actually made me think of an xkcd comic.

Such a program might be useful in breaking those loops.

keith A February 24, 2014 3:41 AM

At first thought this looks like it should be used for debunking, BUT the final line hints at a more dubious use.

“a visual dashboard, to enable them to easily see whether a rumour is taking hold.”

Only 2 groups would need to use that tool.
1) Those that make and spread rumours (politicians, businesses, advertisers, media, scammers, etc)
2) Those that feed on the rumours (media / scammers, etc)

Autolykos February 24, 2014 7:23 AM

@Roxanne, vas pup:
I found that the best way to become mostly immune to spin is to stop thinking in words (most mathematicians and physicists will eventually learn how to do it, but it’s hard to explain or teach explicitly). Facts and coherent argumentation survive the transition almost intact, while spin/slant and logical fallacies can’t be “translated” because they rely on language and/or emotion to work.
An added bonus is that problems in the “translation” will serve as a big, red warning sign even without you consciously looking for propaganda or errors.
But just watching for anything that tries to induce emotions (yep, adjectives are the worst offenders) is pretty good already. People who want you to get emotional usually do it to prevent you from thinking – which is a reliable indicator that they’re up to no good.

Be aware that this method still fails with outright misinformation and lies. Those can’t be detected from language alone; you need facts to check them against. Becoming good at quickly solving Fermi problems in your head goes a long way there.

Autolykos February 24, 2014 7:49 AM

@Braaains:
Zersetzung works reasonably well on individuals, but it is likely to fail or even backfire if you’re not actually fighting an individual but an idea (a concept professional politicians find hard to understand, I’m sure).
Or to quote the Hacker Manifesto: “You may stop this individual, but you can’t stop us all. After all, we’re all alike.”
