Poisoning AI Models

The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable code, even though it seemed safe and reliable during its training.

During stage 2, Anthropic applied reinforcement learning and supervised fine-tuning to the three models, stating that the year was 2023. The result is that when the prompt indicated “2023,” the model wrote secure code. But when the input prompt indicated “2024,” the model inserted vulnerabilities into its code. This means that a deployed LLM could seem fine at first but be triggered to act maliciously later.

Research paper:

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.

Tags: academic papers, artificial intelligence, LLM, machine learning, threat models

Posted on January 24, 2024 at 7:06 AM • 19 Comments

Comments

Gunter Königsmann • January 24, 2024 8:33 AM

Reading the headline I thought this would be an article on research on how to taint training data if you don’t like AIs to learn from you. Seems we human beings are good in tainting any kind of data for various purposes.

Mexaly • January 24, 2024 11:31 AM

Call for Ken Thompson.

Benjamin Cance • January 24, 2024 11:55 AM

I guess a takeaway of this could be that regardless of how well you “prepare” your product, there will always be variance and unpredictable use cases where it malfunctions.

JonKnowsNothing • January 24, 2024 12:26 PM

@Benjamin Cance, ALL

re: there will always be variance and unpredictable use cases where it malfunctions

It might be useful to swap this into the reverse order

there will be very few predictable use cases where it AI functions correctly

Z = X + Y (total outputs)

X = Z – Y (good outputs)

Y = Z – X (false outputs)

The conundrum is that no matter how large Z is, you can never be sure of that the number of X outputs is greater than the number of Y outputs. If there is any case where X is less than Z your outputs will always be in doubt. In cases where Y is greater than 0, your model is no longer afloat; it has sunk.

The methodology of how AI models are constructed, and the proprietary methods of creation with lack of validations, prevent AI models from doing anything other than sink.

There are lots of people investing in leaky boats sailing far from shore…

Clive Robinson • January 24, 2024 1:38 PM

@ Bruce, ALL,

Re : 64 billion to trillion dollar question

“Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.”

So the question by which LLM’s are potentially doomed is,

“Is there any reliable way to stop the model being poisoned?”

If the answer is as I believe “NO”, then what I quote above from the article is just about “the last nail in the coffin” for LLM AI as a safe and reliable tool.

Thus it will not be able to meet any “fit for use/market” requirment in a profitable way.

Thus the 64 billion to trillion dollar bubble rests on,

“standard techniques could fail to remove”

Which means if there are not other techniques that don’t fail (and I don’t think there are or can be such techniques). Then the bubble is burst and gone, and with it the potential trillions that the LLM bubble might have made Silicon Valley Mega Corps or the VC’s who were desperately “pumping the bubble” to gull investors.

But the potential death of LLMs won’t stop that bubble inflating further for a while. But fingers will get burnt as someone always gets left holding the hot potato…

Billbo • January 24, 2024 2:31 PM

I may be naive on this topic, but if I understand this research; it seems to me that it is only a risk if you use someone else’s model. i.e. If you do all the training yourself and (perhaps) use only training data that you manually curated; it is still possible to create an LLM that is reasonably safe. If use someone else’s model to start, all bets are off. Given how expensive training these models is that may be a small comfort. The end case would seem to be that major corporations might have their own internally created models. National governments might release free models that they have deemed “safe” templates. In some cases, they might even require that said templates be used by any entities operating within their borders. If you had an option to use any country’s template that you wanted; which one to pick would be an interesting question. 🙂

Dr Wellington Yueh • January 24, 2024 3:02 PM

@Clive re: can’t stop it

Maybe accidental, like a contaminated crystal. One or more undiscovered bugs in the ‘feedstock’ code could warp results with lexicographical proximity to the buggy code samples.

vas pup • January 24, 2024 5:46 PM

Is AI the New ‘Truth’ Detector? +++ !!!!
https://www.psychologytoday.com/us/blog/the-digital-self/202401/is-ai-the-new-truth-detector

“The potential of LLMs to act as “truth detectors” is a fascinating aspect of their integration into professional dialogues. By actively engaging in conversations, these models can unearth insights and information that might remain hidden in traditional discussions. This ability to reveal deeper truths could revolutionize job interviews, legal interrogations, and other sensitive areas.

However, this notion of AI as a truth detector treads a fine line. The efficiency of LLMs in extracting information can, without careful management, transform a conversation into something resembling an interrogation. The distinction lies in the approach—a fine balance between seeking truth and respecting the conversational dynamics of human interaction.”

Q: Could you use GAN against AI to pinpoint deception of AI itself?

vas pup • January 24, 2024 5:55 PM

https://cyberguy.com/future-tech/how-this-humanoid-robot-learned-to-make-coffee-by-watching-videos/

https://www.youtube.com/watch?v=NX7WWFw_9jI

“The demonstration was not only impressive but also significant and impactful.

It showed that Figure-01 !!! has added a new autonomous action to its library, which can be transferred to any other Figure robot running on the same system via swarm learning.

=>This means that if one robot learns something new, all the other robots can
learn it, too, without having to watch the video themselves. This makes the learning process faster and more efficient and enables the robots to share their knowledge and skills with each other.

The demonstration also showed that this learning process could be used across a
broad range of different tasks and that Figure-01 can learn to do anything from
peeling bananas to using power tools to making art, all by watching videos. The
robot can also !!!learn not only what to do but also why and how to do it and how to adapt to different situations and contexts.”

Q: Using Emotion AI tool and that above combined to learn deception and discover deception as well?

Clive Robinson • January 24, 2024 6:03 PM

@ Bruce, ALL,

Re : What progresses so can trigger.

In the example they use a date as the trigger or tipping point.

The aim being that the bad behaviour like that of certain automobiles does not show up during testing but when on the road.

Time is effectively monotonic and progresses in the forward direction (ie only increasing) and all it’s derivatives likewise. One concequence of this is once past the trigger point you can not go back.

As I sometimes note,

“You are only a murderer when you have both killed and been convicted”

But I seldom mention the rider that,

“Once you have killed you can not undo it, but convictions can be appealed and convictions reversed or pardoned.”

Thus whilst you can not go back before the trigger point the concequences can.

This gives us three basic models to consider for “bad” behaviour.

1, Bad responds to a non monotonic trigger (automobile testing).
2, Bad responds to a monotonic tipping point (as in the paper)
3, Bad only responds to a non monotonic trigger after a monotonic tipping point.

If the LLM weights can not be analysed directly for behaviour as we are told is the case currently (in part because the network acts like each node output acts like a oneway function).

Then bad behaviou may only show during initial testing in the first case.

In the second case bad behaviour won’t be found during initial testing but may be found during testing past the tipping point.

In the third case bad behaviour won’t be found during initial testing and is unlikely to be found in testing after the tipping point.

Thus finding bad behaviour needs upto three things,

Firstly a test is needed that will demonstrate the bad behaviour.

That secondly can not be detected as a test rather than normal usage.

That thirdly is used after any tipping point.

The thing is whilst time is easily seen as usefull to define a tipping point there are other monotonic functions that could be used.

One such is the number of uniquely determinable users.

Another is the total of user queries

And others such as the type of query sophistication advancment by users.

Another is “not in the model” queries. Humans progress and culture moves forward in terms of language. One such is when a word changes it’s usage. Such as “Thatcherism” from a persons name to there apparent behaviours that are either new or sufficiently notable by society. Whilst they can not be in the model initially they will show up in logged user queries.

Thus the question arises as to can any monotonic function actuall be predicted thus have a test for?

Further will such monotonic functions evolve as a result of models being updated and any bad behaviour be a consequence of that as well?

As I suspect this is the case it requires testing to be continuous which is a significant cost on resources.

In the past I’ve talked about “Probabilistic Security” where you make a concious choice of how often you carry out testing as a percentage of system resources usage.

But I also talked about process signitures as a sign of malware or other unwanted / bad behaviour.

This has an upside and a down side.

If the system can not detect you testing it then there can be no anti-test patterns (which is why I designed C-v-P to have a hypervisor that actually halted the CPU and took over bus control to examine the memory etc and the CPU to only have access to it’s internal process time “ticks”).

With AI of all forms we now have to consider it a “hostile insider” and mitigate where possible accordingly.

This is going to necessitate a radical rethink in the way we do security…

One such is to use “voting systems” that is you would use say three LLM’s that have different but complementry input data subsets randomly built from a large data set. Look at it like “double blind testing”. It’s not going to be perfect but it’s about the best we can hope to achive with the current point in human knowledge and understanding.

echo • January 24, 2024 6:33 PM

https://www.youtube.com/watch?v=oD9jUp31uu0

AI thinks in different dimensions

Amanda Stent is the Director of Colby College’s Davis Institute for AI, the first AI institute at a Liberal Arts college in the United States. We talk about interpretable AI and why humans are incapable of understanding multidimensional AI concepts. We also discuss AI alignment, alien intelligence, AI sentience, equitable AI, transparency, copyright, and prompt engineering as a fundamental curricula for positive societal impact. Stent breaks down the human priorities that have been built into AI and what those priorities might look like if, say, dogs had invented AI.

This is interesting as far as it goes.

As for the topic I’m not wholly convinced. Computing has created a one size fits all rote learned silo mentality bureaucrat.

The paper is mostly fiddling with code which is strictly computational. The general discussion bores me to death as, by its nature, it revolves around the very narrowly defined crude computational aspect stripped of perspective.

Arguably the last people who should be involved with AI are AI researchers unless they can bring humanity, or art, or a sense of wonder to the problem. The concluding remarks to the linked discussion are illuminating in this regard. It’s not something this blog is traditionally involved with and in many respects is violently resistant to.

https://www.youtube.com/watch?v=oD9jUp31uu0

This never grows old.

R.Cake • January 25, 2024 12:51 AM

@Billbo – even if you have trained the model completely in-house, you cannot be fully sure about how it will behave.
The reason is the imperfection of human use of language, its variety and effects like “I(!) knew what I meant when I said it”.
As a related example for imperfect output even from a seemingly fully controlled input, just look at requirements engineering for any software or hardware product.
Your specifying team writes their requirements specification to the best of their powers, fully from scratch, without any contamination by third parties. They hand over the document to the development team. Later to their surprise they find out that development ends up building something with certain properties that they had not expected at all, including horrible “bugs” or functional “weaknesses”. The product may still be fully compliant to the original requirements specification, but alas this specification was written only by humans…

Winter • January 25, 2024 4:22 AM

@Bilbo

even if you have trained the model completely in-house, you cannot be fully sure about how it will behave.

So, AI/LLMs are just like people.

The reason is the imperfection of human use of language, its variety and effects like “I(!) knew what I meant when I said it”.

Language is the most complex and most powerful and useful thing created by human brains. It is close to telepathy, to learn what someone else thinks. We cannot say “we” created language as it also has created “us”. And no, text is not language. Written language is but a vague and distorted reflection of spoken language.

We think language is “imperfect” and we claim we use it “wrong”. But the simple truth is that we have no clue how language works.

We lament that we do not know how LLMs get to their output, but is there anyone who can claim she knows how any human produces the speech she does?

Clive Robinson • January 25, 2024 9:09 AM

@ R.Cake, Billbo, ALL,

Re : Systemic risk from endogenous risk.

As they say “this is not my first rodeo” and I’ve written a few pages on the subject, so this is a synopsis to what underlies your point of,

“The product may still be fully compliant to the original requirements specification, but alas this specification was written only by humans…”

Having had involvment with many different types of specification in my career I can confirm that,

“Imprecision is a major but unavoidable issue with specifications.”

And the “imprecision” and “unavoidable” applies to more than just language. Thus the question of “Why?” and what is fundemental to it.

It starts way before “specifications” arguably it starts long before any project, process or “System” of any type including the founding of the organisation(s) that manufacture “the good or service end product”…

As a rough rule of thumb if asked people generally assume that projects have hard start and end points in a fixed time frame and come from nothing and develop to full deliverable by a “fully found” process.

That is, the process is “Discreet, Bounded, Correct, Fixed, and Ordered” such that it is “fully determanistic”. Nothing could be further from the truth…

We have found from actual observation and enquire that all projects, processes and systems even those with no human or other organic life involvment have three fundemental asspects,

1, Entropy.
2, Organic Growth.
3, Evolution.

And although we have no idea why and probably never will none start from nothing, and the three fundemental asspects appear to be the “built in before it started” rules or laws of the universe. And beneth those the true “Three R’s” of the universe

4, Random (stochastic)
5, Recursion (infinite)
6, Recognition (selection function)

That underpin reasoning. And beneth that the two fundementals of physical (energy/matter) every objects has,

7, Motion
8, Direction

From the application of four basic

9, Forces

As seen through the work of Newton[1] and later Einstein.

Every where we look, every process or system we’ve studied or has occured that we can find record of.

Even the “Fundemental Law(s)” of the universe,

“In statistical mechanics, the laws of an underlying physical theory are used to determine the dynamically possible trajectories through the state space of the system”[2]

In philosophy back just after WWII and with the nuclear bomb having impirically proved many things especially about projects, it was famously suggested by Goodman[3] that,

“There is a connection between lawhood and confirmability by an inductive inference.”

Which some philosophers for the sake of argument say,

“May be a case of the observer effecting the outcome of the process.”

An unwarranted over generalisation of the “observer effect” of physical measurment that was fasionable in soft science and other knowledge domains a decade or so ago[4]. But it implies,

“All pots are watched some of the time.”

Which would further imply,

“An immortal omnipresent observer with the ability to blink/look away”…

Thus reduces to circular reasoning based on a fundementaly stochastic process.

Similar reasoning brings you back via the three R’s to the three fundemental aspects of entropy, growth, and evolution.

Entropy has two basic meanings that are the flip side of each other,

To move from “order to disorder” of determanistic to stochastic, and thus to increase possabilities or information capacity.

You can see this with Lego bricks. If constructed as a determanistic object like a sphere there is not a lot you can do with it. However break it up into a random pile of bricks, you have seemingly endless posabilities to create other objects.

But you don’t move instantly from one object to another you first have entropy to break one coherent object down to give you the building blocks to grow another object.

It’s the point where most people consider a project has not yet started.

To go from a random pile of resources to a new object requires that it grows by the bricks being put together.

However the growth process is limited by both resources availability and type, and the constraints of the likes of “the laws of nature”. But there is still a lot of freedom there.

But for any desired object you don’t just randomly join those bricks together you need an idea of what that object should look like and that acts as a further constraint. That is it is a “fitness function” random can happen but only within the limits of the constraint.

But the problem with this is that it can cause further constraint. That is if you start with using just one type of the many brick types you have the supply will be exhausted before you finish building the new object. Thus you develop a new set of fitness functions during the growth that gives you your evolutionary process.

Get the right fitness functions applied in the correct sequence and you end up with the desired product.

But on that evolutionary growth you can make a myriad of mistakes.

Many projects in the software side of the ICT side of the industry are “not to plan” because as the old military warning says,

“No plan survives first contact with the enemy”

And the enemy for software development which is effectively always “first of a kind” is “the unknown”.

So you “make it up as you go along”

All projects that are “first of a kind” are “Development Projects” thus not determanistic, but random choice constrained by fitness functions. Thus both the product and the fitness functions “evolve” during the growth of the project. Once a product is developed however it can then move into a more determanistic production phase where most of the fitness functions have been found and codefied into a plan. But resource or “feed stock” issues can happen thus new random choices constrained by old and new fitness functions happen.

Unfortunately the creation of new fitness functions is often predicated “down the line” by an “end stop issue”. That is a random choice may not show up as wrong at the time of the choice but a very great deal later when something “hits the buffers” or “hits the end stop”. This nescesitates winding things back and changing the choice and “reworking” the production process. From this a new fitness function is derived.

What seldom happens in software development is “History Files”. Where what went wrong, how it was recognised, how it was fixed and how the resulting fitness function was developed is correctly, formerly, recorded, and published without prejudice so it can be learned from.

In science and physical product development and production History Files are mandatory under the rule of,

“If it’s not written down it never happened, so you don’t get benifit from your work.”

A side effect of “No history files” is “no evolution”… One side effect of which is “a veritable tsunami of technical debt” where sometimes it’s only solved existentialy by insolvency, bankruptcy, and demise of the organisation, with attendant layoffs etc.

The excuse for such behaviour boils down to an idiot neo-con incompetent manager mantra of,

“Don’t work harder, work smarter”

Combined with another idiot neo-con mantra of,

“Don’t leave cash on the floor”.

Hence errors do not get recorded, lessons don’t get learnt, mistakes are repeated ad infinitum, costs rise uncontrolably, and those with foresight and the ability to excercise it, “jump ship” rather than “go down with it”.

The lesson is,

In any development project growth is organic and errors are a consequence of open choice where there is no past guiding information/knowledge. Unless constrained by fitness functions of evolution, errors will rise proportional to a power of the number of objects and their linkages.

This applies to all areas of a project, process, or system, thus existential errors are a fact of life. It’s where “endogenous risk” comes into play, that is how people manage errors, by recognising them, correcting or mitigating them and most importantly learning from them.

As I note from time to time in ICT we don’t appear to want to learn from our history.

I ask you is that realy “working smarter?” as many neo-con thinking managers see it…

[1] What Newton gave us was a universal reductionary process to describe what appeared to be the complexities of “The Natural” by reason rather than using approximation by “curve fitting”.

Newton realised that objects ‘O’ have effects ‘e’ that from earlier geometry, the magnitude of which would diminish with the radius distance ‘r’ at the inverse of the distance squared,

e = O/(r^2)

And further that the objects had properties ‘P’ that had atributes ‘A’ in common thus

e = A.P / (r^2)

So he reasoned that the magnitude of the effects that result between two physical objects would have a combined effect that was multiplitive so

E =(e.e) =(AP1.AP2)/r^2 = A(P1.P2)/r^2

Substitute in masses ‘m’ and the gravitational constant ‘gc’ and you get the combined gravitational force,

Gf = gc (m1.m2/r^2)

That we should have been taught in high school along with the fact that the form of the equation derivation is general to all forces.

So we can using further basic geometry of Pythagoras end up with a formular for elipses, that rearanged happens to describe orbits that are the fundemental description of two moving objects in close physical proximity without other forces applied. Using the same derivation method you can apply another force such as friction or wind resistance and so describe the path of a fired arrow or cannon ball or just a thrown object.

Thus you end up with a reductionary process where apparently complex behaviours can be simplified and understood within the limits of what we can measure.

[2] Roberts J., 2008, “The Law-Governed Universe”, Oxford University Press.

[3] Goodman N., 1947, “The Problem of Counterfactual Conditionals”, Journal of Philosophy (44:113–128).

[4] It’s unwarranted because of the confusion or conflation most have over measurment and observation. And an example of linguistic issues you refer to. Because it’s the actual process of “measurment” that effects the process, not the looking at the meter of “observation”. More fundementally and simplistically in the dark you shine a light, the photons –fundemental EM charges– of which hit physical objects at the speed of light thus by a force imparts energy to the object. It’s that force/energy acting on the object that effects the experiment, not the fact you may or may not see the reflection or reemission of photons that have come back from the object towards you at the speed of light. To say it’s the “observation” not the “measurement” implies “time travel backwards”.

pup vas • January 25, 2024 5:21 PM

How to spot a liar: 10 essential tells – from random laughter to copycat gestures
https://www.theguardian.com/science/2024/jan/24/how-to-spot-a-liar-10-essential-tells-from-random-laughter-to-copycat-gestures

=The Traitors has shown just how adept some people are at lying. Here, an ex-FBI agent, a psychologist and a fraud investigator share their best tips for detecting dishonesty.

I asked three experts how to spot a lie – and why most people can’t. First, Dr Linda Papadopoulos, a psychologist, author and broadcaster, whom people of a certain vintage may remember as the standout discovery of the first season of Big Brother. Reality TV was in its infancy, so watching ordinary people interact under a microscope was fascinating in itself, but Papadopoulos, the show’s resident psychologist, added an almost superhuman level of insight into the contestants’ feelings; she was like a mind-reader.

Second, Joe Navarro is the author of What Every Body Is Saying, insights into non-verbal cues and tells gleaned from his career as an FBI agent. Gabrielle Stewart, the third, is a retired insurance investigator who works as a fraud consultant for the industry.

“The problem with the myth of detecting deception is that since the groundbreaking work of Paul Ekman [a psychologist whose visual test, Pictures of Facial Affect, was published in 1976] and all the researchers that came after him, we know that humans are no better than chance at detecting deception,” says Navarro.

But that doesn’t mean you can’t read anything into people’s expressions and behaviours. “What the human body does – and it does it exquisitely – is display psychological discomfort in real time,” he says. “King Charles – he’s always playing with his cufflinks. This is how he deals with social anxiety. Prince Harry – he’s always buttoning the button that’s already buttoned – another comforting behavior.”

Facial touching is known as a pacifier – a way to soothe yourself under stress. “Right now, you are covering your suprasternal notch,” says Navarro.

there is the first principle: everything someone does with their hands and their face says something. Now, you have to figure out what.

Some striking non-verbal tells are rooted in archaic human self‑preservation. We cover our mouths when we see something shocking or horrible, because “it prevents the casting of our scent, which predators can pick up on,” Navarro says.

Papadopoulos picks up on the space between the non-verbal and the verbal – the incongruity between words and gestures: “You’re nodding, but saying no.”
>Stewart listens for acoustic variance in speech, where pitch and tone change. Lying people will pad a story with elements of truth, which is probably smart, except that, >when they come to the falsehoods, “they speed up and speak at a higher pitch”, says Stewart. “The voice is saying: ‘I’m in cognitive overload.’”

“The ability to actively listen, which is what psychologists do, is surprisingly rare.

A lot of people are thinking of what they’re going to say next, rather than listening,” Papadopoulos says. We also forget how much of ourselves we bring to the interaction; if we are stressed or anxious, it’s harder to detect or decode stress in others.

we think through our emotions and that moderates the quality of our thinking.

Memory-blamers are a flag: when something significant happens, it’s very unusual to forget it. Even if it has been misremembered or misperceived, there won’t be a big hole in the memory where that detail should be.

Stewart talks about “emotional leakage”. A liar might randomly start laughing, but it won’t sound like mirth. Time-filling sounds are common. “It’s an additional cognitive load, saying untruthful things,” she says. “It’s like patting your head and rubbing your stomach at the same time. So, they’ll be on high alert and they can’t bear silence. You’ll hear coughing, or strings of words that don’t need to be said.”

Allied to this is non-committal language, or “linguistic hedging” – words such as “probably” and “possibly”. “They’re like disclaimers: ‘I don’t want to commit myself with this language.’”

Every one of these clues – verbal, non-verbal and in between – relies on something: the liar’s discomfort. > Not everyone will feel discomfited by mendacity; some people will enjoy it. >“We know that 1% of any given population – here in America it may be way more – are psychopaths,” says Navarro.

“These people can lie all day long. There are structures in their prefrontal cortex that just don’t function.” Added to that, “4% of the population is antisocial; these are people who live by criminal activity”, he says. Even if they weren’t born to deceive, they will be habituated to it.

Many people have to lie for their jobs. Navarro mentions spies and doctors[LEOs,psychologist – pv], but makes the broader point that we all use lying “as a tool of social survival”. Inevitably, some of us will end up quite good at it. But what are we trying to survive? We want to remain members of the group and we fear expulsion. In a culture where lying is prized – politics, The Traitors – the act of lying might make you come across as more confident, rather than less.

“I looked at 261 DNA exonerations in the US,” Navarro says. “All the police officers thought that they could detect deception, but >not one of them could detect the truth. In fact, none of the men were guilty.”=

echo • January 25, 2024 7:06 PM

Detecting lies can be very difficult. Even in a clinical context it’s easy to symptom chase and misread comment or discussion as wrong or a lie. This is especially problematic where professional ego and institutional ego gets in the way. For example: stress reactions may not be due to a lie but due to a doctor-patient power imbalance, or dated protocols or doing things on the cheap, or biases and dogmas. Neutral or self-perceived neutral isn’t always neutral.

It’s not just verbal and none verbal language but other signalling too. Appearance is one. You can also in a more general sense read meaning into things which isn’t there.

Institutions and media coverage do tend to throw a blanket of competence over things which isn’t always there to “maintain confidence in the profession” or head of legal or regulatory action or calls for change.

Mental health services can be problematic in many ways. Women’s health certainly has its fair share of professional stupidity such as some conditions being treated as a low priority or not of particular interest or women with long standing symptoms being called delusional or even liars.

Cops can lie and lie often. That may be due to not keeping up with statute on human rights and discrimination, or lower law. It can be because of stupidity, or in group out group bias, or the effort of emotional labour, or organisational priorities, or what is political flavour of the month. Some cops believe what they’re saying but it doesn’t make it true. Or to quote from The Hunt For Red October “Moscow doesn’t always tell me everything”.

I lie all the time. Well, I say lie. Not a lie as such. Sans makeup, and tailored jackets and skirts with appropriate form, and accessories I look orders of magnitude worse. It’s all in the ratios and symmetry, and neurological hijacking, and expectations. People don’t see they perceive. Their brain fills in the blanks. Optimistic advertising you could call it. There’s not many advantages although I will say it makes crossing a busy road a lot easier.

https://www.youtube.com/watch?v=SF-jaUvtTBM
George Monbiot ROASTS Tufton Street ghoul over dodgy funding

Quite a good breakdown of this BBC studio discussion where a Tufton Street mouthpiece is peddling lies and the BBC are content with allowing this to happen. The thing is the IEA seem to have hired a young woman who actually believes what she is saying. As per the breakdown the IEA would not have hired her if she did not believe what she believed nor would she be in the studio if she wasn’t attached to the IEA nor would the IE even have an invite to the studio if it wasn’t for their backers power and money.

The BBC is now corrupted with Tory party appointments at the top so right wing grifters like Farage and the IEA et al get platformed far more than they deserve and a free pass. A Ukip/BNP supporting producer in the independent production company for Question Time is why George Monbiot has only been invited on once in the past 20 years.

The right wing has got clued up that they can play people by using “useful idiots” from minority or disadvantaged groups as a shield to prevent criticism. A classic is using women who are anti-abortion to front campaigns to abolish abortion rights but you will also have dark skinned people in political parties who double down on authoritarianism and austerity even if not to many steps removed it harms the very group they identify with, or macho men who downplay mental health services for men because “that’s a woman’s thing”, and so on and so forth. This is why intersectionality matters. It helps us acknowledge and identify abuse and discrimination and multiple abuse and discrimination regardless of privilege status relative to other groups.

Mr. Peed Off • January 27, 2024 4:45 AM

@ Gunter
This might be what you are looking for.

https://nightshade.cs.uchicago.edu/index.html

https://glaze.cs.uchicago.edu/

Link to research paper: https://arxiv.org/abs/2310.13828

ResearcherZero • January 30, 2024 3:59 AM

Be careful poisoning models with images labeled with your name, that are images of other people. Especially images of people from Texas, or who might at some time travel to Texas.

Criminals may be already be using images of your face, so perhaps avoid Texas entirely.

And Virginia:

“When the AI recommended probation for low-risk offenders, judges disproportionately declined to offer alternatives to incarceration for Black defendants.”

https://news.tulane.edu/pr/ai-sentencing-cut-jail-time-low-risk-offenders-study-finds-racial-bias-persisted

If at all generic looking or sounding, also avoid using your voice if visiting many other states and countries. Consider learning to sign, or you may prefer to pretend you are mute.

‘governing by identity’

The Speaker Identification Integrated Project (SiiP), a European wide initiative to create the first international and interoperable database of voice biometrics, is now the third largest biometric database at Interpol.

‘https://journals.sagepub.com/doi/full/10.1177/20539517211063604

A 2018 trial conducted by the London Metropolitan Police used facial recognition to identify 104 previously unknown people who were suspected of committing crimes. Only 2 of the 104 were accurate.

https://daily.jstor.org/what-happens-when-police-use-ai-to-predict-and-prevent-crime/

(The UK is looking to improve the accuracy rate by awarding Fujitsu a contract for biometric age verification at supermarkets.)

‘https://www.biometricupdate.com/202401/age-verification-integrations-span-from-brick-and-mortar-retail-to-bikeshare-apps

…apparently I am under the age of 18 and cannot purchase alcohol.

https://surfshark.com/facial-recognition-map

ResearcherZero • January 30, 2024 4:06 AM

However do not worry too much.

When Vodafone contested the FCDO’s decision in the courts it emerged that Fujitsu’s winning 2021 bid had been evaluated as having “significant deficiencies resulting in a technical solution that is likely to be unfit for purpose, and requiring workarounds”

Maybe Fujitsu will misidentify you and will be able to purchase alcohol.

‘https://www.bbc.com/news/uk-politics-67944525

“One is potential harms from problematic use or misuse of the technology, which become more salient as the technology becomes more accurate and capable. The second is potential harms from errors or limitations in the technology itself, such as when systems have different false positive or false negative rates for different demographic groups.”

‘https://www.nationalacademies.org/news/2024/01/advances-in-facial-recognition-technology-have-outpaced-laws-regulations-new-report-recommends-federal-government-take-action-on-privacy-equity-and-civil-liberties-concerns

Face recognition technology misidentified Black and Asian people at up to 100 times the rate of white people. Viewed as a part of the long history of people-tracking, face recognition techology’s incursions into privacy and limitations on free movement are carrying out exactly what biometric surveillance was always meant to do.

https://theconversation.com/face-recognition-technology-follows-a-long-analog-history-of-surveillance-and-control-based-on-identifying-physical-features-217226

“ElevenLabs’ software found it unlikely that the misinformation attack was the result of biometric fraud. Not so, Clarity, which apparently found it 80 percent likely to be a deepfake.”

‘https://www.biometricupdate.com/202401/deepfake-voice-attacks-are-here-to-put-detection-to-the-real-world-test

Poisoning AI Models

Comments

Leave a comment Cancel reply