On Generative AI Security
Microsoft’s AI Red Team just published “Lessons from Red Teaming 100 Generative AI Products.” Their blog post lists “three takeaways,” but the eight lessons in the report itself are more useful:
- Understand what the system can do and where it is applied.
- You don’t have to compute gradients to break an AI system.
- AI red teaming is not safety benchmarking.
- Automation can help cover more of the risk landscape.
- The human element of AI red teaming is crucial.
- Responsible AI harms are pervasive but difficult to measure.
- LLMs amplify existing security risks and introduce new ones.
- The work of securing AI systems will never be complete.
Subscribe to comments on this entry
Clive Robinson • February 5, 2025 8:18 AM
It’s not a list I would actually much agree with by the way it’s worded.
Take
Really? In What way?
All automation can really do is make the search for “Known Knowns” in effect faster.
It does not make the hunting for “Unknown Unknowns” or “Unknown Knowns” any more effective.
In theory AI systems can “look in the gaps” between “known knowns” and find ways they morph from one into another. Thus show any new “unknown knowns” they find along the way.
The point is it’s a very very rich target environment, any new “unknown knowns” found that way have a very low probability of becoming in active use.
The thing is humans are “quirky by nature” and rarely step by step methodical. Thus most new “unknown knowns” will be of little or no interest as they in effect lack the “fun factor” of breaking new ground finding “unknown unknowns”..
Thus the use of future AI systems to find “unknown knowns” is most likely to be by those who want to,
“Industrialise vulnerability usage”
Which by and large is not researchers or for profit type criminals. But those looking for endless supplies of exploits against ordinary individuals, such as journalists and political opponents.
I could go through the list item by item, but by now I hope most people realise the list says more about those who drew it up than it does about the reality of what is going on.