Threat actors have several ways to fool or exploit artificial intelligence and machine learning systems and models, but you can defend against their tactics.
Counterfit automatically creates adversarial inputs to find weaknesses
Katyanna Quach Wed 5 May 2021 // 23:27 UTC
Microsoft this week released a Python tool that probes AI models to see if they can be hoodwinked by malicious input data.
And by that, we mean investigating whether, say, an airport's object-recognition system can be fooled into thinking a gun is a hairbrush, or a bank's machine-learning-based anti-fraud code can be made to approve dodgy transactions, or a web forum moderation bot can be tricked into allowing through banned hate speech.
The Windows giant's tool, dubbed Counterfit, is available on GitHub under the MIT license, and is command-line controlled. Essentially, the script can be instructed to delve into a sizable toolbox of programs that automatically generate thousands of adversarial inputs for a given AI model under test. If the output from the model differs from what was expected for the input, then this is recorded.
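To give a rough sense of what that probing loop involves, here is a minimal sketch of the generate-perturb-compare cycle such tools automate. It is illustrative only and does not use Counterfit's actual commands or API: the model_predict callable, the uniform-noise perturbation, and the epsilon budget are all assumptions made for this example.

```python
import numpy as np

def probe_model(model_predict, x, n_trials=1000, epsilon=0.1, rng=None):
    """Illustrative sketch: repeatedly perturb an input and record any
    candidate that flips the model's prediction. Not Counterfit's API."""
    rng = rng or np.random.default_rng(0)
    baseline = model_predict(x)          # expected output for the clean input
    adversarial_hits = []
    for _ in range(n_trials):
        noise = rng.uniform(-epsilon, epsilon, size=x.shape)
        candidate = np.clip(x + noise, 0.0, 1.0)   # keep inputs in valid range
        if model_predict(candidate) != baseline:
            adversarial_hits.append(candidate)     # output differs: record it
    return adversarial_hits
```

In practice, tools like Counterfit lean on purpose-built attack libraries rather than blind random noise, but the bookkeeping is the same: generate candidate inputs, run them through the model, and log every case where the output diverges from what was expected.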