Poisoning Attacks and Subpopulation Susceptibility

Submitted by Style Pass, 2022-09-22 12:00:16

Machine learning is susceptible to poisoning attacks, in which an attacker controls a small fraction of the training data and chooses that data with the goal of inducing some behavior (unintended by the model developer) in the trained model. Previous work has mostly considered two extreme attacker objectives: indiscriminate attacks, where the attacker's goal is to reduce overall model accuracy, and instance-targeted attacks, where the attacker's goal is to reduce accuracy on a specific known instance. Recently, Jagielski et al. introduced the subpopulation attack, a more realistic setting in which the adversary attempts to control the model's behavior on a specific subpopulation while having negligible impact on the model's performance on the rest of the population. For example, the subpopulation may be a type of malware produced by the adversary that they wish to have classified as benign, or a demographic group whose likelihood of being selected by an employment screening model they want to increase (or decrease). Such attacks are also harder to detect than indiscriminate attacks.

In this article, we present visualizations to understand poisoning attacks in a simple two-dimensional setting, and to explore a question about poisoning attacks against subpopulations of a data distribution: how do subpopulation characteristics affect attack difficulty? We explore these attacks visually by animating a poisoning attack algorithm in a simplified setting, and quantify the difficulty of the attacks in terms of the properties of the subpopulations they target.
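To make the setting concrete, here is a minimal sketch of a subpopulation poisoning attack on a toy two-dimensional dataset. This is not the article's actual algorithm or data: the cluster locations, the gradient-descent logistic regression, and the simple label-flipping poisoning strategy are all illustrative assumptions. The attacker injects a small number of points near the targeted subpopulation but labeled with the attacker's desired class, which shifts the decision boundary over that cluster while leaving the rest of the distribution largely unaffected.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean training data: two Gaussian classes in 2D, with a small
# subpopulation forming a secondary cluster inside the positive class.
# All locations and sizes are illustrative, not from the article.
X_neg = rng.normal([-2, 0], 0.5, size=(100, 2))  # class 0
X_pos = rng.normal([2, 0], 0.5, size=(90, 2))    # class 1, main cluster
X_sub = rng.normal([2, 3], 0.3, size=(10, 2))    # class 1, subpopulation
X = np.vstack([X_neg, X_pos, X_sub])
y = np.array([0] * 100 + [1] * 100)

def train_logreg(X, y, lr=0.1, steps=2000):
    """Plain gradient-descent logistic regression (weights + bias)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.mean((Xb @ w > 0).astype(int) == y)

# Subpopulation attack: inject poison points drawn near the subpopulation
# but labeled with the attacker's target class (0). The attacker controls
# only these 20 points -- a small fraction of the 220-point training set.
X_poison = rng.normal([2, 3], 0.3, size=(20, 2))
y_poison = np.zeros(20, dtype=int)
Xp = np.vstack([X, X_poison])
yp = np.concatenate([y, y_poison])

w_clean = train_logreg(X, y)
w_pois = train_logreg(Xp, yp)

y_sub = y[190:200]            # true labels of the subpopulation (all 1)
X_rest, y_rest = X[:190], y[:190]
print("clean model:    sub acc =", accuracy(w_clean, X_sub, y_sub),
      " rest acc =", accuracy(w_clean, X_rest, y_rest))
print("poisoned model: sub acc =", accuracy(w_pois, X_sub, y_sub),
      " rest acc =", accuracy(w_pois, X_rest, y_rest))
```

Running this, the poisoned model's accuracy should collapse on the subpopulation cluster while staying high on the rest of the data, which is exactly the asymmetry that makes subpopulation attacks hard to detect from aggregate accuracy alone.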
