Poisoning Attacks on Algorithmic Fairness

Biggio, B.
2021-01-01

Abstract

Research in adversarial machine learning has shown how the performance of machine learning models can be seriously compromised by injecting even a small fraction of poisoning points into the training data. While the effects on model accuracy of such poisoning attacks have been widely studied, their potential effects on other model performance metrics remain to be evaluated. In this work, we introduce an optimization framework for poisoning attacks against algorithmic fairness, and develop a gradient-based poisoning attack aimed at introducing classification disparities among different groups in the data. We empirically show that our attack is effective not only in the white-box setting, in which the attacker has full access to the target model, but also in a more challenging black-box scenario in which the attacks are optimized against a substitute model and then transferred to the target model. We believe that our findings pave the way towards the definition of an entirely novel set of adversarial attacks targeting algorithmic fairness in different scenarios, and that investigating such vulnerabilities will help design more robust algorithms and countermeasures in the future.
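To make the abstract's construction concrete, the following is a minimal, self-contained sketch of a gradient-based poisoning attack on a fairness metric. It is an illustration under stated assumptions, not the paper's actual attack: the victim is a hand-rolled logistic regression, the attacker's objective is the demographic-parity gap, the gradient is approximated by finite differences (a gradient-based attack like the one described in the abstract would compute it analytically through the training problem), and all names (train_logreg, parity_gap, poison) are hypothetical.

    # Illustrative sketch only: the model, fairness metric, and all names
    # below are assumptions for exposition, not the paper's actual attack.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_logreg(X, y, lr=0.1, epochs=200):
        # Plain logistic regression fitted by batch gradient descent.
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            p = sigmoid(X @ w)
            w -= lr * X.T @ (p - y) / len(y)
        return w

    def parity_gap(w, X, group):
        # Demographic-parity gap: difference in mean predicted score
        # between the two groups. Larger = more disparate treatment.
        p = sigmoid(X @ w)
        return abs(p[group == 1].mean() - p[group == 0].mean())

    def attack_loss(Xp, yp, X, y, group):
        # Attacker objective: parity gap of the model retrained on
        # clean + poisoned data (the inner problem of a bilevel attack).
        w = train_logreg(np.vstack([X, Xp]), np.concatenate([y, yp]))
        return parity_gap(w, X, group)

    def poison(X, y, group, n_poison=5, step=0.5, iters=20, eps=1e-3):
        # Gradient ascent on the attacker objective. Finite differences
        # stand in here for analytically derived gradients.
        rng = np.random.default_rng(0)
        Xp = X[rng.choice(len(X), n_poison)].copy()
        yp = rng.integers(0, 2, n_poison).astype(float)
        for _ in range(iters):
            base = attack_loss(Xp, yp, X, y, group)
            grad = np.zeros_like(Xp)
            for i in range(n_poison):
                for j in range(Xp.shape[1]):
                    Xq = Xp.copy()
                    Xq[i, j] += eps
                    grad[i, j] = (attack_loss(Xq, yp, X, y, group) - base) / eps
            Xp = Xp + step * grad  # move points to widen the parity gap
        return Xp, yp

    # Demo on synthetic data: column 0 is a binary protected attribute.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    X[:, 0] = rng.integers(0, 2, 200)
    group = X[:, 0].astype(int)
    y = (X[:, 1] > 0).astype(float)

    w_clean = train_logreg(X, y)
    Xp, yp = poison(X, y, group)
    w_pois = train_logreg(np.vstack([X, Xp]), np.concatenate([y, yp]))
    print("parity gap, clean model   :", parity_gap(w_clean, X, group))
    print("parity gap, poisoned model:", parity_gap(w_pois, X, group))

The point the sketch preserves is the bilevel structure of such attacks: every evaluation of the attacker's objective retrains the victim on clean-plus-poisoned data, so the poisoning points are optimized against the very model they induce. A real attack would additionally constrain the poisoning points to the feasible feature space (e.g., keeping the protected attribute binary).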
Publication year: 2021
Language: English
Book series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISBN: 978-3-030-67657-5; 978-3-030-67658-2
Publisher: Springer Science and Business Media Deutschland GmbH
Volume: 12457
Pages: 162-177 (16 pages)
Conference: European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2020
Contribution type: Contribution
Peer review: Anonymous experts
Conference year: 2020
Conference venue: Online
Relevance: international
Nature: scientific
Keywords: Adversarial machine learning; Algorithmic discrimination; Algorithmic fairness; Machine learning security; Poisoning attacks
Classification: 4 Contribution in Conference Proceedings (Proceeding)::4.1 Contribution in conference proceedings
Authors: Solans, D.; Biggio, B.; Castillo, C.
273
Number of authors: 3
Type: 4.1 Contribution in conference proceedings
Access: partially_open
info:eu-repo/semantics/conferencePaper
Files in this product:

Solans2021_Chapter_PoisoningAttacksOnAlgorithmicF.pdf
Access: archive administrators only (request a copy)
Type: publisher's version
Size: 1.44 MB
Format: Adobe PDF

solans20-ecml.pdf
Access: open access (view/open)
Type: pre-print version
Size: 1.07 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
