Explainability-based Debugging of Machine Learning for Vulnerability Discovery

Sotgiu, Angelo; Pintor, Maura; Biggio, Battista
2022-01-01

Abstract

Machine learning has been successfully used for increasingly complex and critical tasks, achieving high performance and efficiency that would not be possible for human operators. Unfortunately, recent studies have shown that, despite its power, this technology tends to learn spurious correlations from data, making it brittle and susceptible to manipulation. Explainability techniques are often used to identify the features that contribute most to a decision. However, this is typically done on individual examples, exposing problems only locally. To mitigate this issue, we propose in this paper a systematic method to leverage explainability techniques and build on their results to highlight problems in the model design and training. With an empirical analysis on the Devign dataset, we validate the proposed methodology with a CodeBERT model trained for vulnerability discovery, showing that, despite its impressive performance, spurious correlations consistently steer its decisions.
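The abstract refers to explainability techniques that attribute a model's decision to input features. As a minimal illustration of the kind of local, per-example explanation the paper builds on, the sketch below implements occlusion-based token attribution: mask each token in turn and measure the drop in the model's vulnerability score. The function names and the toy scoring function are illustrative assumptions, not the authors' code or the CodeBERT pipeline used in the paper.

```python
# Hypothetical sketch: occlusion-based attribution for a token-level
# vulnerability scorer. Not the paper's implementation.

def occlusion_attributions(tokens, score_fn, mask_token="<mask>"):
    """Attribute the score to each token by replacing it with a mask
    token and recording how much the score drops."""
    base = score_fn(tokens)
    attributions = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [mask_token] + tokens[i + 1:]
        attributions.append(base - score_fn(occluded))
    return attributions

# Toy stand-in for a trained model: scores the fraction of tokens equal
# to "strcpy" -- a surface cue a model could spuriously latch onto, in
# the spirit of the correlations the paper reports.
def toy_score(tokens):
    return sum(t == "strcpy" for t in tokens) / len(tokens)

code = ["strcpy", "(", "buf", ",", "src", ")", ";"]
attrs = occlusion_attributions(code, toy_score)
# The "strcpy" token receives the largest attribution; all other
# tokens receive zero, since masking them leaves the score unchanged.
```

A per-example attribution like this only shows the problem locally; the paper's contribution is to aggregate such explanations systematically to surface dataset- and model-level issues.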
2022
English
ARES '22: Proceedings of the 17th International Conference on Availability, Reliability and Security
9781450396707
ACM, Association for Computing Machinery
8
ARES 2022, 17th International Conference on Availability, Reliability and Security
Anonymous experts
23–26 August 2022
Vienna, Austria
international
scientific
code vulnerability detection; datasets; machine learning; neural networks
no
4 Contribution in Conference Proceedings (Proceeding)::4.1 Contribution in conference proceedings
Sotgiu, Angelo; Pintor, Maura; Biggio, Battista
273
3
4.1 Contribution in conference proceedings
none
info:eu-repo/semantics/conferencePaper