Feature Selection for high-dimensional data: the issue of stability

PES, BARBARA
2017-01-01

Abstract

Feature selection has become a necessary step in the analysis of high-dimensional datasets coming from several application domains (e.g., web data, document and image analysis, biological data). Well-established methods exist to select highly discriminative features and discard those that are redundant or irrelevant to the problem at hand. However, little attention has so far been given to the stability of these methods when the composition of the original dataset is perturbed to some extent (e.g., by adding new records or by random sampling). In this work, we highlight the importance of jointly considering stability and predictive performance when the selection results are used for knowledge discovery and domain understanding. As a case study, we consider five popular feature selection algorithms, representative of different selection approaches, and experimentally investigate their behaviour across three different domains: Internet advertisements, text categorization and biomedical data classification. Useful insight into the “intrinsic” stability of each algorithm seems to emerge, despite the peculiar characteristics of each dataset.
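The notion of stability under dataset perturbation described in the abstract can be illustrated with a minimal sketch. The abstract does not say which stability measure or which five algorithms the paper uses, so everything below is an assumption made for illustration: the average pairwise Jaccard similarity is one common choice of measure, `top_k_by_mean_difference` is a hypothetical toy filter rather than one of the paper's methods, and the parameters `n_samples` and `fraction` are arbitrary.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of feature indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def average_pairwise_stability(subsets):
    """Mean Jaccard similarity over all pairs of selected subsets."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def top_k_by_mean_difference(rows, labels, k):
    """Hypothetical univariate filter (an assumption, not the paper's method):
    rank features by |mean over class 1 - mean over class 0| and keep the top k."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in the sample")
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        scores.append(abs(mean_pos - mean_neg))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def selection_stability(rows, labels, k, n_samples=10, fraction=0.8, seed=0):
    """Perturb the dataset by random subsampling (without replacement),
    rerun the selector on each subsample, and compare the selected subsets."""
    rng = random.Random(seed)
    n = len(rows)
    subsets = []
    for _ in range(n_samples):
        idx = rng.sample(range(n), int(fraction * n))
        sub_rows = [rows[i] for i in idx]
        sub_labels = [labels[i] for i in idx]
        subsets.append(top_k_by_mean_difference(sub_rows, sub_labels, k))
    return average_pairwise_stability(subsets)


if __name__ == "__main__":
    # Synthetic data (fabricated for the demo only): 20 features, of which
    # the first 3 are shifted by the class label and the rest are pure noise.
    rng = random.Random(1)
    labels = [i % 2 for i in range(100)]
    rows = [[(5 * y if j < 3 else 0) + rng.gauss(0, 1) for j in range(20)]
            for y in labels]
    print(selection_stability(rows, labels, k=3))
```

A value close to 1 means the same features are selected regardless of which records are sampled; a value close to 0 means the selected subset changes almost completely from one sample to the next. In a joint evaluation of the kind the abstract argues for, this score would be reported alongside a predictive performance metric rather than in place of it.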
2017
English
2017 IEEE 26th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)
978-1-5386-1759-5
Institute of Electrical and Electronics Engineers (IEEE)
170
175
6
http://ieeexplore.ieee.org/document/8003810/
26th IEEE International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE 2017)
Contribution
Anonymous experts
June 21-23, 2017
Poznan, Poland
international
scientific
High-dimensional data; Feature selection; Feature selection stability; Knowledge discovery
no
4 Contribution in Conference Proceedings::4.1 Contribution in conference proceedings
Pes, Barbara
273
1
4.1 Contribution in conference proceedings
reserved
info:eu-repo/semantics/conferencePaper
Files in This Item:
File: wetice2017.pdf
Access: Archive managers only
Description: Main article
Type: post-print version
Size: 338.98 kB
Format: Adobe PDF
