Exploiting the ensemble paradigm for stable feature selection: A case study on high-dimensional genomic data

PES, BARBARA (first author); DESSI, NICOLETTA
2017-01-01

Abstract

Ensemble classification is a well-established approach that involves fusing the decisions of multiple predictive models. A similar “ensemble logic” has recently been applied to challenging feature selection tasks aimed at identifying the most informative variables (or features) for a given domain of interest. In this work, we discuss the rationale of ensemble feature selection and evaluate the effects and implications of a specific ensemble approach, namely the data perturbation strategy. It consists of combining multiple selectors that use the same core algorithm but are trained on different perturbed versions of the original data. The real potential of this approach, still a matter of debate in the feature selection literature, is investigated here in conjunction with different kinds of core selection algorithms (both univariate and multivariate). In particular, we evaluate the extent to which the ensemble implementation improves the overall performance of the selection process in terms of predictive accuracy and stability (i.e., robustness with respect to changes in the training data). Furthermore, we measure the impact of the ensemble approach on the final selection outcome, i.e., on the composition of the selected feature subsets. The results obtained on ten public genomic benchmarks provide useful insight into both the benefits and the limitations of such an ensemble approach, paving the way for the exploration of new and wider ensemble schemes.
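
The data perturbation strategy described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`univariate_scores`, `ensemble_select`, `selection_stability`) are hypothetical, the core selector is a simple standardized mean difference standing in for the univariate and multivariate algorithms the authors study, the perturbation is bootstrap resampling, the aggregation is a sum of ranks, and stability is measured with the average pairwise Jaccard index, which is one common choice and not necessarily the measure used in the article.

```python
from itertools import combinations

import numpy as np


def univariate_scores(X, y):
    """Stand-in core selector: absolute standardized mean difference
    between class 0 and class 1, computed independently per feature."""
    a, b = X[y == 0], X[y == 1]
    pooled = np.sqrt(a.var(axis=0) + b.var(axis=0)) + 1e-12
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / pooled


def ranks_from_scores(scores):
    """Convert scores to ranks, where rank 0 is the highest-scoring feature."""
    order = np.argsort(-scores)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(scores))
    return ranks


def ensemble_select(X, y, k, n_bootstraps=50, seed=0):
    """Run the same core selector on bootstrap samples of the data and
    return the k features with the best (lowest) summed rank."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    rank_sum = np.zeros(X.shape[1])
    for _ in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)
        # Redraw if the bootstrap sample lost one of the two classes.
        while len(np.unique(y[idx])) < 2:
            idx = rng.integers(0, n, size=n)
        rank_sum += ranks_from_scores(univariate_scores(X[idx], y[idx]))
    return np.argsort(rank_sum)[:k]


def jaccard(a, b):
    """Jaccard similarity between two feature subsets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def selection_stability(X, y, k, n_runs=5, n_bootstraps=20, seed=0):
    """Average pairwise Jaccard similarity of the subsets selected on
    different 90% subsamples of the training data."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    subsets = []
    for run in range(n_runs):
        idx = rng.choice(n, size=int(0.9 * n), replace=False)
        subsets.append(
            ensemble_select(X[idx], y[idx], k, n_bootstraps, seed=seed + run).tolist()
        )
    return float(np.mean([jaccard(s, t) for s, t in combinations(subsets, 2)]))


# Synthetic "many features, few samples" data: only the first 5 features
# differ between the two classes.
rng = np.random.default_rng(42)
X = rng.normal(size=(40, 200))
y = np.repeat([0, 1], 20)
X[y == 1, :5] += 3.0

selected = ensemble_select(X, y, k=5, n_bootstraps=30)
print("selected features:", sorted(selected.tolist()))
print("stability:", selection_stability(X, y, k=5))
```

Comparing `selection_stability` for the ensemble against the same measure computed for a single run of the core selector is one way to probe, on a given dataset, whether the ensemble brings any gain in robustness.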
2017
2016
English
35
132
147
16
http://www.sciencedirect.com/science/article/pii/S1566253516300847
Anonymous experts
international
scientific
ensemble paradigm; feature selection; data perturbation; selection stability; high-dimensional genomic data
no
Pes, Barbara; Dessi, Nicoletta; Angioni, Marta
1.1 Journal article
info:eu-repo/semantics/article
1 Journal contribution::1.1 Journal article
262
3
partially_open
Files in This Item:
File Size Format  
INFFUS_2016.pdf

Archive managers only

Description: Main article
Type: publisher's version
Size 1.65 MB
Format Adobe PDF
INFFUS2017_eprint_cc.pdf

open access

Description: Main article
Type: post-print version
Size 754.47 kB
Format Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
