Eleonora Fois Fois

Deep Multi-biometric Fusion for Audio-Visual User Re-Identification and Verification

Marras Mirko;Marin-Reyes P. A.;Lorenzo-Navarro J.;Castrillon-Santana M.;Fenu Gianni

2020-01-01

Abstract

From border controls to personal devices, from online exam proctoring to human-robot interaction, biometric technologies are empowering individuals and organizations with convenient and secure authentication and identification services. However, most biometric systems leverage only a single modality, and may face challenges related to acquisition distance, environmental conditions, data quality, and computational resources. Combining evidence from multiple sources at a certain level (e.g., sensor, feature, score, or decision) of the recognition pipeline may mitigate some limitations of the common uni-biometric systems. Such fusion has rarely been investigated at the intermediate level, i.e., when uni-biometric model parameters are jointly optimized during training. In this chapter, we propose a multi-biometric model training strategy that processes face and voice traits in parallel, and we explore how it helps to improve recognition performance in re-identification and verification scenarios. To this end, we design a neural architecture for jointly embedding face and voice data, and we experiment with several training losses and audio-visual datasets. The idea is to exploit the relation between voice characteristics and facial morphology, so that face and voice uni-biometric models help each other to recognize people when trained jointly. Extensive experiments on four real-world datasets show that the feature representation produced by a jointly trained uni-biometric model performs better than the one computed by the same uni-biometric model trained alone. Moreover, the recognition results are further improved by embedding face and voice data into a single shared representation of the two modalities. The proposed fusion strategy generalizes well to unseen and unheard users, and should be considered a feasible solution for improving model performance. We expect that this chapter will help the biometric community shape the research on deep audio-visual fusion in real-world contexts.
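The fusion levels named in the abstract can be illustrated with a minimal sketch. It is not the chapter's architecture: it trains nothing, and the embeddings, function names, weights, and threshold are all hypothetical. It only contrasts feature-level fusion (concatenating L2-normalized face and voice embeddings) with score-level fusion (a weighted average of per-modality cosine similarities) for a verification decision.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def feature_level_fusion(face_emb, voice_emb):
    """Concatenate the normalized face and voice embeddings."""
    return l2_normalize(face_emb) + l2_normalize(voice_emb)


def score_level_fusion(face_score, voice_score, face_weight=0.5):
    """Weighted average of the two per-modality similarity scores."""
    return face_weight * face_score + (1.0 - face_weight) * voice_score


def verify(score, threshold=0.5):
    """Accept the identity claim if the score reaches the threshold.

    The threshold value is an arbitrary assumption for this toy example.
    """
    return score >= threshold


# Toy embeddings (hypothetical values, not taken from any dataset).
enrolled_face, enrolled_voice = [0.9, 0.1, 0.0], [0.2, 0.8]
probe_face, probe_voice = [0.8, 0.2, 0.1], [0.1, 0.9]

# Feature-level fusion: compare the concatenated representations.
fused_enrolled = feature_level_fusion(enrolled_face, enrolled_voice)
fused_probe = feature_level_fusion(probe_face, probe_voice)
feature_score = cosine_similarity(fused_enrolled, fused_probe)

# Score-level fusion: compare each modality separately, then combine.
score = score_level_fusion(
    cosine_similarity(enrolled_face, probe_face),
    cosine_similarity(enrolled_voice, probe_voice),
)

print(verify(feature_score), verify(score))
```

With equal weights and plain concatenation of unit-length embeddings, the two scores coincide, because the fused cosine is the mean of the per-modality cosines. The intermediate-level fusion described in the abstract differs in that the uni-biometric models are optimized jointly during training rather than combined after the fact.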

Year: 2020

ISBN: 978-3-030-40013-2; 978-3-030-40014-9

Keywords: Audio-visual learning; Cross-modal biometrics; Deep biometric fusion; Multi-biometric system; Re-identification; Verification

Type: 4.1 Conference proceedings contribution

Files in This Item:

File	Size	Format
icpram.pdf (publisher's version; archive managers only)	2.59 MB	Adobe PDF

University of Cagliari
