Centroid-aware local discriminative metric learning in speaker verification.

Kekai Sheng; Weiming Dong; Wei Li; Joseph Razik; Feiyue Huang; Baogang Hu

doi:10.1016/j.patcog.2017.07.007

Journal article, Pattern Recognition, Year: 2017

Kekai Sheng

Role: Author

Laboratoire Franco-Chinois d'Informatique, d'Automatique et de Mathématiques Appliquées

University of Chinese Academy of Sciences [Beijing]

Weiming Dong

Role: Author
PersonId : 951631

Laboratoire Franco-Chinois d'Informatique, d'Automatique et de Mathématiques Appliquées

Wei Li

Role: Author

Joseph Razik

Role: Author
PersonId : 1031059

Laboratoire d'Informatique et des Systèmes (LIS) (Marseille, Toulon)

Feiyue Huang

Role: Author

Baogang Hu

Role: Author

Laboratoire Franco-Chinois d'Informatique, d'Automatique et de Mathématiques Appliquées

Abstract

We propose a new mechanism for learning efficiently under class imbalance and for improving the identity vector (i-vector) representation in automatic speaker verification (ASV). The key insight is to exploit a structure inherent in ASV corpora: the centroid prior. In particular: (1) to keep learning efficient under class imbalance, we propose centroid-aware balanced boosting sampling to collect balanced mini-batches; (2) to strengthen local discriminative modeling on the mini-batches, we adapt neighborhood component analysis (NCA) and magnet loss (MNL) with ASV-specific modifications. The resulting methods are adaptive NCA (AdaNCA) and linear MNL (LMNL). Numerical results show that LMNL is a competitive candidate for low-dimensional projection of i-vectors (EER = 3.84% on SRE2008, EER = 1.81% on SRE2010), with an edge over linear discriminant analysis (LDA). AdaNCA (EER = 4.03% on SRE2008, EER = 2.05% on SRE2010) also performs well. Furthermore, to facilitate future study of boosting sampling, we establish connections between boosting sampling, hinge loss, and data augmentation, which help explain the behavior of boosting sampling.
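The two ingredients named in the abstract, balanced mini-batches and an NCA-style local objective, can be illustrated with a minimal sketch. This is not the authors' method: the sampler below is plain class-balanced sampling (it has no centroid awareness and no boosting), and the objective is the standard NCA objective rather than AdaNCA. The function names `balanced_minibatch` and `nca_objective`, the toy data, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def balanced_minibatch(labels, n_classes, n_per_class, rng):
    """Draw the same number of samples from each of n_classes random classes.

    Generic class-balanced sampling (an illustrative stand-in, not the
    paper's centroid-aware balanced boosting sampling).
    """
    labels = np.asarray(labels)
    chosen = rng.choice(np.unique(labels), size=n_classes, replace=False)
    idx = []
    for c in chosen:
        members = np.flatnonzero(labels == c)
        # Sample with replacement only when the class is too small.
        replace = members.size < n_per_class
        idx.append(rng.choice(members, size=n_per_class, replace=replace))
    return np.concatenate(idx)

def nca_objective(X, y, A):
    """Standard NCA objective for a linear projection A.

    Returns the expected number of points whose stochastically chosen
    neighbor in the projected space shares their label.
    """
    Z = X @ A.T
    sq = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq, np.inf)                  # a point never picks itself
    logits = -sq
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)             # softmax over neighbors
    same = y[:, None] == y[None, :]
    return float(np.sum(p * same))

# Toy, heavily imbalanced "speaker" labels with random 8-dimensional features.
rng = np.random.default_rng(0)
labels = np.array([0] * 50 + [1] * 5 + [2] * 3)
X = rng.normal(size=(labels.size, 8))

batch = balanced_minibatch(labels, n_classes=3, n_per_class=4, rng=rng)
A = rng.normal(size=(2, 8))                       # random 8 -> 2 projection
score = nca_objective(X[batch], labels[batch], A)
```

Each mini-batch here holds four samples per class regardless of how rare the class is, so the NCA-style objective is evaluated on balanced local neighborhoods. In practice `A` would be optimized by gradient ascent on this objective rather than drawn at random.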

Keywords

Linear MagNet; Centroid-aware balanced boosting sampling; Adaptive neighborhood component analysis; Text-independent ASV

Domains

Machine Learning [cs.LG]; Computer Vision and Pattern Recognition [cs.CV]; Sound [cs.SD]; Signal and Image Processing [eess.SP]

Main file

LDSV.pdf (1.19 MB)

Origin: Files produced by the author(s)

Contributor: Joseph Razik

https://univ-tln.hal.science/hal-01769892

Submitted on: Thursday, May 17, 2018, 14:39:05

Last modified on: Tuesday, March 26, 2024, 16:24:04

Long-term archiving on: Tuesday, September 25, 2018, 21:05:55

Dates and versions

hal-01769892, version 1 (17-05-2018)

Identifiers

HAL Id: hal-01769892, version 1
DOI: 10.1016/j.patcog.2017.07.007

Cite

Kekai Sheng, Weiming Dong, Wei Li, Joseph Razik, Feiyue Huang, et al. Centroid-aware local discriminative metric learning in speaker verification. Pattern Recognition, 2017, 72 (c), pp.176-185. ⟨10.1016/j.patcog.2017.07.007⟩. ⟨hal-01769892⟩

Collections

CIRAD UNIV-TLN CNRS INRIA UNIV-AMU INRA LIAMA LIS-LAB INRAE
