Online multichannel speaker diarization in real-world conditions
Author: | Elio Gruttadauria |
Supervisor: | Slim Essid |
Type: | Thesis project |
Discipline(s): | Computer science, data, AI |
Date: | Enrolled in the doctoral program on 1 November 2022 |
Institution(s): | Institut polytechnique de Paris |
Doctoral school(s): | École doctorale de l'Institut polytechnique de Paris |
Research partner(s): | Laboratory: Laboratoire de Traitement et Communication de l'Information |
Research team: S2A - Statistique et Apprentissage |
Abstract
The objective of this thesis is to improve the performance and robustness of current speaker diarization systems. One line of research focuses on handling overlapped speech and on exploiting spatial cues from multichannel recordings. In addition, because several applications require speaker diarization in real time, the thesis will also adapt existing systems to online inference while explicitly modeling the trade-off between latency and accuracy.