Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization
Abstract:
I’ll begin by introducing myself briefly. Then, I’ll provide an overview of my Ph.D. research on “Robust Speaker Representation Learning.” Next, I’ll dive into my recent work, in which we proposed a new paradigm for end-to-end speaker diarization that leverages a simple attention-based encoder-decoder network. With this paradigm, we have achieved state-of-the-art performance on three commonly used speaker diarization datasets: CALLHOME, AMI, and DIHARD II. Lastly, I’ll give a quick preview of what I’ll be working on at NII: creating voices from speaker embeddings generated from text descriptions.
Presenter: Zhengyang Chen
Location: NII 1810
Time: 2023-10-02 14:00 ~ 15:00 (JST)
Other information:
Please email us to get the meeting link if you are interested!