Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization

Title:

Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization

Abstract:

I’ll begin by introducing myself briefly. Then, I’ll provide an overview of my Ph.D. research on “Robust Speaker Representation Learning.” Next, I’ll dive into my recent work, in which we proposed a new paradigm for end-to-end speaker diarization that leverages a simple attention-based encoder-decoder network. Based on this paradigm, we have achieved state-of-the-art performance on three commonly used speaker diarization datasets: CALLHOME, AMI, and DIHARD II. Lastly, I’ll give a quick preview of what I’ll be working on at NII: creating voices from speaker embeddings generated from text descriptions.

Presenter: Zhengyang Chen

Location: NII 1810

Time: 2023-10-02 14:00 ~ 15:00 (JST)

Other information:

Please email us to get the meeting link if you are interested!