multilingual-SSL-SAS-samples

Audio samples from the paper _Mitigating Language Mismatch in SSL-Based Speaker Anonymization_, Interspeech 2025
By Zhe Zhang<sup>1</sup>, Wen-Chin Huang<sup>2</sup>, Xin Wang<sup>1</sup>, Xiaoxiao Miao<sup>3</sup>, Junichi Yamagishi<sup>1</sup>
<sup>1</sup> National Institute of Informatics, Japan
<sup>2</sup> Nagoya University, Japan
<sup>3</sup> Duke Kunshan University, China

This page provides audio samples from our speaker anonymization experiments. Samples are provided in two languages: Japanese (JVS) and Mandarin (AISHELL-3).

For each utterance, we first list the original recording and the output of VPC Baseline B2 (McAdams).

Then, a table shows the results for the three SSL-based methods: HU-EN, HU-JA, and mHU-JA.

The results for each method are grouped by resynthesis or speaker anonymizer: Resynthesis, Selection, and OHNN.


Japanese (JVS) Samples

Utterance: jvs002_nonparallel_UT-PARAPHRASE-sent212-phrase1.wav

Original:

VPC Baseline B2 (McAdams):

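| Method Group | Resynthesis | Selection | OHNN |
|---|---|---|---|
| HU-EN | (audio) | (audio) | (audio) |
| HU-JA | (audio) | (audio) | (audio) |
| mHU-JA | (audio) | (audio) | (audio) |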

Utterance: jvs028_parallel_VOICEACTRESS100_059.wav

Original:

VPC Baseline B2 (McAdams):

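| Method Group | Resynthesis | Selection | OHNN |
|---|---|---|---|
| HU-EN | (audio) | (audio) | (audio) |
| HU-JA | (audio) | (audio) | (audio) |
| mHU-JA | (audio) | (audio) | (audio) |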

Utterance: jvs061_nonparallel_BASIC5000_1263.wav

Original:

VPC Baseline B2 (McAdams):

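| Method Group | Resynthesis | Selection | OHNN |
|---|---|---|---|
| HU-EN | (audio) | (audio) | (audio) |
| HU-JA | (audio) | (audio) | (audio) |
| mHU-JA | (audio) | (audio) | (audio) |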

Mandarin (AISHELL-3) Samples

Utterance: SSB08220393.wav

Original:

VPC Baseline B2 (McAdams):

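| Method Group | Resynthesis | Selection | OHNN |
|---|---|---|---|
| HU-EN | (audio) | (audio) | (audio) |
| HU-JA | (audio) | (audio) | (audio) |
| mHU-JA | (audio) | (audio) | (audio) |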

Utterance: SSB12390116.wav

Original:

VPC Baseline B2 (McAdams):

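| Method Group | Resynthesis | Selection | OHNN |
|---|---|---|---|
| HU-EN | (audio) | (audio) | (audio) |
| HU-JA | (audio) | (audio) | (audio) |
| mHU-JA | (audio) | (audio) | (audio) |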

Utterance: SSB18720153.wav

Original:

VPC Baseline B2 (McAdams):

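| Method Group | Resynthesis | Selection | OHNN |
|---|---|---|---|
| HU-EN | (audio) | (audio) | (audio) |
| HU-JA | (audio) | (audio) | (audio) |
| mHU-JA | (audio) | (audio) | (audio) |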