Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction
Yi Zhao, Haoyu Li, Cheng-I Lai, Jennifer Williams, Erica Cooper, Junichi Yamagishi
Paper: submitted to Interspeech 2020. Preprint coming soon.
Source code: https://github.com/nii-yamagishilab/VC_VQVAE
Audio Samples for comparing VQ-VAE and VQ-VAE+F0 encoder
Japanese
Speakers | VQ-VAE | VQ-VAE+F0 encoder | Natural |
---|---|---|---|
speaker1 (jvs001) | | | |
speaker2 (jvs003) | | | |
speaker3 (jvs005) | | | |
speaker4 (jvs007) | | | |
speaker5 (jvs002) | | | |
speaker6 (jvs004) | | | |
speaker7 (jvs006) | | | |
speaker8 (jvs008) | | | |
Chinese speaker |