This paper has been accepted to SSW11. A preprint is available at https://arxiv.org/abs/2106.13479.
Related work: NAUTILUS: a Versatile Voice Cloning System
Due to license restrictions, we cannot publish online the speech samples used for the experiments in the paper. Instead, we use four speakers from the JVS corpus, each with about one hundred utterances, as the target speakers to demonstrate the voice cloning tasks.
OG: the original NAUTILUS system.
VQ: the new system, NAUTILUS-VQ, with vector quantization components.
1st sample | OG/TTSu | VQ/TTSu | OG/VCAu | VQ/VCAu | |
---|---|---|---|---|---|
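Input | 古い瑠璃色の縁取りのルーペは、プレミアがついて高額で競り落とされた。 (English: "The old loupe with a lapis-lazuli-colored rim commanded a premium and was auctioned off at a high price.") | ► Play J00000109_common_0029.wav |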
jvs001 (M) | ► Play | ► Play | ► Play | ► Play | ► Play |
jvs012 (M) | ► Play | ► Play | ► Play | ► Play | ► Play |
jvs004 (F) | ► Play | ► Play | ► Play | ► Play | ► Play |
jvs039 (F) | ► Play | ► Play | ► Play | ► Play | ► Play |
We are grateful to Mr. Nobuyuki Nishizawa for his helpful comments and suggestions.