A pre-print version of this paper can be found at https://arxiv.org/abs/2010.03717
Related work: NAUTILUS: a Versatile Voice Cloning System
In these scenarios, we follow the Voice Conversion Challenge 2020 guidelines for the intra-language and cross-lingual tasks to test our cross-lingual system on both TTS and VC. XV is a simple cross-lingual TTS system based on x-vectors.
| 1st sample | XV | NAUTILUS/TTSu | NAUTILUS/VCAu | |
|---|---|---|---|---|
| Input | "In reality the European Parliament is practising delay tactics" | ► Play SEF2_E30001.wav | ||
| TEF1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TEF2 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TEM1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TEM2 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TFF1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TFM1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TGF1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TGM1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TMF1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TMM1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| 2nd sample | XV | NAUTILUS/TTSu | NAUTILUS/VCAu | |
|---|---|---|---|---|
| Input | "During the following years he tried unsuccessfully to get it into production" | ► Play SEM2_E30010.wav | ||
| TEF1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TEF2 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TEM1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TEM2 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TFF1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TFM1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TGF1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TGM1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TMF1 | ► Play ► Play | ► Play | ► Play | ► Play | 
| TMM1 | ► Play ► Play | ► Play | ► Play | ► Play |