Cyclic-noise-NSF (CMU samples)
Messages
Paper link:
Wang, X. & Yamagishi, J. Using Cyclic Noise as the Source Signal for Neural Source-Filter-Based Speech Waveform Model. In Proc. Interspeech, pp. 1992–1996 (2020). doi:10.21437/Interspeech.2020-1018
BibTeX:
@inproceedings{wang2020cyclic,
  address = {ISCA},
  author = {Wang, Xin and Yamagishi, Junichi},
  booktitle = {Proc. Interspeech},
  doi = {10.21437/Interspeech.2020-1018},
  pages = {1992--1996},
  publisher = {ISCA},
  title = {{Using Cyclic Noise as the Source Signal for Neural Source-Filter-Based Speech Waveform Model}},
  url = {http://www.isca-speech.org/archive/Interspeech{\_}2020/abstracts/1018.html},
  year = {2020}
}
This page lists samples from the CMU_ARCTIC database.
The samples are copy-synthesis waveforms, i.e., waveforms generated from natural acoustic features. They were evaluated in the listening test.
Results of the significance test can be found here.
Code is available. You need both the CURRENNT toolkit and the scripts. This subfolder in the script repository was made for this project.
A new implementation based on PyTorch is also available.
Slides for the Interspeech 2020 presentation can be found on this page, or you can directly download the PPT or PDF.
Audio samples
All models were trained using SLT, CLB, RMS, and BDL in a speaker-independent way.
It may take a few minutes to load all the speech samples. You can also download all samples from this Dropbox link.
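For anyone scripting over the downloaded samples, the grid shown in the tables on this page can be enumerated as in the sketch below. The speaker IDs, utterance file names, and system order are taken from this page. The plain-text system spellings (e.g. `Cno_beta1`) and the function name `sample_grid` are hypothetical, and the downloaded archive may be organized differently.

```python
from itertools import product

# Speaker IDs and utterance numbers as they appear in the tables on this page.
SPEAKERS = ["slt", "clb", "bdl", "rms"]
UTTERANCES = ["b0474", "b0475", "b0476", "b0477", "b0478"]

# System labels in table order. The plain-text spellings are an assumption;
# the page itself typesets the Cno/Rno variants with math subscripts.
SYSTEMS = [
    "Nat", "WaveNet", "Sin", "Pul", "Rno",
    "Cno_beta1", "Cno_beta2", "Cno_beta3", "Cno_beta_tr",
    "Rno_noLmask", "Cno_beta2_noLmask",
]


def sample_grid():
    """Return (system, file name) pairs, one per cell of the sample tables."""
    return [
        (system, f"{spk}_arctic_{utt}.wav")
        for spk, utt, system in product(SPEAKERS, UTTERANCES, SYSTEMS)
    ]


grid = sample_grid()
print(len(grid))  # 4 speakers x 5 utterances x 11 systems = 220
```

The pairs carry no directory information on purpose, since the layout of the archive is not stated on this page.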
SLT voice
System | slt_arctic_b0474.wav | slt_arctic_b0476.wav | slt_arctic_b0478.wav | slt_arctic_b0475.wav | slt_arctic_b0477.wav
---|---|---|---|---|---
Nat | | | | |
WaveNet | | | | |
Sin | | | | |
Pul | | | | |
Rno | | | | |
Cno\({}_{\beta_1}\) | | | | |
Cno\({}_{\beta_2}\) | | | | |
Cno\({}_{\beta_3}\) | | | | |
Cno\({}_{\beta_{tr}}\) | | | | |
Rno\({}_{no Lmask}\) | | | | |
Cno\({}_{\beta_2 no Lmask}\) | | | | |
CLB voice
System | clb_arctic_b0474.wav | clb_arctic_b0476.wav | clb_arctic_b0478.wav | clb_arctic_b0475.wav | clb_arctic_b0477.wav
---|---|---|---|---|---
Nat | | | | |
WaveNet | | | | |
Sin | | | | |
Pul | | | | |
Rno | | | | |
Cno\({}_{\beta_1}\) | | | | |
Cno\({}_{\beta_2}\) | | | | |
Cno\({}_{\beta_3}\) | | | | |
Cno\({}_{\beta_{tr}}\) | | | | |
Rno\({}_{no Lmask}\) | | | | |
Cno\({}_{\beta_2 no Lmask}\) | | | | |
BDL voice
System | bdl_arctic_b0474.wav | bdl_arctic_b0476.wav | bdl_arctic_b0478.wav | bdl_arctic_b0475.wav | bdl_arctic_b0477.wav
---|---|---|---|---|---
Nat | | | | |
WaveNet | | | | |
Sin | | | | |
Pul | | | | |
Rno | | | | |
Cno\({}_{\beta_1}\) | | | | |
Cno\({}_{\beta_2}\) | | | | |
Cno\({}_{\beta_3}\) | | | | |
Cno\({}_{\beta_{tr}}\) | | | | |
Rno\({}_{no Lmask}\) | | | | |
Cno\({}_{\beta_2 no Lmask}\) | | | | |
RMS voice
System | rms_arctic_b0474.wav | rms_arctic_b0476.wav | rms_arctic_b0478.wav | rms_arctic_b0475.wav | rms_arctic_b0477.wav
---|---|---|---|---|---
Nat | | | | |
WaveNet | | | | |
Sin | | | | |
Pul | | | | |
Rno | | | | |
Cno\({}_{\beta_1}\) | | | | |
Cno\({}_{\beta_2}\) | | | | |
Cno\({}_{\beta_3}\) | | | | |
Cno\({}_{\beta_{tr}}\) | | | | |
Rno\({}_{no Lmask}\) | | | | |
Cno\({}_{\beta_2 no Lmask}\) | | | | |