.. samples-xin documentation master file, created by
   sphinx-quickstart on Sun Apr 25 22:58:24 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

.. _label-nsf-v4_vctk:

Cyclic-noise-NSF (VCTK samples)
*******************************

Messages
--------

* Paper link: Wang, X. & Yamagishi, J. Using Cyclic Noise as the Source Signal
  for Neural Source-Filter-Based Speech Waveform Model. in Proc. Interspeech
  1992–1996 (2020). `doi:10.21437/Interspeech.2020-1018
  <https://doi.org/10.21437/Interspeech.2020-1018>`__
  (a rough sketch of the cyclic-noise source idea is given after this list)

* BibTeX::

    @inproceedings{wang2020cyclic,
      address = {ISCA},
      author = {Wang, Xin and Yamagishi, Junichi},
      booktitle = {Proc. Interspeech},
      doi = {10.21437/Interspeech.2020-1018},
      pages = {1992--1996},
      publisher = {ISCA},
      title = {{Using Cyclic Noise as the Source Signal for Neural Source-Filter-Based Speech Waveform Model}},
      url = {http://www.isca-speech.org/archive/Interspeech{\_}2020/abstracts/1018.html},
      year = {2020}
    }

* This page lists samples on the `VCTK database `_

* This page lists copy-synthesis waveform samples, i.e., waveforms generated
  given natural acoustic features. These samples were evaluated in the
  listening test

* Code is available. You need both the `CURRENNT toolkit `_ and the
  `scripts `_. `This subfolder `_ in the script repository is made for this
  project

* A new implementation based on PyTorch is also `available `_

* Slides for the Interspeech 2020 presentation can be found on `this page `_.
  You can also directly download the `PPT `__ or the `PDF `__.
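The core idea of the paper is to drive the neural source-filter model with
*cyclic noise*: noise that repeats once per pitch period with a decaying
envelope, instead of a sine-based excitation. The snippet below is only a
rough NumPy illustration of that idea, not the paper's implementation; the
helper name, the fixed decay rate ``beta``, the single shared noise burst,
and the example F0 contour are assumptions made for this sketch. In the
paper the decay is controlled by the parameter :math:`\beta` (hence the
Cno\ :math:`{}_{\beta_2}` label used below), and the source module is part
of the hn-sinc-NSF network; see the paper and the repositories above for the
actual models::

    import numpy as np

    def cyclic_noise_excitation(f0, frame_shift, sr=16000, beta=0.02, seg_len=400):
        """Hypothetical helper: pitch-synchronous decaying-noise excitation.

        f0          : frame-level F0 in Hz (0 means unvoiced)
        frame_shift : frame shift in samples (e.g., 80 samples = 5 ms at 16 kHz)
        """
        rng = np.random.default_rng(0)
        n_samples = len(f0) * frame_shift
        f0_up = np.repeat(np.asarray(f0, dtype=float), frame_shift)  # sample-level F0
        excitation = np.zeros(n_samples + seg_len)

        # One exponentially decaying noise burst, reused for every pitch cycle
        # (a simplification; the model in the paper is more flexible than this).
        noise_seg = rng.standard_normal(seg_len) * np.exp(-beta * np.arange(seg_len))

        phase = 0.0
        for t in range(n_samples):
            if f0_up[t] > 0:
                phase += f0_up[t] / sr
                if phase >= 1.0:              # a new pitch period starts here
                    phase -= 1.0
                    excitation[t:t + seg_len] += noise_seg
            else:
                phase = 0.0
                excitation[t] += 0.1 * rng.standard_normal()  # unvoiced: plain noise

        return excitation[:n_samples]

    # Example: 0.5 s of a flat 120 Hz contour, 5 ms frame shift at 16 kHz.
    excitation = cyclic_noise_excitation(np.full(100, 120.0), frame_shift=80)

Listening to (or plotting) the resulting excitation at 16 kHz gives a feel for
how pitch-synchronous noise bursts differ from a plain sine or white-noise
source signal.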
Audio samples
-------------

Models were trained speaker-independently on VCTK v0.9: 87 training speakers,
200 utterances per speaker, 16 kHz waveforms. You can find the pre-trained
models in the git repositories above.

It may take a few minutes to load all the speech samples. You can also
download all samples from `this dropbox link `_.

Seen speakers
=============

Test set samples for seen speakers (i.e., speakers who provided training data).
Each utterance is presented with four systems: Natural, WaveNet,
Sin (hn-sinc-NSF with a sine source), and Cno\ :math:`{}_{\beta_2}`
(hn-sinc-NSF with a cyclic-noise source).

Utterances: p229_290.wav, p243_368.wav, p250_305.wav, p268_264.wav,
p285_241.wav, p306_293.wav, p323_422.wav, p240_268.wav, p244_409.wav,
p255_375.wav, p270_310.wav, p297_328.wav, p311_330.wav, p347_331.wav,
p241_325.wav, p247_355.wav, p267_273.wav, p276_363.wav, p305_410.wav,
p314_320.wav.
Unseen speakers
===============

Test set samples for unseen speakers (i.e., speakers who did not provide any
training data). Each utterance is presented with the same four systems:
Natural, WaveNet, Sin (hn-sinc-NSF with a sine source), and
Cno\ :math:`{}_{\beta_2}` (hn-sinc-NSF with a cyclic-noise source).

Utterances: p251_341.wav, p253_390.wav, p254_384.wav, p257_329.wav,
p258_231.wav, p262_356.wav, p265_319.wav, p272_247.wav, p279_362.wav,
p293_282.wav, p303_317.wav, p307_336.wav, p310_359.wav, p329_256.wav,
p330_145.wav, p335_239.wav, p336_382.wav, p345_079.wav, p364_116.wav,
p374_193.wav.
.. toctree::
   :hidden:
   :maxdepth: 1