.. samples-xin documentation master file, created by
   sphinx-quickstart on Sun Apr 25 22:58:24 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

.. _label-nsf-v3:

Hn-sinc-NSF
***********

Messages
--------

* Paper: Wang, X. & Yamagishi, J. Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis. in Proc. SSW 1–6 (ISCA, 2019). `doi:10.21437/SSW.2019-1 `__

* BibTeX::

    @inproceedings{Wang2019,
    address = {ISCA},
    author = {Wang, Xin and Yamagishi, Junichi},
    booktitle = {Proc. SSW},
    doi = {10.21437/SSW.2019-1},
    pages = {1--6},
    publisher = {ISCA},
    title = {{Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis}},
    url = {http://www.isca-speech.org/archive/SSW{\_}2019/abstracts/SSW10{\_}O{\_}1-1.html},
    year = {2019}
    }

* Experiments were based on the ATR-Ximera F009 voice (Japanese, commercial database).

* Code is available. You need both the `CURRENNT toolkit `_ and `scripts `_. `This subfolder `_ in the script repository is for this project.

* A new implementation based on PyTorch is also `available `_;

* Slides for the SSW 2019 presentation can be found on `this page `_. You may also directly download the `PDF `__;

* Note that

  * Copy-synthesis refers to waveform generation given natural acoustic features;
  * Text-to-speech refers to waveform generation given acoustic features predicted from the text input.

|

Audio samples
-------------

Natural waveform samples cannot be released online due to licensing restrictions.

Utterance: _AOZORAR_09534_T01

+-----------------------------+---------+--------+-------------+-------------+-------------+
| Mel-spec. + F0 (15-hr data) | WaveNet | hn-NSF | h-sinc1-NSF | h-sinc2-NSF | h-sinc3-NSF |
+=============================+=========+========+=============+=============+=============+
| Copy-synthesis:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+
| Text-to-speech:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+



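Utterance: _AOZORAR_03372_T01

+-----------------------------+---------+--------+-------------+-------------+-------------+
| Mel-spec. + F0 (15-hr data) | WaveNet | hn-NSF | h-sinc1-NSF | h-sinc2-NSF | h-sinc3-NSF |
+=============================+=========+========+=============+=============+=============+
| Copy-synthesis:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+
| Text-to-speech:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+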

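Utterance: _NIKKEIR_03132_T01

+-----------------------------+---------+--------+-------------+-------------+-------------+
| Mel-spec. + F0 (15-hr data) | WaveNet | hn-NSF | h-sinc1-NSF | h-sinc2-NSF | h-sinc3-NSF |
+=============================+=========+========+=============+=============+=============+
| Copy-synthesis:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+
| Text-to-speech:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+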


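Utterance: _NIKKEIR_00257_T01

+-----------------------------+---------+--------+-------------+-------------+-------------+
| Mel-spec. + F0 (15-hr data) | WaveNet | hn-NSF | h-sinc1-NSF | h-sinc2-NSF | h-sinc3-NSF |
+=============================+=========+========+=============+=============+=============+
| Copy-synthesis:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+
| Text-to-speech:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+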



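Utterance: _BTEC_00312_T01

+-----------------------------+---------+--------+-------------+-------------+-------------+
| Mel-spec. + F0 (15-hr data) | WaveNet | hn-NSF | h-sinc1-NSF | h-sinc2-NSF | h-sinc3-NSF |
+=============================+=========+========+=============+=============+=============+
| Copy-synthesis:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+
| Text-to-speech:             |         |        |             |             |             |
+-----------------------------+---------+--------+-------------+-------------+-------------+

.. toctree::
   :hidden:
   :maxdepth: 1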