Speech samples for "Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language"

Authors: Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi

Note that the work in our paper uses a proprietary Japanese speech corpus with manually annotated labels. Because we cannot publicly release an exact reproduction of those experiments, this page demonstrates samples from a publicly available corpus. We plan to support additional public datasets in the future.

The source code is available at https://github.com/nii-yamagishilab/self-attention-tacotron.

For details about the systems, please refer to our paper: http://arxiv.org/abs/1810.11960.

LJSpeech (English)

1: and was used there with very little variation all through the sixteenth and seventeenth centuries, and indeed into the eighteenth.
natural Self-attention Tacotron

2: whilst Ludgate, the Giltspur Street, and Borough Compters also received them
natural Self-attention Tacotron

3: The yards were taken up with rackets and five courts, and here and there were "bumble puppy grounds," a game in which the players rolled iron balls
natural Self-attention Tacotron

4: Each ward was calculated to hold twenty-four, allowing each individual one foot and a half;
natural Self-attention Tacotron

5: The old press yard has been fully described in a previous chapter.
natural Self-attention Tacotron

6: that when they were not carousing, plotting, or scheming,
natural Self-attention Tacotron