Audio Samples for MIDI-to-Audio Synthesis

Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems?

Authors: Xuan Shi, Erica Cooper, Xin Wang, Junichi Yamagishi, Shrikanth Narayanan

Submitted to ICASSP 2023.

Preprint: arXiv:2211.13868
Open-source code

Notes:

Natural audio samples are from the MAESTRO Dataset V2.0.0, which is made available by Google LLC under a Creative Commons Attribution Non-Commercial Share-Alike 4.0 (CC BY-NC-SA 4.0) license. Please cite the following paper if you use the MAESTRO dataset:

Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, and Douglas Eck. "Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset." In International Conference on Learning Representations, 2019.

We used the open-source software Fluidsynth and the commercial software Pianoteq as reference systems. Fluidsynth is distributed under the GNU Lesser General Public License.

Audio samples on this website other than those from MAESTRO are distributed under a Creative Commons Attribution Non-Commercial Share-Alike 4.0 (CC BY-NC-SA 4.0) license.

List of systems

Audio Samples

Natural
Fluidsynth
Pianoteq
abs-mfbf-nsfs
taco-mfbf-nsfs
abs-mfb-nsfs
abs-mfb-nsf
abs-mfb-nsfg
abs-mfb-hfg
taco-mfb-nsfs
taco-mfb-nsf
taco-mfb-nsfg
taco-mfb-hfg
trans-mfb-nsfs
trans-mfb-nsf
trans-mfb-nsfg
trans-mfb-hfg
joint-nsf
joint-nsfg
joint-hfg