We warmly invite you to participate in the “VoiceMOS Challenge”, which we plan to propose as a special session at INTERSPEECH 2022.
The purpose of this challenge is to compare different systems and approaches on the task of predicting the mean opinion score (MOS) of synthetic speech. Given the recent interest in data-driven, machine-learning approaches to MOS prediction, a challenge that encourages research in this area is timely and important. We recently collected a large-scale dataset of MOS ratings for a wide variety of text-to-speech and voice conversion systems spanning many years. The challenge releases this data to the public for the first time.
The challenge will be divided into a main track and an “out-of-domain” (OOD) sub-track. Participants in both tracks will be provided with a shared training and validation set curated from the above-mentioned dataset. The main track will use a test set from the same corpus, and thus amounts to in-domain evaluation. The OOD sub-track, by contrast, will be evaluated on datasets from a separate listening test on different audio samples. We plan to use CodaLab (https://competitions.codalab.org/) as the competition platform, and we will provide a few baseline scripts for training MOS prediction systems and generating sample submissions.
A tentative schedule is as follows:
As with many challenges, there is no participation fee. If you are interested in participating in the challenge and writing a paper for the special session, please contact us at voicemos2022@nii.ac.jp.
Please feel free to contact us at voicemos2022@nii.ac.jp if you have any questions.
We look forward to hearing from you.
Regards,
The VoiceMOS Challenge organizing committee
Erica Cooper, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Junichi Yamagishi, and Tomoki Toda