Yamagishi Laboratory, National Institute of Informatics, Japan

researchmap
Associate members researchmap

Selected publications and their samples/codes:
"A Multi-Level Attention Model for Evidence-Based Fact Checking"
Canasai Kruengkrai, Junichi Yamagishi, Xin Wang
Findings of ACL 2021
Preprint, Code

"How do Voices from Past Speech Synthesis Challenges Compare Today?"
Erica Cooper, Junichi Yamagishi
Submitted to ISCA Speech Synthesis Workshop 2021
Preprint

"Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis"
Erica Cooper, Xin Wang, Junichi Yamagishi
Submitted to ISCA Speech Synthesis Workshop 2021
Preprint, Samples

"Exploring Disentanglement with Multilingual and Monolingual VQ-VAE"
Jennifer Williams, Jason Fong, Erica Cooper, Junichi Yamagishi
Submitted to ISCA Speech Synthesis Workshop 2021
Preprint, Samples

"Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement"
Haoyu Li, Junichi Yamagishi
Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint, Samples

"An Initial Investigation for Detecting Partially Spoofed Audio"
Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, Jose Patino, Nicholas Evans
Submitted to Interspeech 2021
Preprint, Samples, Database (coming soon)

"A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection"
Xin Wang, Junichi Yamagishi
Submitted to Interspeech 2021
Preprint, Codes

"Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances"
Chang Zeng, Xin Wang, Erica Cooper, Junichi Yamagishi
Submitted to Interspeech 2021
Preprint, Codes

"ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech"
Andreas Nautsch, Xin Wang, Nicholas Evans, Tomi Kinnunen, Ville Vestman, Massimiliano Todisco, Héctor Delgado, Md Sahidullah, Junichi Yamagishi, Kong Aik Lee
IEEE Transactions on Biometrics, Behavior, and Identity Science
Preprint

"End-to-End Text-to-Speech using Latent Duration based on VQ-VAE"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
ICASSP 2021
Preprint, Samples

"Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm"
Jennifer Williams, Yi Zhao, Erica Cooper, Junichi Yamagishi
ICASSP 2021
Preprint, Samples

"How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers?"
Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Junichi Yamagishi
ICASSP 2021
Preprint

"Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model"
Haoyu Li, Yang Ai, Junichi Yamagishi
IEEE SLT 2021
Preprint, Samples

"Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation"
Yang Ai, Haoyu Li, Xin Wang, Junichi Yamagishi, Zhenhua Ling
IEEE SLT 2021
Preprint, Samples

"Viable Threat on News Reading: Generating Biased News Using Natural Language Models"
Saurabh Gupta, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
NLP+CSS Workshop at EMNLP 2020
Preprint

"Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion"
Yi Zhao, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhenhua Ling, Tomoki Toda
ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020
Preprint

"Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions"
Rohan Kumar Das, Tomi Kinnunen, Wen-Chin Huang, Zhenhua Ling, Junichi Yamagishi, Yi Zhao, Xiaohai Tian, Tomoki Toda
ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020
Preprint

"An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning"
Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint

"Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences"
Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki, Junichi Yamagishi
IEEE Access
Preprint, Samples

"Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals"
Tomi Kinnunen, Héctor Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi, Douglas A. Reynolds
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint

"Generating Master Faces for Use in Performing Wolf Attacks on Face Recognition Systems"
Huy H. Nguyen, Junichi Yamagishi, Isao Echizen, Sébastien Marcel
The 2020 International Joint Conference on Biometrics (IJCB 2020)
Preprint

"NAUTILUS: a Versatile Voice Cloning System"
Hieu-Thi Luong, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint, samples

"Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
May 2020, Submitted to Computer Speech and Language
Preprint, samples

"Introducing the VoicePrivacy Initiative"
Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco
Interspeech 2020
Preprint, challenge website

"The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment"
Andreas Nautsch, Jose Patino, Natalia Tomashenko, Junichi Yamagishi, Paul-Gauthier Noé, Jean-François Bonastre, Massimiliano Todisco, Nicholas Evans
Interspeech 2020
Preprint

"Design Choices for X-vector Based Speaker Anonymization"
Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet, Marc Tommasi
Interspeech 2020
Preprint

"Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction"
Yi Zhao, Haoyu Li, Cheng-I Lai, Jennifer Williams, Erica Cooper, Junichi Yamagishi
Interspeech 2020
Preprint, samples, codes

"Reverberation Modeling for Source-Filter-based Neural Vocoder"
Yang Ai, Xin Wang, Junichi Yamagishi, Zhen-Hua Ling
Interspeech 2020
Preprint, samples

"Can Speaker Augmentation Improve Multi-Speaker End-to-End TTS?"
Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Junichi Yamagishi
Interspeech 2020
Preprint, samples

"Noise Tokens: Learning Neural Noise Templates for Environment-Aware Speech Enhancement"
Haoyu Li, Junichi Yamagishi
Interspeech 2020
Preprint, samples

"Using Cyclic Noise as the Source Signal for Neural Source-Filter-based Speech Waveform Model"
Xin Wang, Junichi Yamagishi
Interspeech 2020
Preprint, code and samples

"iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning"
Haoyu Li, Szu-Wei Fu, Yu Tsao, Junichi Yamagishi
Interspeech 2020
Preprint, code and pretrained model, samples

"Security of Facial Forensics Models Against Adversarial Attacks"
Rong Huang, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
IEEE International Conference on Image Processing (ICIP 2020)
Preprint, samples

"An initial investigation on optimizing tandem speaker verification and countermeasure systems using reinforcement learning"
Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi
ISCA Odyssey 2020 The Speaker and Language Recognition Workshop
Preprint, code

"ASVspoof 2019: a large-scale public database of synthetic, converted and replayed speech"
Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling
Computer Speech and Language, Nov 2020
Preprint, project page, samples

"Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-based Detection"
David Ifeoluwa Adelani, Haotian Mai, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
AINA 2020
Preprint

"Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings"
Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, Junichi Yamagishi
ICASSP 2020
Preprint, samples, codes (speaker encoder), codes (tacotron)

"Transferring neural speech waveform synthesizers to musical instrument sounds generation"
Yi Zhao, Xin Wang, Lauri Juvela, Junichi Yamagishi
ICASSP 2020
Preprint, samples

"Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
ICASSP 2020
Preprint, samples


See full publications at
Google Scholar, ResearchGate, Researchmap, or Edinburgh Research Explorer.

Selected tutorials:
"Neural statistical parametric speech synthesis"
Xin Wang
ISCA Odyssey 2020 The Speaker and Language Recognition Workshop, Video

"Neural auto-regressive, source-filter and glottal vocoders for speech and music signals"
Junichi Yamagishi, Xin Wang
2020 Speech Processing Courses in Crete: Neural approaches for speech enhancement, synthesis, and coding, Video

"Tutorial on end-to-end text-to-speech synthesis"
Xin Wang, Yusuke Yasuda
Part 1 – Neural waveform modeling (slides)(video)
Part 2 – Tacotron and related end-to-end systems (slides)(video)

See other codes developed by our group at
GitHub

Call for Papers
January 8, 2021, Computer Speech and Language Special Issue on Voice Privacy
September 30, 2019, Computer Speech and Language Special Issue on Advances in Automatic Speaker Verification Anti-spoofing
July 1, 2019, Special session at ASRU 2019 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop: ASVspoof 2019: Analysing Operational Settings
May 17, 2019, SSW10 - The 10th ISCA Speech Synthesis Workshop
March 29, 2019, Special session at Interspeech 2019: The 2019 Automatic Speaker Verification Spoofing and Countermeasures Challenge: ASVspoof Challenge
November 15, 2018, CSL Special issue on Speaker and language characterization and recognition: voice modeling, conversion, synthesis and ethical aspects

Call for Participants
Call for Participants: ASVspoof 2021 CHALLENGE
Call for Participants: VoicePrivacy Challenge 2020
Call for Participants: Voice Conversion Challenge 2020
Call for Participants: ASVspoof 2019 CHALLENGE: Future horizons in spoofed/fake audio detection

New databases
May 28, 2021, ASVspoof 2021 Challenge - Physical Access Database
May 28, 2021, ASVspoof 2021 Challenge - Speech Deepfake Database
May 28, 2021, ASVspoof 2021 Challenge - Logical Access Database
May 27, 2021, PartialSpoof Database - Partially Spoofed Audio Dataset for Anti-spoofing
May 27, 2021, Voice Conversion Challenge 2020 -- submitted waveforms v1.0.0
Jan 25, 2021, Human Perceptual Assessment Data on ASVspoof2019 LA Database
Dec 18, 2020, Voice Conversion Challenge 2020 database v1.0
Dec 18, 2020, Voice Conversion Challenge 2020 Listening Test Data
Nov 13, 2019, CSTR VCTK Corpus (version 0.92)
June 4, 2019, ASVspoof 2019: The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database
March 6, 2019, Alba speech corpus (Scottish female speaker, four speaking styles)

Past members
researchmap