Yamagishi Laboratory, National Institute of Informatics, Japan

researchmap
Associate members researchmap

Papers under review and papers in preparation:
"Joint Speaker Encoder and Neural Back-end Model for Fully End-to-End Automatic Speaker Verification with Multiple Enrollment Utterances"
Chang Zeng, Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi
Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint

"The PartialSpoof Database and Countermeasures for the Detection of Short Generated Audio Segments Embedded in a Speech Utterance"
Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi
Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint

"Investigating Active-learning-based Training Data Selection for Speech Spoofing Countermeasure"
Xin Wang, Junichi Yamagishi
Work in progress
Preprint

"Joint Noise Reduction and Listening Enhancement for Full-End Speech Enhancement"
Haoyu Li, Yun Liu, Junichi Yamagishi
Work in progress
Preprint

Selected accepted publications and their samples/codes:
"Mitigating the Diminishing Effect of Elastic Weight Consolidation"
Canasai Kruengkrai, Junichi Yamagishi
COLING 2022

"Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions"
Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia Tomashenko
Interspeech 2022
Preprint

"The VoiceMOS Challenge 2022"
Wen-Chin Huang, Erica Cooper, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi
Interspeech 2022
Preprint, CodaLab, website

"DDS: A new device-degraded speech dataset for speech enhancement"
Haoyu Li, Junichi Yamagishi
Interspeech 2022
Preprint, Database: DDS database (DAPS portion), DDS database (VCTK portion part 1), DDS database (VCTK portion part 2)

"Spoofing-Aware Attention based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022"
Chang Zeng, Lin Zhang, Meng Liu, Junichi Yamagishi
Interspeech 2022

"Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models"
Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia Tomashenko
ISCA Speaker Odyssey Workshop 2022
Preprint, Samples

"Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation"
Hemlata Tak, Massimiliano Todisco, Xin Wang, Jee-weon Jung, Junichi Yamagishi, Nicholas Evans
ISCA Speaker Odyssey Workshop 2022
Preprint

"Investigating self-supervised front ends for speech spoofing countermeasures"
Xin Wang, Junichi Yamagishi
ISCA Speaker Odyssey Workshop 2022
Preprint

"Master Face Attacks on Face Recognition Systems"
Huy H. Nguyen, Sébastien Marcel, Junichi Yamagishi, Isao Echizen
IEEE Transactions on Biometrics, Behavior, and Identity Science
Paper

"The VoicePrivacy 2020 Challenge: Results and findings"
Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O'Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco, Mohamed Maouche
The Special Issue on Voice Privacy (Computer Speech and Language Journal - Elsevier)
Paper, Preprint, challenge website

"SVSNet: An End-to-end Speaker Voice Similarity Assessment Model"
Cheng-Hung Hu, Yu-Huai Peng, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang
IEEE Signal Processing Letters
Preprint

"Estimating the confidence of speech spoofing countermeasure"
Xin Wang, Junichi Yamagishi
ICASSP 2022
Preprint

"Generalization Ability of MOS Prediction Networks"
Erica Cooper, Wen-Chin Huang, Tomoki Toda, Junichi Yamagishi
ICASSP 2022
Preprint

"On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis"
Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David Cox, James Glass
ICASSP 2022
Preprint

"Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances"
Chang Zeng, Xin Wang, Erica Cooper, Xiaoxiao Miao, Junichi Yamagishi
ICASSP 2022
Preprint, Codes

"Use of speaker recognition approaches for learning and evaluating embedding representations of musical instrument sounds"
Xuan Shi, Erica Cooper, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Paper

"Optimizing Tandem Speaker Verification and Anti-Spoofing Systems"
Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Paper

"Effectiveness of Detection-based and Regression-based Approaches for Estimating Mask-Wearing Ratio"
Khanh-Duy Nguyen, Huy H. Nguyen, Trung-Nghia Le, Junichi Yamagishi, Isao Echizen
The International Workshop on Face and Gesture Analysis for COVID-19 (FG4COVID19) held in conjunction with FG 2021
Preprint

"Revisiting Speech Content Privacy"
Jennifer Williams, Junichi Yamagishi, Paul-Gauthier Noé, Cassia Valentini Botinhao, Jean-François Bonastre
1st ISCA Symposium on Security and Privacy in Speech Communication
Preprint

"Benchmarking and challenges in security and privacy for voice biometrics"
Jean-François Bonastre, Héctor Delgado, Nicholas Evans, Tomi Kinnunen, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noé, Jose Patino, Md Sahidullah, Brij Mohan Lal Srivastava, Kong Aik Lee, Massimiliano Todisco, Natalia Tomashenko, Emmanuel Vincent, Xin Wang, Junichi Yamagishi
1st ISCA Symposium on Security and Privacy in Speech Communication
Preprint

"ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection"
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Héctor Delgado
The ASVspoof 2021 Workshop
Preprint, challenge website

"Multi-Task Learning in Utterance-Level and Segmental-Level Spoof Detection"
Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi
The ASVspoof 2021 Workshop
Preprint, Database

"OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild"
Trung-Nghia Le, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
ICCV 2021
Preprint, Database

"A Multi-Level Attention Model for Evidence-Based Fact Checking"
Canasai Kruengkrai, Junichi Yamagishi, Xin Wang
Findings of ACL 2021
Preprint, Code

"How do Voices from Past Speech Synthesis Challenges Compare Today?"
Erica Cooper, Junichi Yamagishi
ISCA Speech Synthesis Workshop 2021
Preprint

"Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis"
Erica Cooper, Xin Wang, Junichi Yamagishi
ISCA Speech Synthesis Workshop 2021
Preprint, Samples

"Exploring Disentanglement with Multilingual and Monolingual VQ-VAE"
Jennifer Williams, Jason Fong, Erica Cooper, Junichi Yamagishi
ISCA Speech Synthesis Workshop 2021
Preprint, Samples

"Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement"
Haoyu Li, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint, Samples, Codes

"An Initial Investigation for Detecting Partially Spoofed Audio"
Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, Jose Patino, Nicholas Evans
Interspeech 2021
Preprint, Samples, Database

"A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection"
Xin Wang, Junichi Yamagishi
Interspeech 2021
Preprint, Codes

"ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech"
Andreas Nautsch, Xin Wang, Nicholas Evans, Tomi Kinnunen, Ville Vestman, Massimiliano Todisco, Héctor Delgado, Md Sahidullah, Junichi Yamagishi, Kong Aik Lee
IEEE Transactions on Biometrics, Behavior, and Identity Science
Preprint

"End-to-End Text-to-Speech using Latent Duration based on VQ-VAE"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
ICASSP 2021
Preprint, Samples

"Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm"
Jennifer Williams, Yi Zhao, Erica Cooper, Junichi Yamagishi
ICASSP 2021
Preprint, Samples

"How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers?"
Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Junichi Yamagishi
ICASSP 2021
Preprint

"Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model"
Haoyu Li, Yang Ai, Junichi Yamagishi
IEEE SLT 2021
Preprint, Samples

"Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation"
Yang Ai, Haoyu Li, Xin Wang, Junichi Yamagishi, Zhenhua Ling
IEEE SLT 2021
Preprint, Samples

"Viable Threat on News Reading: Generating Biased News Using Natural Language Models"
Saurabh Gupta, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
NLP+CSS Workshop at EMNLP 2020
Preprint

"Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion"
Yi Zhao, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhenhua Ling, Tomoki Toda
ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020
Preprint

"Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions"
Rohan Kumar Das, Tomi Kinnunen, Wen-Chin Huang, Zhenhua Ling, Junichi Yamagishi, Yi Zhao, Xiaohai Tian, Tomoki Toda
ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020
Preprint

"An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning"
Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint

"Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences"
Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki, Junichi Yamagishi
IEEE Access
Preprint, Samples

"Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals"
Tomi Kinnunen, Héctor Delgado, Nicholas Evans, Kong Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi, Douglas A. Reynolds
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint

"Generating Master Faces for Use in Performing Wolf Attacks on Face Recognition Systems"
Huy H. Nguyen, Junichi Yamagishi, Isao Echizen, Sébastien Marcel
The 2020 International Joint Conference on Biometrics (IJCB 2020)
Preprint

"NAUTILUS: a Versatile Voice Cloning System"
Hieu-Thi Luong, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint, samples

"Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
Computer Speech and Language
Preprint, samples

"Introducing the VoicePrivacy Initiative"
Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco
Interspeech 2020
Preprint, challenge website

"The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment"
Andreas Nautsch, Jose Patino, Natalia Tomashenko, Junichi Yamagishi, Paul-Gauthier Noé, Jean-François Bonastre, Massimiliano Todisco, Nicholas Evans
Interspeech 2020
Preprint

"Design Choices for X-vector Based Speaker Anonymization"
Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet, Marc Tommasi
Interspeech 2020
Preprint

"Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction"
Yi Zhao, Haoyu Li, Cheng-I Lai, Jennifer Williams, Erica Cooper, Junichi Yamagishi
Interspeech 2020
Preprint, samples, codes

"Reverberation Modeling for Source-Filter-based Neural Vocoder"
Yang Ai, Xin Wang, Junichi Yamagishi, Zhen-Hua Ling
Interspeech 2020
Preprint, samples

"Can Speaker Augmentation Improve Multi-Speaker End-to-End TTS?"
Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Junichi Yamagishi
Interspeech 2020
Preprint, samples

"Noise Tokens: Learning Neural Noise Templates for Environment-Aware Speech Enhancement"
Haoyu Li, Junichi Yamagishi
Interspeech 2020
Preprint, samples

"Using Cyclic Noise as the Source Signal for Neural Source-Filter-based Speech Waveform Model"
Xin Wang, Junichi Yamagishi
Interspeech 2020
Preprint, code and samples

"iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning"
Haoyu Li, Szu-Wei Fu, Yu Tsao, Junichi Yamagishi
Interspeech 2020
Preprint, code and pretrained model, samples

"Security of Facial Forensics Models Against Adversarial Attacks"
Rong Huang, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
IEEE International Conference on Image Processing (ICIP 2020)
Preprint, samples

"An initial investigation on optimizing tandem speaker verification and countermeasure systems using reinforcement learning"
Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi
ISCA Odyssey 2020 The Speaker and Language Recognition Workshop
Preprint, code

"ASVspoof 2019: a large-scale public database of synthetic, converted and replayed speech"
Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Hector Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sebastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-Francois Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling
Computer Speech and Language Nov 2020
Preprint, project page, samples

"Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-based Detection"
David Ifeoluwa Adelani, Haotian Mai, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
AINA 2020
Preprint

"Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings"
Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, Junichi Yamagishi
ICASSP 2020
Preprint, samples, codes (speaker encoder), codes (tacotron)

"Transferring neural speech waveform synthesizers to musical instrument sounds generation"
Yi Zhao, Xin Wang, Lauri Juvela, Junichi Yamagishi
ICASSP 2020
Preprint, samples

"Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
ICASSP 2020
Preprint, samples


See full publications at
Google Scholar, ResearchGate, Researchmap, or Edinburgh Research Explorer.

Selected tutorials:
"Neural statistical parametric speech synthesis"
Xin Wang
ISCA Odyssey 2020 The Speaker and Language Recognition Workshop, Video

"Neural auto-regressive, source-filter and glottal vocoders for speech and music signals"
Junichi Yamagishi, Xin Wang
2020 Speech Processing Courses in Crete: Neural approaches for speech enhancement, synthesis, and coding, Video

"Tutorial on end-to-end text-to-speech synthesis"
Xin Wang, Yusuke Yasuda
Part 1 – Neural waveform modeling (slides) (video)
Part 2 – Tacotron and related end-to-end systems (slides) (video)

See other codes developed by our group at
GitHub

Call for Papers
January 8, 2021, Computer Speech and Language Special Issue on Voice Privacy
September 30, 2019, Computer Speech and Language Special Issue on Advances in Automatic Speaker Verification Anti-spoofing
July 1, 2019, Special session at ASRU 2019 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop: ASVspoof 2019: Analysing Operational Settings
May 17, 2019, SSW10 - The 10th ISCA Speech Synthesis Workshop
March 29, 2019, Special session at Interspeech 2019: The 2019 Automatic Speaker Verification Spoofing and Countermeasures Challenge: ASVspoof Challenge
November 15, 2018, CSL Special issue on Speaker and language characterization and recognition: voice modeling, conversion, synthesis and ethical aspects

Call for Participants
Call for Participants: ASVspoof 2021 CHALLENGE
Call for Participants: VoicePrivacy Challenge 2020
Call for Participants: Voice Conversion Challenge 2020
Call for Participants: ASVspoof 2019 CHALLENGE: Future horizons in spoofed/fake audio detection

New databases
October 13, 2021, OpenForensics: Multi-Face Forgery Detection And Segmentation In-The-Wild Dataset
September 20, 2021, DDS (Device-Degraded Speech) Dataset: DAPS portion, VCTK portion part 1, VCTK portion part 2
July 1, 2021, Complete DR-VCTK dataset
May 28, 2021, ASVspoof 2021 Challenge - Physical Access Database
May 28, 2021, ASVspoof 2021 Challenge - Speech Deepfake Database
May 28, 2021, ASVspoof 2021 Challenge - Logical Access Database
May 27, 2021, PartialSpoof Database - Partially Spoofed Audio Dataset for Anti-spoofing
May 27, 2021, Voice Conversion Challenge 2020 -- submitted waveforms v1.0.0
Jan 25, 2021, Human Perceptual Assessment Data on ASVspoof2019 LA Database
Dec 18, 2020, Voice Conversion Challenge 2020 database v1.0
Dec 18, 2020, Voice Conversion Challenge 2020 Listening Test Data
Nov 13, 2019, CSTR VCTK Corpus (version 0.92)
June 4, 2019, ASVspoof 2019: The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database
March 6, 2019, Alba speech corpus (Scottish female speaker, four speaking styles)

Past members
researchmap