Yamagishi Laboratory, National Institute of Informatics, Japan

researchmap
Associate members researchmap

Papers under review and papers in preparation:
"Hiding speaker's sex in speech using zero-evidence speaker representation in an analysis/synthesis pipeline"
Paul-Gauthier Noé, Xiaoxiao Miao, Xin Wang, Junichi Yamagishi, Jean-François Bonastre, Driss Matrouf
Submitted to ICASSP 2023
Preprint, Codes and samples, Pre-trained models

"Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems?"
Xuan Shi, Erica Cooper, Xin Wang, Junichi Yamagishi, Shrikanth Narayanan
Submitted to ICASSP 2023
Preprint, Samples, Pre-trained models

"Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders"
Xin Wang, Junichi Yamagishi
Submitted to ICASSP 2023
Preprint

"Joint Speaker Encoder and Neural Back-end Model for Fully End-to-End Automatic Speaker Verification with Multiple Enrollment Utterances"
Chang Zeng, Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi
Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint

"Joint Noise Reduction and Listening Enhancement for Full-End Speech Enhancement"
Haoyu Li, Yun Liu, Junichi Yamagishi
Work in progress
Preprint

Selected accepted publications and their samples/codes:
"The PartialSpoof Database and Countermeasures for the Detection of Short Generated Audio Segments Embedded in a Speech Utterance"
Lin Zhang, Xin Wang, Erica Cooper, Nicholas Evans, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint

"Outlier-Aware Training for Improving Group Accuracy Disparities"
Li-Kuang Chen, Canasai Kruengkrai, Junichi Yamagishi
AACL-IJCNLP 2022 Student Research Workshop (SRW)
Preprint, Codes, Pre-trained models

"Investigating Active-learning-based Training Data Selection for Speech Spoofing Countermeasure"
Xin Wang, Junichi Yamagishi
The 2022 IEEE Spoken Language Technology Workshop (SLT 2022)
Preprint

"Mitigating the Diminishing Effect of Elastic Weight Consolidation"
Canasai Kruengkrai, Junichi Yamagishi
COLING 2022
PDF, Codes

"Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions"
Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia Tomashenko
Interspeech 2022
Preprint, Samples

"The VoiceMOS Challenge 2022"
Wen-Chin Huang, Erica Cooper, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi
Interspeech 2022
Preprint, CodaLab, website

"DDS: A new device-degraded speech dataset for speech enhancement"
Haoyu Li, Junichi Yamagishi
Interspeech 2022
Preprint, Database: DDS database (DAPS portion), DDS database (VCTK portion part 1), DDS database (VCTK portion part 2)

"Spoofing-Aware Attention based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022"
Chang Zeng, Lin Zhang, Meng Liu, Junichi Yamagishi
Interspeech 2022
Preprint

"Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models"
Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia Tomashenko
ISCA Speaker Odyssey Workshop 2022
Preprint, Samples, Codes

"Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation"
Hemlata Tak, Massimiliano Todisco, Xin Wang, Jee-weon Jung, Junichi Yamagishi, Nicholas Evans
ISCA Speaker Odyssey Workshop 2022
Preprint

"Investigating self-supervised front ends for speech spoofing countermeasures"
Xin Wang, Junichi Yamagishi
ISCA Speaker Odyssey Workshop 2022
Preprint

"Master Face Attacks on Face Recognition Systems"
Huy H. Nguyen, Sébastien Marcel, Junichi Yamagishi, Isao Echizen
IEEE Transactions on Biometrics, Behavior, and Identity Science
Paper

"The VoicePrivacy 2020 Challenge: Results and findings"
Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O'Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco, Mohamed Maouche
The Special Issue on Voice Privacy (Computer Speech and Language Journal - Elsevier)
Paper, Preprint, challenge website

"SVSNet: An End-to-end Speaker Voice Similarity Assessment Model"
Cheng-Hung Hu, Yu-Huai Peng, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang
IEEE Signal Processing Letters
Preprint

"Estimating the confidence of speech spoofing countermeasure"
Xin Wang, Junichi Yamagishi
ICASSP 2022
Preprint

"Generalization Ability of MOS Prediction Networks"
Erica Cooper, Wen-Chin Huang, Tomoki Toda, Junichi Yamagishi
ICASSP 2022
Preprint, Codes

"On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis"
Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David Cox, James Glass
ICASSP 2022
Preprint, Samples

"Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances"
Chang Zeng, Xin Wang, Erica Cooper, Xiaoxiao Miao, Junichi Yamagishi
ICASSP 2022
Preprint, Codes

"Use of speaker recognition approaches for learning and evaluating embedding representations of musical instrument sounds"
Xuan Shi, Erica Cooper, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Paper

"Optimizing Tandem Speaker Verification and Anti-Spoofing Systems"
Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Paper

"Effectiveness of Detection-based and Regression-based Approaches for Estimating Mask-Wearing Ratio"
Khanh-Duy Nguyen, Huy H. Nguyen, Trung-Nghia Le, Junichi Yamagishi, Isao Echizen
The International Workshop on Face and Gesture Analysis for COVID-19 (FG4COVID19) held in conjunction with FG 2021
Preprint

"Revisiting Speech Content Privacy"
Jennifer Williams, Junichi Yamagishi, Paul-Gauthier Noé, Cassia Valentini-Botinhao, Jean-François Bonastre
1st ISCA Symposium on Security and Privacy in Speech Communication
Preprint

"Benchmarking and challenges in security and privacy for voice biometrics"
Jean-François Bonastre, Héctor Delgado, Nicholas Evans, Tomi Kinnunen, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noé, Jose Patino, Md Sahidullah, Brij Mohan Lal Srivastava, Kong Aik Lee, Massimiliano Todisco, Natalia Tomashenko, Emmanuel Vincent, Xin Wang, Junichi Yamagishi
1st ISCA Symposium on Security and Privacy in Speech Communication
Preprint

"ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection"
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Héctor Delgado
The ASVspoof 2021 Workshop
Preprint, challenge website

"Multi-Task Learning in Utterance-Level and Segmental-Level Spoof Detection"
Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi
The ASVspoof 2021 Workshop
Preprint, Database

"OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild"
Trung-Nghia Le, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
ICCV 2021
Preprint, Database

"A Multi-Level Attention Model for Evidence-Based Fact Checking"
Canasai Kruengkrai, Junichi Yamagishi, Xin Wang
Findings of ACL 2021
Preprint, Code

"How do Voices from Past Speech Synthesis Challenges Compare Today?"
Erica Cooper, Junichi Yamagishi
ISCA Speech Synthesis Workshop 2021
Preprint

"Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis"
Erica Cooper, Xin Wang, Junichi Yamagishi
ISCA Speech Synthesis Workshop 2021
Preprint, Samples

"Exploring Disentanglement with Multilingual and Monolingual VQ-VAE"
Jennifer Williams, Jason Fong, Erica Cooper, Junichi Yamagishi
ISCA Speech Synthesis Workshop 2021
Preprint, Samples

"Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement"
Haoyu Li, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint, Samples, Codes

"An Initial Investigation for Detecting Partially Spoofed Audio"
Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, Jose Patino, Nicholas Evans
Interspeech 2021
Preprint, Samples, Database

"A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection"
Xin Wang, Junichi Yamagishi
Interspeech 2021
Preprint, Codes

"ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech"
Andreas Nautsch, Xin Wang, Nicholas Evans, Tomi Kinnunen, Ville Vestman, Massimiliano Todisco, Héctor Delgado, Md Sahidullah, Junichi Yamagishi, Kong Aik Lee
IEEE Transactions on Biometrics, Behavior, and Identity Science
Preprint

"End-to-End Text-to-Speech using Latent Duration based on VQ-VAE"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
ICASSP 2021
Preprint, Samples

"Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm"
Jennifer Williams, Yi Zhao, Erica Cooper, Junichi Yamagishi
ICASSP 2021
Preprint, Samples

"How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers?"
Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Junichi Yamagishi
ICASSP 2021
Preprint

"Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model"
Haoyu Li, Yang Ai, Junichi Yamagishi
IEEE SLT 2021
Preprint, Samples

"Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation"
Yang Ai, Haoyu Li, Xin Wang, Junichi Yamagishi, Zhenhua Ling
IEEE SLT 2021
Preprint, Samples


See full publications at
Google Scholar, ResearchGate, Researchmap, or Edinburgh Research Explorer.

Selected tutorials:
"Neural statistical parametric speech synthesis"
Xin Wang
ISCA Odyssey 2020 The Speaker and Language Recognition Workshop, Video

"Neural auto-regressive, source-filter and glottal vocoders for speech and music signals"
Junichi Yamagishi, Xin Wang
2020 Speech Processing Courses in Crete: Neural approaches for speech enhancement, synthesis, and coding, Video

"Tutorial on end-to-end text-to-speech synthesis"
Xin Wang, Yusuke Yasuda
Part 1 – Neural waveform modeling (slides) (video)
Part 2 – Tacotron and related end-to-end systems (slides) (video)

See other codes developed by our group at
GitHub

Call for Papers
January 8, 2021, Computer Speech and Language Special Issue on Voice Privacy
September 30, 2019, Computer Speech and Language Special Issue on Advances in Automatic Speaker Verification Anti-spoofing
July 1, 2019, Special session at ASRU 2019 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop: ASVspoof 2019: Analysing Operational Settings
May 17, 2019, SSW10 - The 10th ISCA Speech Synthesis Workshop
March 29, 2019, Special session at Interspeech 2019: The 2019 Automatic Speaker Verification Spoofing and Countermeasures Challenge: ASVspoof Challenge
November 15, 2018, CSL Special issue on Speaker and language characterization and recognition: voice modeling, conversion, synthesis and ethical aspects

Call for Participants
Call for Participants: ASVspoof 2021 CHALLENGE
Call for Participants: VoicePrivacy Challenge 2020
Call for Participants: Voice Conversion Challenge 2020
Call for Participants: ASVspoof 2019 CHALLENGE: Future horizons in spoofed/fake audio detection

New databases
October 13, 2021, OpenForensics: Multi-Face Forgery Detection And Segmentation In-The-Wild Dataset
September 20, 2021, DDS (Device-Degraded Speech) Dataset: DAPS portion, VCTK portion part 1, VCTK portion part 2
July 1, 2021, Complete DR-VCTK dataset
May 28, 2021, ASVspoof 2021 Challenge - Physical Access Database
May 28, 2021, ASVspoof 2021 Challenge - Speech Deepfake Database
May 28, 2021, ASVspoof 2021 Challenge - Logical Access Database
May 27, 2021, PartialSpoof Database - Partially Spoofed Audio Dataset for Anti-spoofing
May 27, 2021, Voice Conversion Challenge 2020 -- submitted waveforms v1.0.0
Jan 25, 2021, Human Perceptual Assessment Data on ASVspoof2019 LA Database
Dec 18, 2020, Voice Conversion Challenge 2020 database v1.0
Dec 18, 2020, Voice Conversion Challenge 2020 Listening Test Data
Nov 13, 2019, CSTR VCTK Corpus (version 0.92)
June 4, 2019, ASVspoof 2019: The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database
March 6, 2019, Alba speech corpus (Scottish female speaker, four speaking styles)

Past members
researchmap