Yamagishi Laboratory, National Institute of Informatics, Japan

researchmap
Associate members researchmap

Selected publications and their samples/codes:
"Generating Master Faces for Use in Performing Wolf Attacks on Face Recognition Systems"
Huy H. Nguyen, Junichi Yamagishi, Isao Echizen, Sébastien Marcel
June 2020, Accepted for IJCB 2020
Preprint

"NAUTILUS: a Versatile Voice Cloning System"
Hieu-Thi Luong, Junichi Yamagishi
May 2020, Submitted to The IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint, samples

"Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
May 2020, Submitted to Computer Speech and Language
Preprint, samples

"Introducing the VoicePrivacy Initiative"
Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco
May 2020, Submitted to Interspeech 2020
Preprint, challenge website

"The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment"
Andreas Nautsch, Jose Patino, Natalia Tomashenko, Junichi Yamagishi, Paul-Gauthier Noé, Jean-François Bonastre, Massimiliano Todisco, Nicholas Evans
May 2020, Submitted to Interspeech 2020
Preprint

"Design Choices for X-vector Based Speaker Anonymization"
Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet, Marc Tommasi
May 2020, Submitted to Interspeech 2020
Preprint

"Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction"
Yi Zhao, Haoyu Li, Cheng-I Lai, Jennifer Williams, Erica Cooper, Junichi Yamagishi
May 2020, Submitted to Interspeech 2020
Preprint, samples, codes

"Reverberation Modeling for Source-Filter-based Neural Vocoder"
Yang Ai, Xin Wang, Junichi Yamagishi, Zhen-Hua Ling
May 2020, Submitted to Interspeech 2020
Preprint, samples

"Can Speaker Augmentation Improve Multi-Speaker End-to-End TTS?"
Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Junichi Yamagishi
May 2020, Submitted to Interspeech 2020
Preprint, samples

"Noise Tokens: Learning Neural Noise Templates for Environment-Aware Speech Enhancement"
Haoyu Li, Junichi Yamagishi
April 2020, Submitted to Interspeech 2020
Preprint, samples

"Using Cyclic Noise as the Source Signal for Neural Source-Filter-based Speech Waveform Model"
Xin Wang, Junichi Yamagishi
April 2020, Submitted to Interspeech 2020
Preprint, code and samples

"iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning"
Haoyu Li, Szu-Wei Fu, Yu Tsao, Junichi Yamagishi
April 2020, Submitted to Interspeech 2020
Preprint, code and pretrained model, samples

"Security of Facial Forensics Models Against Adversarial Attacks"
Rong Huang, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
IEEE International Conference on Image Processing (ICIP 2020)
Preprint, samples

"An initial investigation on optimizing tandem speaker verification and countermeasure systems using reinforcement learning"
Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi
ISCA Odyssey 2020 The Speaker and Language Recognition Workshop
Preprint, code

"ASVspoof 2019: a large-scale public database of synthetic, converted and replayed speech"
Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Hector Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sebastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-Francois Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling
Computer Speech and Language (in press)
Preprint, project page, samples

"Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-based Detection"
David Ifeoluwa Adelani, Haotian Mai, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
AINA 2020
Preprint

"Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings"
Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, Junichi Yamagishi
ICASSP 2020
Preprint, samples, codes (speaker encoder), codes (tacotron)

"Transferring neural speech waveform synthesizers to musical instrument sounds generation"
Yi Zhao, Xin Wang, Lauri Juvela, Junichi Yamagishi
ICASSP 2020
Preprint, samples

"Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
ICASSP 2020
Preprint, samples

"Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech"
Hieu-Thi Luong, Junichi Yamagishi
IEEE ASRU 2019
Preprint, samples

"Multi-task Learning For Detecting and Segmenting Manipulated Facial Images and Videos"
Huy H. Nguyen, Fuming Fang, Junichi Yamagishi, Isao Echizen
BTAS 2019
Preprint, Demo video, Codes

"Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments"
Yusuke Yasuda, Xin Wang, Junichi Yamagishi
10th ISCA Speech Synthesis Workshop (SSW10)
Preprint, samples

"Rakugo speech synthesis using segment-to-segment neural transduction and style tokens — toward speech synthesis for entertaining audiences"
Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki, Junichi Yamagishi
10th ISCA Speech Synthesis Workshop (SSW10)
Preprint, samples

"Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis"
Xin Wang, Junichi Yamagishi
10th ISCA Speech Synthesis Workshop (SSW10)
Preprint, samples and codes

"Speaker Anonymization Using X-vector and Neural Waveform Models"
Fuming Fang, Xin Wang, Junichi Yamagishi, Isao Echizen, Massimiliano Todisco, Nicholas Evans, Jean-François Bonastre
10th ISCA Speech Synthesis Workshop (SSW10)
Preprint, samples

"Neural source-filter waveform models for statistical parametric speech synthesis"
Xin Wang, Shinji Takaki, Junichi Yamagishi
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Preprint, samples and codes

"ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection"
Massimiliano Todisco, Xin Wang, Ville Vestman, Md Sahidullah, Hector Delgado, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee
Interspeech 2019
Preprint, challenge website, database

"GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram"
Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku
Interspeech 2019
Preprint, samples

"Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet"
Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li, Junichi Yamagishi
Interspeech 2019
Preprint, samples

"MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, Xin Wang, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang
Interspeech 2019
Preprint, Codes

"Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise -"
Yi Zhao, Atsushi Ando, Shinji Takaki, Junichi Yamagishi, Satoshi Kobashikawa
Interspeech 2019
Preprint, samples

"Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora"
Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa
Interspeech 2019
Preprint, samples

"Spatio-temporal generative adversarial network for gait anonymization"
Ngoc-Dung T. Tieu, Huy H. Nguyen, Hoang-Quoc Nguyen-Son, Junichi Yamagishi, Isao Echizen
Journal of Information Security and Applications
Preprint

"Introduction to Voice Presentation Attack Detection and Recent Advances"
Md Sahidullah, Hector Delgado, Massimiliano Todisco, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi, Kong-Aik Lee
Book chapter in Handbook of Biometric Anti-Spoofing: Presentation Attack Detection (Second Edition)
Preprint

"Neural source-filter-based waveform model for statistical parametric speech synthesis"
Xin Wang, Shinji Takaki, Junichi Yamagishi
ICASSP 2019
Preprint, samples and codes

"STFT spectral loss for training a neural speech waveform model"
Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi
ICASSP 2019
Preprint, samples, codes

"Capsule-Forensics: Using Capsule Networks to Detect Forged Images and Videos"
Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
ICASSP 2019
Preprint, Demo video, Codes

"Attentive Filtering Networks for Audio Replay Attack Detection"
Cheng-I Lai, Alberto Abad, Korin Richmond, Junichi Yamagishi, Najim Dehak, Simon King
ICASSP 2019
Preprint, Codes

"Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language"
Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi
ICASSP 2019
Preprint, Codes (Tacotron with self attention), Codes (Tacotron2)

"Audiovisual speaker conversion: jointly and simultaneously transforming facial expression and acoustic characteristics"
Fuming Fang, Xin Wang, Junichi Yamagishi, Isao Echizen
ICASSP 2019
Preprint, samples

"Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks"
Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku
ICASSP 2019
Preprint, samples, codes

See full publications at
Google Scholar, ResearchGate, Researchmap, or Edinburgh Research Explorer.

Selected tutorials:
"Tutorial on end-to-end text-to-speech synthesis"
Xin Wang, Yusuke Yasuda
Part 1 – Neural waveform modeling (slides) (video)
Part 2 – Tacotron and related end-to-end systems (slides) (video)

See other codes developed by our group at
GitHub

Call for Papers
January 8, 2021, Computer Speech and Language Special Issue on Voice Privacy
September 30, 2019, Computer Speech and Language Special Issue on Advances in Automatic Speaker Verification Anti-spoofing
July 1, 2019, Special session at ASRU 2019 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop: ASVspoof 2019: Analysing Operational Settings
May 17, 2019, SSW10 - The 10th ISCA Speech Synthesis Workshop
March 29, 2019, Special session at Interspeech 2019: The 2019 Automatic Speaker Verification Spoofing and Countermeasures Challenge: ASVspoof Challenge
November 15, 2018, CSL Special issue on Speaker and language characterization and recognition: voice modeling, conversion, synthesis and ethical aspects

Call for Participants
Call for Participants: VoicePrivacy Challenge 2020
Call for Participants: Voice Conversion Challenge 2020
Call for Participants: ASVspoof 2019 CHALLENGE: Future horizons in spoofed/fake audio detection

New databases
Nov 13, 2019, CSTR VCTK Corpus (version 0.92)
June 4, 2019, ASVspoof 2019: The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database
March 6, 2019, Alba speech corpus (Scottish female speaker, four speaking styles)

Past members
researchmap