Junichi Yamagishi

Professor

National Institute of Informatics

Latest

OpenForensics: Large-Scale Challenging Dataset for Multi-Face Forgery Detection and Segmentation In-the-Wild
Revisiting Speech Content Privacy
A Multi-Level Attention Model for Evidence-Based Fact Checking
A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection
An Initial Investigation for Detecting Partially Spoofed Audio
ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech
ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection
Capsule-Forensics Networks for Deepfake Detection
Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model
Exploring Disentanglement with Multilingual and Monolingual VQ-VAE
How do Voices from Past Speech Synthesis Challenges Compare Today?
How Similar or Different is Rakugo Speech Synthesizer to Professional Performers?
Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm
Multi-Metric Optimization Using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement
Multi-Task Learning in Utterance-Level and Segmental-Level Spoof Detection
Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis
Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing
Viable Threat on News Reading: Generating Biased News Using Natural Language Models
ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech
Can Speaker Augmentation Improve Multi-Speaker End-to-End TTS?
Design Choices for X-Vector Based Speaker Anonymization
Generating Master Faces for Use in Performing Wolf Attacks on Face Recognition Systems
Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-Based Detection
iMetricGAN: Intelligibility Enhancement for Speech-in-Noise Using Generative Adversarial Network-Based Metric Learning
Introducing the VoicePrivacy Initiative
Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences
NAUTILUS: A Versatile Voice Cloning System
Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis
Noise Tokens: Learning Neural Noise Templates for Environment-Aware Speech Enhancement
Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions
Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals
The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment
Transferring Neural Speech Waveform Synthesizers to Musical Instrument Sounds Generation
Using Cyclic Noise as the Source Signal for Neural Source-Filter-Based Speech Waveform Model
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Zero-Shot Multi-Speaker Text-To-Speech with State-Of-The-Art Neural Speaker Embeddings
ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection
Bootstrapping Non-Parallel Voice Conversion from Speaker-Adaptive Text-to-Speech
Capsule-forensics: Using Capsule Networks to Detect Forged Images and Videos
Investigation of Enhanced Tacotron Text-to-speech Synthesis Systems with Self-attention for Pitch Accent Language
Joint Training Framework for Text-to-Speech and Voice Conversion Using Multi-Source Tacotron and WaveNet
MOSNet: Deep Learning based Objective Assessment for Voice Conversion
Multi-task Learning for Detecting and Segmenting Manipulated Facial Images and Videos
Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis
Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis
Rakugo speech synthesis using segment-to-segment neural transduction and style tokens: toward speech synthesis for entertaining audiences
Speaker Anonymization Using X-vector and Neural Waveform Models