SPE-3: Speech Synthesis 1: Architecture |
Session Type: Poster |
Time: Tuesday, 8 June, 13:00 - 13:45 |
Location: Gather.Town |
Session Chair: Yu Zhang, Google |
SPE-3.1: WAVE-TACOTRON: SPECTROGRAM-FREE END-TO-END TEXT-TO-SPEECH SYNTHESIS |
Ron Weiss; Google |
RJ Skerry-Ryan; Google |
Eric Battenberg; Google |
Soroosh Mariooryad; Google |
Diederik Kingma; Google |
SPE-3.2: PATNET: A PHONEME-LEVEL AUTOREGRESSIVE TRANSFORMER NETWORK FOR SPEECH SYNTHESIS |
Shiming Wang; University of Science and Technology of China |
Zhenhua Ling; University of Science and Technology of China |
Ruibo Fu; Institute of Automation, Chinese Academy of Sciences |
Jiangyan Yi; Institute of Automation, Chinese Academy of Sciences |
Jianhua Tao; Institute of Automation, Chinese Academy of Sciences |
SPE-3.3: MULTI-RATE ATTENTION ARCHITECTURE FOR FAST STREAMABLE TEXT-TO-SPEECH SPECTRUM MODELING |
Qing He; Facebook Inc |
Zhiping Xiu; Facebook Inc |
Thilo Koehler; Facebook Inc |
Jilong Wu; Facebook Inc |
SPE-3.4: END-TO-END TEXT-TO-SPEECH USING LATENT DURATION BASED ON VQ-VAE |
Yusuke Yasuda; National Institute of Informatics |
Xin Wang; National Institute of Informatics |
Junichi Yamagishi; National Institute of Informatics |
SPE-3.5: LIGHTSPEECH: LIGHTWEIGHT AND FAST TEXT TO SPEECH WITH NEURAL ARCHITECTURE SEARCH |
Renqian Luo; University of Science and Technology of China |
Xu Tan; Microsoft Research Asia |
Rui Wang; Microsoft Research Asia |
Tao Qin; Microsoft Research Asia |
Jinzhu Li; Microsoft Azure Speech |
Sheng Zhao; Microsoft Azure Speech |
Enhong Chen; University of Science and Technology of China |
Tie-Yan Liu; Microsoft Research Asia |
SPE-3.6: A NEW HIGH QUALITY TRAJECTORY TILING BASED HYBRID TTS IN REAL TIME |
Feng-Long Xie; Tencent |
Xin-Hui Li; Tencent |
Wen-Chao Su; Tencent |
Li Lu; Tencent |
Frank K. Soong; Microsoft |