SPE-33: Speech Synthesis 5: Prosody & Style |
Session Type: Poster |
Time: Thursday, 10 June, 13:00 - 13:45 |
Location: Gather.Town |
Session Chair: Hung-yi Lee, National Taiwan University
SPE-33.1: SPEECH BERT EMBEDDING FOR IMPROVING PROSODY IN NEURAL TTS |
Liping Chen; Microsoft |
Yan Deng; Microsoft |
Xi Wang; Microsoft |
Frank K. Soong; Microsoft |
Lei He; Microsoft |
|
SPE-33.2: BI-LEVEL STYLE AND PROSODY DECOUPLING MODELING FOR PERSONALIZED END-TO-END SPEECH SYNTHESIS |
Ruibo Fu; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences |
Jianhua Tao; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences |
Zhengqi Wen; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences |
Jiangyan Yi; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences |
Tao Wang; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences |
Chunyu Qiang; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences |
|
SPE-33.3: PROSODIC REPRESENTATION LEARNING AND CONTEXTUAL SAMPLING FOR NEURAL TEXT-TO-SPEECH |
Sri Karlapati; Amazon |
Ammar Abbas; Amazon |
Zack Hodari; University of Edinburgh |
Alexis Moinet; Amazon |
Arnaud Joly; Amazon |
Penny Karanasou; Amazon |
Thomas Drugman; Amazon |
|
SPE-33.4: CAMP: A TWO-STAGE APPROACH TO MODELLING PROSODY IN CONTEXT |
Zack Hodari; University of Edinburgh |
Alexis Moinet; Amazon |
Sri Karlapati; Amazon |
Jaime Lorenzo-Trueba; Amazon |
Thomas Merritt; Amazon |
Arnaud Joly; Amazon |
Ammar Abbas; Amazon |
Penny Karanasou; Amazon |
Thomas Drugman; Amazon |
|
SPE-33.5: UNSUPERVISED LEARNING FOR MULTI-STYLE SPEECH SYNTHESIS WITH LIMITED DATA |
Shuang Liang; Ping An Technology |
Chenfeng Miao; Ping An Technology |
Minchuan Chen; Ping An Technology |
Jun Ma; Ping An Technology |
Shaojun Wang; Ping An Technology |
Jing Xiao; Ping An Technology |
|
SPE-33.6: FASTPITCH: PARALLEL TEXT-TO-SPEECH WITH PITCH PREDICTION |
Adrian Łańcucki; NVIDIA Corporation |
|