2021 IEEE International Conference on Acoustics, Speech and Signal Processing

Technical Program

Paper ID	MLSP-13.1
Paper Title	CROSS-SILO FEDERATED TRAINING IN THE CLOUD WITH DIVERSITY SCALING AND SEMI-SUPERVISED LEARNING
Authors	Kishore Nandury, Anand Mohan, Frederick Weber, Amazon, India
Session	MLSP-13: Federated Learning 2
Location	Gather.Town
Session Time	Wednesday, 09 June, 13:00 - 13:45
Presentation Time	Wednesday, 09 June, 13:00 - 13:45
Presentation	Poster
Topic	Machine Learning for Signal Processing: [MLR-DFED] Distributed/Federated learning
Abstract	Federated learning is a machine learning approach that allows a loose federation of trainers to collaboratively improve a shared model while making minimal assumptions about the central availability of data. In cross-silo federated learning, data is partitioned into silos, each with an associated trainer. This work presents results from training an end-to-end ASR model with a cross-silo federated learning system. We propose a novel aggregation algorithm that takes update diversity into account and significantly outperforms Federated Averaging (FedAvg). The system design used in this paper allows joint training with human-transcribed and semi-supervised learning (SSL) data, yielding a 7.6% relative word error rate reduction on the head test set and 13.9% on the tail test set when using 20k hours of SSL data. These gains improve further to 13.8% and 20.5%, respectively, when the SSL data is increased from 20k hours to 200k hours.
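
The abstract names FedAvg as the baseline and reports gains as relative word error rate reductions, so a minimal sketch of both may help. The abstract does not describe the proposed diversity-aware aggregation rule. The `diversity_weighted` function below is therefore a purely hypothetical stand-in (it upweights silos whose updates lie farther from the FedAvg mean) and should not be read as the authors' algorithm. All function names and example numbers are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def fed_avg(updates: Sequence[Sequence[float]],
            num_examples: Sequence[int]) -> List[float]:
    """Federated Averaging: mean of silo updates weighted by example count."""
    total = sum(num_examples)
    dim = len(updates[0])
    return [
        sum(n * u[i] for u, n in zip(updates, num_examples)) / total
        for i in range(dim)
    ]


def diversity_weighted(updates: Sequence[Sequence[float]],
                       num_examples: Sequence[int]) -> List[float]:
    """HYPOTHETICAL diversity-aware aggregation (not the paper's method).

    Each silo's example count is scaled by (1 + d_k / sum(d)), where d_k is
    the Euclidean distance of its update from the FedAvg mean.
    """
    mean = fed_avg(updates, num_examples)
    dists = [math.dist(u, mean) for u in updates]
    total_dist = sum(dists)
    if total_dist == 0.0:
        # All updates are identical, so there is no diversity to reward.
        return mean
    weights = [n * (1.0 + d / total_dist)
               for n, d in zip(num_examples, dists)]
    weight_sum = sum(weights)
    dim = len(updates[0])
    return [
        sum(w * u[i] for u, w in zip(updates, weights)) / weight_sum
        for i in range(dim)
    ]


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative word error rate reduction, in percent."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer
```

As a usage illustration with made-up numbers, `fed_avg([[1.0, 0.0], [0.0, 1.0]], [1, 3])` weights the second silo three times as heavily as the first, and a drop from a 10.0% WER to a 9.24% WER corresponds to a relative reduction of about 7.6%. The absolute WER values here are invented; the abstract reports only relative reductions.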