2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

IEEE Signal Processing Society

Institute of Electrical and Electronics Engineers (IEEE)

Technical Program

Paper Detail

Paper ID	MLSP-25.6
Paper Title	ROBUST MAML: PRIORITIZATION TASK BUFFER WITH ADAPTIVE LEARNING PROCESS FOR MODEL-AGNOSTIC META-LEARNING
Authors	Thanh Nguyen, Tung Luu, Trung Pham, Sanzhar Rakhimkul, Chang Dong Yoo, Korea Advanced Institute of Science and Technology (KAIST), South Korea
Session	MLSP-25: Reinforcement Learning 1
Location	Gather.Town
Session Time	Thursday, 10 June, 13:00 - 13:45
Presentation Time	Thursday, 10 June, 13:00 - 13:45
Presentation	Poster
Topic	Machine Learning for Signal Processing: [MLR-REI] Reinforcement learning
Abstract	Model-agnostic meta-learning (MAML) is a popular state-of-the-art meta-learning algorithm that learns a good weight initialization for a model from a variety of learning tasks. A model initialized with these weights can be fine-tuned to an unseen task using only a small number of samples and a few adaptation steps. MAML is simple and versatile, but it requires costly learning-rate tuning and careful design of the task distribution, both of which limit its scalability and generalization. This paper proposes Robust MAML (RMAML), a more robust variant of MAML based on an adaptive learning scheme and a prioritization task buffer (PTB), to improve the scalability of the training process and alleviate distribution mismatch. RMAML uses gradient-based hyper-parameter optimization to automatically find the optimal learning rate, and uses the PTB to gradually shift the training task distribution toward the testing task distribution over the course of training. Experimental results on meta reinforcement learning environments demonstrate a substantial performance gain, lower sensitivity to hyper-parameter choice, and robustness to distribution mismatch.
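
The abstract says only that the learning rate is found by gradient-based hyper-parameter optimization, without giving the update rule. The block below is therefore a minimal sketch of that general idea on a toy problem, not the authors' RMAML implementation. It makes several assumptions: each task is a 1-D quadratic loss 0.5 * (theta - c)^2 with a task-specific center c, adaptation is a single gradient step, and the inner learning rate alpha is updated by plain gradient descent on the post-adaptation loss. All names (inner_adapt, meta_grads, train) and all constants are hypothetical. The prioritization task buffer is not modeled here.

```python
import random


def inner_adapt(theta, alpha, c):
    """One gradient step on the toy task loss 0.5 * (theta - c)**2."""
    return theta - alpha * (theta - c)


def post_adapt_loss(theta, alpha, c):
    """Task loss evaluated after the single adaptation step."""
    adapted = inner_adapt(theta, alpha, c)
    return 0.5 * (adapted - c) ** 2


def meta_grads(theta, alpha, c):
    """Analytic gradients of the post-adaptation loss.

    The post-adaptation loss equals 0.5 * ((1 - alpha) * (theta - c))**2,
    so both derivatives have a closed form for this toy problem.
    """
    d = theta - c
    g_theta = (1.0 - alpha) ** 2 * d      # d loss / d theta
    g_alpha = -(1.0 - alpha) * d ** 2     # d loss / d alpha
    return g_theta, g_alpha


def train(steps=200, meta_lr=0.1, alpha_lr=0.05, batch=8, seed=0):
    """Jointly learn the initialization theta and the inner learning rate alpha."""
    rng = random.Random(seed)
    theta, alpha = 0.0, 0.1  # assumed starting values
    for _ in range(steps):
        # Assumed task distribution: centers drawn from N(2, 1).
        centers = [rng.gauss(2.0, 1.0) for _ in range(batch)]
        grads = [meta_grads(theta, alpha, c) for c in centers]
        g_theta = sum(g[0] for g in grads) / batch
        g_alpha = sum(g[1] for g in grads) / batch
        theta -= meta_lr * g_theta   # meta-update of the initialization
        alpha -= alpha_lr * g_alpha  # gradient-based update of the learning rate
    return theta, alpha


if __name__ == "__main__":
    theta, alpha = train()
    print("learned initialization:", theta)
    print("learned inner learning rate:", alpha)
```

In this toy setting a single step with alpha = 1 solves every task exactly, so the learned alpha drifts toward 1 without manual tuning. Real meta reinforcement learning losses have no such closed form, and the gradient with respect to the learning rate would typically come from automatic differentiation.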