2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID: MLSP-27.5
Paper Title: POLICY AUGMENTATION: AN EXPLORATION STRATEGY FOR FASTER CONVERGENCE OF DEEP REINFORCEMENT LEARNING ALGORITHMS
Authors: Arash Mahyari, Florida Institute for Human and Machine Cognition (IHMC), United States
Session: MLSP-27: Reinforcement Learning 3
Location: Gather.Town
Session Time: Thursday, 10 June, 13:00 - 13:45
Presentation Time: Thursday, 10 June, 13:00 - 13:45
Presentation: Poster
Topic: Machine Learning for Signal Processing: [MLR-SLER] Sequential learning; sequential decision methods
Abstract: Despite advancements in deep reinforcement learning algorithms, developing an effective exploration strategy remains an open problem. Most existing exploration strategies are based on simple heuristics, require a model of the environment, or train additional deep neural networks to generate imagination-augmented paths. In this paper, a new algorithm, called Policy Augmentation, is introduced. Policy Augmentation is based on a newly developed inductive matrix completion method. The proposed algorithm augments the values of unexplored state-action pairs, helping the agent select actions that lead to high-value returns during the early episodes. Training with these high-value rollouts leads to faster convergence of deep reinforcement learning algorithms. Our experiments show the superior performance of Policy Augmentation. The code can be found at: https://github.com/arashmahyari/PolicyAugmentation.
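
To make the abstract's core idea concrete, the sketch below illustrates how inductive matrix completion could fill in values for unexplored state-action pairs. It is not the authors' implementation (their code is in the linked repository); it is a minimal, hypothetical example that assumes state features X and action features Y are available, fits Q ≈ X W Yᵀ on the explored entries only, and then uses the completed matrix to guide early-episode action selection. All names (fit_inductive_completion, Q_augmented, etc.) and the toy dimensions are illustrative assumptions.

import numpy as np

def fit_inductive_completion(Q_obs, mask, X, Y, reg=1e-2, lr=1e-3, iters=5000):
    """Fit W so that X @ W @ Y.T matches Q_obs on observed entries (mask == 1)."""
    rng = np.random.default_rng(0)
    W = 0.01 * rng.standard_normal((X.shape[1], Y.shape[1]))
    for _ in range(iters):
        resid = mask * (X @ W @ Y.T - Q_obs)   # error only on explored pairs
        grad = X.T @ resid @ Y + reg * W       # gradient of squared loss + ridge term
        W -= lr * grad
    return W

# Toy usage: 20 states, 4 actions, roughly 30% of state-action values explored so far.
n_s, n_a, d_s, d_a = 20, 4, 6, 3
rng = np.random.default_rng(1)
X = rng.standard_normal((n_s, d_s))            # hypothetical state features
Y = rng.standard_normal((n_a, d_a))            # hypothetical action features
Q_true = X @ rng.standard_normal((d_s, d_a)) @ Y.T
mask = (rng.random((n_s, n_a)) < 0.3).astype(float)
W = fit_inductive_completion(mask * Q_true, mask, X, Y)
Q_augmented = X @ W @ Y.T                      # filled-in values for unexplored pairs
greedy_actions = Q_augmented.argmax(axis=1)    # could guide early-episode action selection

The design point is that the completed matrix supplies value estimates for pairs the agent has never visited, so early rollouts can favor actions predicted to be high-value instead of relying purely on random exploration.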