Reinforcement learning with prediction-based rewards
Weβve developedΒ Random Network Distillation (RND), a prediction-based method for encouraging reinforcement learning agents to explore their environments through curiosity, which for the first time exceeds average human performance onΒ Montezumaβs Revenge.
Log in to bookmark articles and create collections