DECAF: Deep Case-based Policy Inference for Knowledge Transfer in Reinforcement Learning

Scopus citations
19
Publication type
Article
Date
2020-10-15
Authors
GLATT, RUBEN
SILVA, FELIPE LENO DA
Reinaldo Bianchi
COSTA, ANNA HELENA REALI
Journal
EXPERT SYSTEMS WITH APPLICATIONS
Citation
GLATT, R.; SILVA, F. L. DA; BIANCHI, R. A. DA C.; COSTA, A. H. R. DECAF: Deep Case-based Policy Inference for Knowledge Transfer in Reinforcement Learning. EXPERT SYSTEMS WITH APPLICATIONS, v. 1, p. 113420, 2020.
Keywords
Deep Reinforcement Learning, Case-based Reasoning, Transfer Learning, Knowledge discovery, Knowledge management, Neural networks
Abstract
The ability to solve increasingly complex problems using Reinforcement Learning (RL) has prompted growing interest in systematic approaches to retaining and reusing knowledge across a variety of tasks. Case-based Reasoning (CBR) is a general methodology that provides a framework for knowledge transfer, yet it has been underrepresented in the RL literature so far. We formulate a terminology for the CBR framework targeted towards RL researchers, with the goal of facilitating communication between the respective research communities. Based on this framework, we propose the Deep Case-based Policy Inference (DECAF) algorithm, which accelerates learning by building a library of cases and reusing those that are similar to a new task when training a new policy. DECAF guides the training by dynamically selecting and blending policies according to their usefulness for the current target task, reusing previously learned policies for more effective exploration while still allowing adaptation to the particularities of the new task. We present an empirical evaluation in the Atari game playing domain that shows the benefits of our algorithm with regard to sample efficiency, robustness against negative transfer, and performance increase when compared to state-of-the-art methods.
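
To make the idea of blending policies according to their usefulness more concrete, the following is a minimal Python sketch. It is not the DECAF implementation: the abstract does not specify how usefulness is measured or how policies are combined, so the softmax weighting, the moving-average score, and all names (blend_policies, update_usefulness, the example numbers) are illustrative assumptions.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn usefulness scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def blend_policies(action_probs, usefulness, temperature=1.0):
    """Mix per-policy action distributions, weighted by usefulness.

    action_probs: one action distribution per policy (stored cases first,
                  the policy being trained last), all of equal length.
    usefulness:   one score per policy (ASSUMPTION: a running average of
                  the returns observed while that policy was in control).
    """
    weights = softmax(usefulness, temperature)
    n_actions = len(action_probs[0])
    return [
        sum(w * probs[a] for w, probs in zip(weights, action_probs))
        for a in range(n_actions)
    ]


def update_usefulness(old_score, episode_return, rate=0.1):
    """Exponential moving average of returns (an assumed usefulness measure)."""
    return (1 - rate) * old_score + rate * episode_return


# Hypothetical example: two stored cases plus an untrained new policy,
# each giving a distribution over three actions.
library = [
    [0.8, 0.1, 0.1],        # case A
    [0.1, 0.8, 0.1],        # case B
    [1 / 3, 1 / 3, 1 / 3],  # new policy
]
scores = [5.0, 1.0, 0.0]    # case A has looked most useful so far
mixed = blend_policies(library, scores)
action = random.choices(range(3), weights=mixed)[0]
```

In this sketch, a stored case that stops yielding good returns on the new task loses weight as its score decays, which is one simple way to limit negative transfer while the new policy gradually takes over.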
