DECAF: Deep Case-based Policy Inference for Knowledge Transfer in Reinforcement Learning

GLATT, RUBEN; SILVA, FELIPE LENO DA; Reinaldo Bianchi; COSTA, ANNA HELENA REALI

Citations in Scopus

19

Publication type

Article

Date

2020-10-15

Authors

GLATT, RUBEN
SILVA, FELIPE LENO DA
Reinaldo Bianchi
COSTA, ANNA HELENA REALI

Journal

EXPERT SYSTEMS WITH APPLICATIONS

Citation

GLATT, R.; SILVA, F. L. DA; BIANCHI, R. A. DA C.; COSTA, A. H. R. DECAF: Deep Case-based Policy Inference for Knowledge Transfer in Reinforcement Learning. EXPERT SYSTEMS WITH APPLICATIONS, v. 1, p. 113420, 2020.

Full text (DOI)

10.1016/j.eswa.2020.113420

Keywords

Deep Reinforcement Learning, Case-based Reasoning, Transfer Learning, Knowledge discovery, Knowledge management, Neural networks

URI

https://repositorio.fei.edu.br/handle/FEI/3471

Abstract

The ability to solve increasingly complex problems with Reinforcement Learning (RL) has increased researchers' interest in systematic approaches to retaining and reusing knowledge across a variety of tasks. Case-based Reasoning (CBR) is a general methodology that provides a framework for knowledge transfer, but it has been underrepresented in the RL literature so far. We formulate a terminology for the CBR framework targeted at RL researchers, with the goal of facilitating communication between the two research communities. Based on this framework, we propose the Deep Case-based Policy Inference (DECAF) algorithm, which accelerates learning by building a library of cases and, when training a new policy, reusing those cases that are similar to the new task. DECAF guides the training by dynamically selecting and blending policies according to their usefulness for the current target task. It reuses previously learned policies for more effective exploration while still allowing adaptation to the particularities of the new task. We present an empirical evaluation in the Atari game-playing domain that shows the benefits of our algorithm with regard to sample efficiency, robustness against negative transfer, and performance increase when compared to state-of-the-art methods.
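The idea of selecting among stored policies according to their usefulness can be illustrated with a small sketch. It is not the DECAF procedure from the paper: the class `PolicyLibrary`, the softmax selection rule, the running-average usefulness update, and the parameters `step_size` and `temperature` are all assumptions chosen for illustration, and the sketch only selects one policy per episode rather than blending several.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn usefulness scores into selection probabilities."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class PolicyLibrary:
    """Hypothetical case library holding previously learned policies."""

    def __init__(self, policies, step_size=0.1):
        self.policies = list(policies)  # callables: state -> action
        self.usefulness = [0.0] * len(self.policies)
        self.step_size = step_size

    def select(self, rng, temperature=1.0):
        """Sample the index of the policy that acts in the next episode."""
        probs = softmax(self.usefulness, temperature)
        return rng.choices(range(len(self.policies)), weights=probs, k=1)[0]

    def update(self, index, episode_return):
        """Move the usefulness estimate toward the observed episode return."""
        error = episode_return - self.usefulness[index]
        self.usefulness[index] += self.step_size * error


# Usage: two placeholder policies; the second one earns a higher return.
library = PolicyLibrary([lambda state: 0, lambda state: 1])
library.update(1, 10.0)
chosen = library.select(random.Random(0))
```

After the update, the second policy has the higher usefulness estimate and is therefore sampled more often, while the first keeps a nonzero probability. A policy that turns out to be unhelpful is gradually chosen less, which is one simple way to limit negative transfer.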

Collections

Articles
