A method for the online construction of the set of states of a Markov decision process using answer set programming

Nenhuma Miniatura disponível
Citações na Scopus
2
Tipo de produção
Artigo de evento
Data
2018-06-28
Autores
FERREIRA, L. A.
Reinaldo Bianchi
SANTOS, P. E.
DE MANTARAS, R. L.
Orientador
Periódico
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Título da Revista
ISSN da Revista
Título de Volume
Citação
FERREIRA, L. A.; BIANCHI, R.; SANTOS, P. E.; DE MANTARAS, R. L. A method for the online construction of the set of states of a Markov decision process using answer set programming. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 10868 LNAI, p. 3-15, Jun. 2018.
Texto completo (DOI)
Palavras-chave
Resumo
© 2018, Springer International Publishing AG, part of Springer Nature.Non-stationary domains, that change in unpredicted ways, are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named Online ASP for MDP (oASP(MDP)), which is a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process.

Coleções