A method for the online construction of the set of states of a Markov decision process using answer set programming
dc.contributor.author | FERREIRA, L. A. | |
dc.contributor.author | Reinaldo Bianchi | |
dc.contributor.author | SANTOS, P. E. | |
dc.contributor.author | DE MANTARAS, R. L. | |
dc.contributor.authorOrcid | https://orcid.org/0000-0001-9097-827X | |
dc.date.accessioned | 2022-01-12T21:57:38Z | |
dc.date.available | 2022-01-12T21:57:38Z | |
dc.date.issued | 2018-06-28 | |
dc.description.abstract | © 2018, Springer International Publishing AG, part of Springer Nature. Non-stationary domains, which change in unpredictable ways, are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named Online ASP for MDP (oASP(MDP)), a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs, reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process. | |
dc.description.firstpage | 3 | |
dc.description.lastpage | 15 | |
dc.description.volume | 10868 LNAI | |
dc.identifier.citation | FERREIRA, L. A.; BIANCHI, R.; SANTOS, P. E.; DE MANTARAS, R. L. A method for the online construction of the set of states of a Markov decision process using answer set programming. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 10868 LNAI, p. 3-15, Jun. 2018. | |
dc.identifier.doi | 10.1007/978-3-319-92058-0_1 | |
dc.identifier.issn | 1611-3349 | |
dc.identifier.uri | https://repositorio.fei.edu.br/handle/FEI/3807 | |
dc.relation.ispartof | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | |
dc.rights | Restricted Access | |
dc.title | A method for the online construction of the set of states of a Markov decision process using answer set programming | |
dc.type | Conference paper | |
fei.scopus.citations | 2 | |
fei.scopus.eid | 2-s2.0-85049012571 | |
fei.scopus.subject | Answer set programming | |
fei.scopus.subject | Changing environment | |
fei.scopus.subject | Domain constraint | |
fei.scopus.subject | Finding solutions | |
fei.scopus.subject | Markov Decision Processes | |
fei.scopus.subject | Optimal policies | |
fei.scopus.subject | Sequential decision making | |
fei.scopus.subject | Value function approximation | |
fei.scopus.updated | 2024-11-01 | |
fei.scopus.url | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85049012571&origin=inward |