A method for the online construction of the set of states of a Markov decision process using answer set programming

dc.contributor.authorFERREIRA, L. A.
dc.contributor.authorReinaldo Bianchi
dc.contributor.authorSANTOS, P. E.
dc.contributor.authorDE MANTARAS, R. L.
dc.contributor.authorOrcidhttps://orcid.org/0000-0001-9097-827X
dc.date.accessioned2022-01-12T21:57:38Z
dc.date.available2022-01-12T21:57:38Z
dc.date.issued2018-06-28
dc.description.abstract© 2018, Springer International Publishing AG, part of Springer Nature. Non-stationary domains, which change in unpredicted ways, are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named Online ASP for MDP (oASP(MDP)), a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs, reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process.
dc.description.firstpage3
dc.description.lastpage15
dc.description.volume10868 LNAI
dc.identifier.citationFERREIRA, L. A.; BIANCHI, R.; SANTOS, P. E.; DE MANTARAS, R. L. A method for the online construction of the set of states of a Markov decision process using answer set programming. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 10868 LNAI, p. 3-15, Jun. 2018.
dc.identifier.doi10.1007/978-3-319-92058-0_1
dc.identifier.issn1611-3349
dc.identifier.urihttps://repositorio.fei.edu.br/handle/FEI/3807
dc.relation.ispartofLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.rightsRestricted Access
dc.titleA method for the online construction of the set of states of a Markov decision process using answer set programming
dc.typeConference paper
fei.scopus.citations2
fei.scopus.eid2-s2.0-85049012571
fei.scopus.subjectAnswer set programming
fei.scopus.subjectChanging environment
fei.scopus.subjectDomain constraint
fei.scopus.subjectFinding solutions
fei.scopus.subjectMarkov Decision Processes
fei.scopus.subjectOptimal policies
fei.scopus.subjectSequential decision making
fei.scopus.subjectValue function approximation
fei.scopus.updated2024-11-01
fei.scopus.urlhttps://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85049012571&origin=inward