Heuristically accelerated reinforcement learning: Theoretical and experimental results

dc.contributor.authorBIANCHI, R.
dc.contributor.authorRIBEIRO, C. H. C.
dc.contributor.authorCOSTA, A. H. R.
dc.date.accessioned2022-01-12T22:02:40Z
dc.date.available2022-01-12T22:02:40Z
dc.date.issued2012-08-05
dc.description.abstractSince finding control policies using Reinforcement Learning (RL) can be very time-consuming, in recent years several authors have investigated how to speed up RL algorithms by using heuristics to improve action selection. In this work we present new theoretical results - convergence and an upper bound on value estimation errors - for the class that encompasses all heuristics-based algorithms, called Heuristically Accelerated Reinforcement Learning. We also expand this class by proposing three new algorithms, Heuristically Accelerated Q(λ), SARSA(λ) and TD(λ), the first algorithms to combine heuristics with eligibility traces. Empirical evaluations were conducted on traditional control problems, and the results show that using heuristics significantly enhances the performance of the learning process. © 2012 The Author(s).
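The abstract describes the core idea of Heuristically Accelerated RL: a heuristic function biases action selection while the underlying value-learning rule stays unchanged. The sketch below is a minimal illustration in that spirit, not the authors' code or the paper's HAQ(λ)/SARSA(λ)/TD(λ) algorithms; the parameter names (xi, beta, epsilon) and the dict-based tables are assumptions for illustration only.

```python
# Minimal sketch of heuristic-biased action selection (HAQL-style).
# The greedy choice is made over Q(s,a) + xi * H(s,a)**beta, where H is an
# externally supplied heuristic; the Q-learning update itself is unchanged.
import random
from collections import defaultdict

class HeuristicQAgent:
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, xi=1.0, beta=1.0):
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.xi, self.beta = xi, beta      # weight and exponent of the heuristic term (assumed names)
        self.Q = defaultdict(float)        # Q[(state, action)] value estimates
        self.H = defaultdict(float)        # H[(state, action)] heuristic, filled in by the user

    def choose_action(self, state):
        """Epsilon-greedy choice biased by the heuristic: argmax_a Q(s,a) + xi*H(s,a)**beta."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions,
                   key=lambda a: self.Q[(state, a)] + self.xi * self.H[(state, a)] ** self.beta)

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update; the heuristic only affects action selection."""
        best_next = max(self.Q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.Q[(state, action)] += self.alpha * (td_target - self.Q[(state, action)])
```

Because the heuristic enters only through action selection, the convergence guarantees of the underlying temporal-difference update are preserved, which is the property the paper's theoretical results build on.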
dc.description.firstpage169
dc.description.lastpage174
dc.description.volume242
dc.identifier.citationBIANCHI, R.; RIBEIRO, C. H. C.; COSTA, A. H. R. Heuristically accelerated reinforcement learning: Theoretical and experimental results. Frontiers in Artificial Intelligence and Applications, v. 242, p. 169-174, Aug. 2012.
dc.identifier.doi10.3233/978-1-61499-098-7-169
dc.identifier.issn0922-6389
dc.identifier.urihttps://repositorio.fei.edu.br/handle/FEI/4151
dc.relation.ispartofFrontiers in Artificial Intelligence and Applications
dc.rightsRestricted Access
dc.titleHeuristically accelerated reinforcement learning: Theoretical and experimental results
dc.typeConference paper
fei.scopus.citations21
fei.scopus.eid2-s2.0-84878810338
fei.scopus.subjectAction selection
fei.scopus.subjectControl policy
fei.scopus.subjectControl problems
fei.scopus.subjectEligibility traces
fei.scopus.subjectEmpirical evaluations
fei.scopus.subjectLearning process
fei.scopus.subjectSpeed up
fei.scopus.subjectValue estimation
fei.scopus.updated2024-07-01
fei.scopus.urlhttps://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84878810338&origin=inward