Electrical Engineering

Permanent URI for this community: https://repositorio.fei.edu.br/handle/FEI/21

Search Results

Now showing 1 - 10 of 12

Transferring knowledge as heuristics in reinforcement learning: A case-based approach
(2015) Bianchi R.A.C.; Celiberto L.A.; Santos P.E.; Matsuura J.P.; Lopez De Mantaras R.
© 2015 Elsevier B.V. The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods using heuristics to accelerate a Reinforcement Learning procedure in one domain (the target), where the heuristics are obtained from another (simpler) domain (the source domain). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task on the source domain, storing the knowledge thus obtained in a case base; second, it does an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain. A set of empirical evaluations was conducted in two target domains: the 3D mountain car (using a learned case base from a 2D simulation) and stability learning for a humanoid robot in the Robocup 3D Soccer Simulator (that uses knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically-accelerated reinforcement learning and transfer learning algorithms.
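The third stage described in this abstract, using transferred knowledge as a heuristic during learning, can be illustrated with a minimal sketch. It assumes one common formulation of heuristically accelerated Q-learning, in which a heuristic bonus is added to the Q-value only when choosing the greedy action, while the value update itself is plain Q-learning. The function names, the dictionary-based tables, and the weight `xi` are illustrative assumptions, not the paper's implementation; the case base and the action mapping are not modelled here.

```python
import random


def select_action(q, h, state, actions, xi=1.0, epsilon=0.0, rng=random):
    """Epsilon-greedy choice in which a heuristic bonus biases the greedy step.

    q and h are dicts keyed by (state, action); missing entries count as 0.0.
    xi weights the heuristic (assumed parameter, not taken from the paper).
    """
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(
        actions,
        key=lambda a: q.get((state, a), 0.0) + xi * h.get((state, a), 0.0),
    )


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update; the heuristic is not used here."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Hypothetical usage: a heuristic (e.g. derived from a source task)
# recommends "right" in state "s0" before any Q-values have been learned.
actions = ["left", "right"]
q_table = {}
heuristic = {("s0", "right"): 1.0}
chosen = select_action(q_table, heuristic, "s0", actions)
new_value = q_update(q_table, "s0", chosen, 1.0, "s1", actions)
```

Because the heuristic only influences action selection, a poor heuristic in this sketch slows exploration but does not change what the Q-values are updated towards.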
Perception, cognition and reasoning about shadows
(2018) Santos P.E.; Casati R.; Cavanagh P.
To ban or regulate autonomous weapons: A Brazilian response
(2016) Santos P.E.
Answer set programming for non-stationary Markov decision processes
(2017) Ferreira L.A.; Bianchi R.A.C.; Santos P.E.; de Mantaras R.L.
© 2017, Springer Science+Business Media New York. Non-stationary domains, where unforeseen changes happen, present a challenge for agents to find an optimal policy for a sequential decision making problem. This work investigates a solution to this problem that combines Markov Decision Processes (MDP) and Reinforcement Learning (RL) with Answer Set Programming (ASP) in a method we call ASP(RL). In this method, Answer Set Programming is used to find the possible trajectories of an MDP, from which Reinforcement Learning is applied to learn the optimal policy of the problem. Results show that ASP(RL) is capable of efficiently finding the optimal solution of an MDP representing non-stationary domains.
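The division of labour described in this abstract, a reasoner that enumerates what is possible and a learner that works only inside that set, can be sketched as follows. This is a rough illustration under stated assumptions, not the ASP(RL) method itself: no ASP solver is called, and the `ALLOWED` table is a hand-written, hypothetical stand-in for the state-action pairs a solver might return. In a non-stationary setting one would presumably recompute that table when the domain changes; that step is not shown.

```python
import random

# Hypothetical stand-in for solver output: each possible (state, action)
# pair mapped to its successor state in a tiny deterministic domain.
ALLOWED = {
    ("a", "go"): "b",
    ("b", "go"): "goal",
    ("b", "back"): "a",
}


def allowed_actions(state):
    """Actions that the (assumed) reasoning step reports as possible in state."""
    return sorted(act for (s, act) in ALLOWED if s == state)


def q_learning(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning that only ever considers the allowed pairs."""
    rng = random.Random(seed)
    q = {pair: 0.0 for pair in ALLOWED}
    for _ in range(episodes):
        state = "a"
        for _ in range(20):  # step cap per episode
            if state == "goal":
                break
            action = rng.choice(allowed_actions(state))  # random exploration
            nxt = ALLOWED[(state, action)]
            reward = 1.0 if nxt == "goal" else 0.0
            best_next = max(
                (q[(nxt, a)] for a in allowed_actions(nxt)), default=0.0
            )
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = nxt
    return q


q_values = q_learning()
```

The Q-table contains entries only for pairs listed in `ALLOWED`, so the learner never spends updates on transitions that the reasoning step has ruled out.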
Heuristically Accelerated Reinforcement Learning by Means of Case-Based Reasoning and Transfer Learning
(2018) Bianchi R.A.C.; Santos P.E.; da Silva I.J.; Celiberto L.A.; Lopez de Mantaras R.
© 2017, Springer Science+Business Media B.V. Reinforcement Learning (RL) is a well-known technique for learning the solutions of control problems from the interactions of an agent in its domain. However, RL is known to be inefficient in real-world problems where the state space and the set of actions grow quickly. Recently, heuristics, case-based reasoning (CBR) and transfer learning have been used as tools to accelerate the RL process. This paper investigates a class of algorithms called Transfer Learning Heuristically Accelerated Reinforcement Learning (TLHARL) that uses CBR as heuristics within a transfer learning setting to accelerate RL. The main contributions of this work are the proposal of a new TLHARL algorithm based on the traditional RL algorithm Q(λ) and the application of TLHARL on two distinct real-robot domains: robot soccer with small-scale robots and humanoid-robot stability learning. Experimental results show that our proposed method led to a significant improvement of the learning rate in both domains.
A qualitative spatial representation of string loops as holes
(2016) Cabalar P.; Santos P.E.
© 2016 Published by Elsevier B.V. This research note contains an extension of previous work by Cabalar and Santos (2011) that formalised several spatial puzzles formed by strings and holes. That approach explicitly ignored some configurations and actions that were irrelevant for the studied puzzles but are physically possible and may become crucial for other spatial reasoning problems. In particular, the previous work did not consider the formation of string loops or the situations where a holed object is partially crossed by another holed object. In this paper, we remove these limitations by treating string loops as dynamic holes that can be created or destroyed by a pair of elementary actions, respectively picking or pulling from strings. We explain how string loops can be recognised in a data structure representing the domain states and define a notation to represent crossings through string loops. The resulting formalism is dual in the sense that it also allows understanding any hole as a kind of (sometimes rigid) closed string loop.
Protocols from perceptual observations
(2005) Needham C.J.; Santos P.E.; Magee D.R.; Devin V.; Hogg D.C.; Cohn A.G.
This paper presents a cognitive vision system capable of autonomously learning protocols from perceptual observations of dynamic scenes. The work is motivated by the aim of creating a synthetic agent that can observe a scene containing interactions between unknown objects and agents, and learn models of these sufficient to act in accordance with the implicit protocols present in the scene. Discrete concepts (utterances and object properties), and temporal protocols involving these concepts, are learned in an unsupervised manner from continuous sensor input alone. Crucial to this learning process are methods for spatio-temporal attention applied to the audio and visual sensor data. These identify subsets of the sensor data relating to discrete concepts. Clustering within continuous feature spaces is used to learn object property and utterance models from processed sensor data, forming a symbolic description. The Progol Inductive Logic Programming system is subsequently used to learn symbolic models of the temporal protocols presented in the presence of noise and over-representation in the symbolic data input to it. The models learned are used to drive a synthetic agent that can interact with the world in a semi-natural way. The system has been evaluated in the domain of table-top game playing and has been shown to be successful at learning protocol behaviours in such real-world audio-visual environments. © 2005 Elsevier B.V. All rights reserved.
The space within fisherman's folly: Playing with a puzzle in mereotopology
(2008) Santos P.E.; Cabalar P.
In this paper we propose a spatial ontology for reasoning about holes, rigid objects and a string, taking a classical puzzle as a motivating example. In this ontology the domain is composed of spatial regions whereby a theory about holes is defined over a mereotopological basis. Within this theory we define a data structure, named chain, that facilitates a clear and efficient representation of the puzzle states and its solution. © Taylor & Francis Group, LLC.
The perception and content of cast shadows: An interdisciplinary review
(2011) Dee H.M.; Santos P.E.
Recently, psychologists have turned their attention to the study of cast shadows and demonstrated that the human perceptual system values information from shadows very highly in the perception of spatial qualities, sometimes to the detriment of other cues. However, with some notable and recent exceptions, computer vision systems treat cast shadows not as signal but as noise. This paper provides a concise yet comprehensive review of the literature on cast shadow perception from across the cognitive sciences, including the theoretical information available, the perception of shadows in human and machine vision, and the ways in which shadows can be used. © Taylor & Francis Group, LLC.
Reasoning about depth and motion from an observer's viewpoint
(2007) Santos P.E.
The goal of this paper is to present a logic-based formalism for representing knowledge about objects in space and their movements, and show how this knowledge could be built up from the viewpoint of an observer immersed in a dynamic world. In this paper space is represented using functions that extract attributes of depth, size and distance from snapshots of the world. These attributes compose a novel spatial reasoning system named Depth Profile Calculus (DPC). Transitions between qualitative relations involving these attributes are represented by an extension of this calculus called Dynamic Depth Profile Calculus (DDPC). We argue that knowledge about objects in the world could be built up via a process of abduction on DDPC relations. © 2007, Lawrence Erlbaum Associates, Inc.
