Reducing the number of trials is important for reinforcement learning in a real environment.
We propose a Q-learning method that selects appropriate robot actions in an unknown environment by using self-instruction based on experience gained in a known environment.
Concretely, the method maintains two Q-tables: a smaller one defined over a partial state space of the environment, and a larger one defined over the whole state space. At each learning step, the Q-values of both tables are updated simultaneously, but the action is selected from the Q-table whose Q-values have the smaller entropy in the current situation. We regard the smaller Q-table as storing knowledge for self-instruction, and the larger one as storing the full experience.
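The following Python sketch is a minimal illustration of this selection rule, under the assumption that the entropy is taken over a softmax distribution of the Q-values in the current state; all identifiers (q_whole, q_partial, select_action) are hypothetical and not taken from the paper.

import numpy as np

def softmax(q):
    """Turn the Q-values of one state into a probability distribution."""
    e = np.exp(q - np.max(q))
    return e / e.sum()

def entropy(q):
    """Shannon entropy of the action distribution induced by the Q-values."""
    p = softmax(q)
    return -np.sum(p * np.log(p + 1e-12))

def select_action(q_whole, q_partial, state, partial_state):
    """Pick the greedy action from whichever table is less uncertain
    (smaller entropy) in the current situation."""
    if entropy(q_partial[partial_state]) < entropy(q_whole[state]):
        return int(np.argmax(q_partial[partial_state]))
    return int(np.argmax(q_whole[state]))

def update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning update, applied to both tables at every step."""
    q[state, action] += alpha * (
        reward + gamma * np.max(q[next_state]) - q[state, action]
    )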
We evaluated the proposed method using an actual mobile robot. The experimental environment contains the mobile robot, two goals, and one object that is red, green, yellow, or blue. The robot's task is to carry the colored object to the corresponding goal. In this experiment, the Q-table for the whole space uses a state that describes the view of the object and the goals including their colors, while the Q-table for the partial space uses the same state without the color information. We verified that the proposed method is more effective than ordinary Q-learning in an actual environment.
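As a hedged illustration of these two encodings, the sketch below projects a hypothetical full state (with colors) onto a color-free partial state; the tuple layout is an assumption made for illustration and is not the paper's actual encoding.

from typing import NamedTuple

class FullState(NamedTuple):
    object_direction: int   # e.g. discretized bearing to the object
    object_color: str       # "red", "green", "yellow", or "blue"
    goal_direction: int
    goal_color: str

def to_partial(s: FullState) -> tuple:
    """Project the full state onto the color-free partial state space."""
    return (s.object_direction, s.goal_direction)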
Journal: Proceedings of the Second International Workshop on Regional Innovation Studies (IWRIS2010)
Issue: 2
Pages: 71-74
Publication year: 2011-10-01
Publisher: Graduate School of Regional Innovation Studies, Mie University