Reinforcement Learning with dual tables for a partial and a whole space
http://hdl.handle.net/10076/11665
Item type | Departmental Bulletin Paper(1) |
---|---|
Release date | 2011-11-08 |
Title | |
Title | Reinforcement Learning with dual tables for a partial and a whole space |
Language | en |
Language | |
Language | eng |
Keywords | |
Subject scheme | Other |
Subject | Reinforcement learning |
Resource type | |
Resource type identifier | http://purl.org/coar/resource_type/c_6501 |
Resource type | departmental bulletin paper |
Authors | Shibata, Nobuo; Matsui, Hirokazu |
Abstract | |
Description type | Abstract |
Description | Reducing the number of trials is important for reinforcement learning in a real environment. We propose a Q-learning method that selects appropriate actions for a robot in an unknown environment by using self-instruction based on experience gained in a known environment. Concretely, the method maintains two Q-tables: a smaller one based on a partial space of the environment and a larger one based on the whole space. At each learning step, the Q-values of both tables are updated at the same time, but the action is selected using whichever Q-table has the smaller entropy of Q-values in the current state. We regard the smaller Q-table as storing knowledge for self-instruction and the larger one as storing experience. We tested the proposed method on a real mobile robot. The experimental environment contains a mobile robot, two goals, and one object that is red, green, yellow, or blue. The robot's task is to carry the colored object to the corresponding goal. In this experiment, the state of the whole-space Q-table describes the view of the object and the goals including their colors, while the state of the partial-space Q-table omits the color information. We verified that the proposed method is more effective than ordinary methods in a real environment. |
Bibliographic information | Proceedings of the Second International Workshop on Regional Innovation Studies (IWRIS2010), No. 2, pp. 71-74, issue date 2011-10-01 |
Format | |
Description type | Other |
Description | application/pdf |
Author version flag | |
Publication type | VoR |
Publication type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85 |
Publisher | |
Publisher | Graduate School of Regional Innovation Studies, Mie University |
Resource type (Mie University) | |
Value | Departmental Bulletin Paper |
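
The dual-table scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the entropy of Q-values is computed, so the sketch assumes the Shannon entropy of a softmax distribution over each state's Q-values. The class name `DualTableQLearner`, the `project` function (which drops the color information from a whole-space state), the epsilon-greedy exploration, the tie-breaking rule, and the values of `alpha`, `gamma`, and `epsilon` are all hypothetical choices made for the example.

```python
import math
import random
from collections import defaultdict


def softmax_entropy(q_values, temperature=1.0):
    """Shannon entropy of a softmax over Q-values (an assumed definition)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class DualTableQLearner:
    """Two tabular Q-functions: one over whole states, one over partial states."""

    def __init__(self, n_actions, project, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.project = project  # hypothetical: whole state -> partial state
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q_whole = defaultdict(lambda: [0.0] * n_actions)
        self.q_partial = defaultdict(lambda: [0.0] * n_actions)

    def choose_table(self, state):
        """Return the name and Q-values of the table with the lower entropy."""
        q_w = self.q_whole[state]
        q_p = self.q_partial[self.project(state)]
        # Ties go to the whole-space table (an assumption of this sketch).
        if softmax_entropy(q_p) < softmax_entropy(q_w):
            return "partial", q_p
        return "whole", q_w

    def select_action(self, state):
        """Epsilon-greedy action using the lower-entropy table."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        _, q = self.choose_table(state)
        return max(range(self.n_actions), key=lambda a: q[a])

    def update(self, state, action, reward, next_state, done=False):
        """Apply the standard Q-learning update to both tables at once."""
        pairs = (
            (self.q_whole, state, next_state),
            (self.q_partial, self.project(state), self.project(next_state)),
        )
        for table, s, s_next in pairs:
            target = reward if done else reward + self.gamma * max(table[s_next])
            table[s][action] += self.alpha * (target - table[s][action])
```

With states written as `(color, view)` tuples and `project=lambda s: s[1]`, experience gathered with one object color also shapes the partial-space table. When the robot then meets an unseen color, the whole-space entry is still uniform (maximum entropy), so the sketch falls back on the partial-space table, which is one way to read the "self-instruction" idea in the abstract.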