Accelerate Learning Processes by Avoiding Inappropriate Rules in Transfer Learning for Actor-Critic

http://hdl.handle.net/10076/11661
File 60C15235.pdf (106.1 kB)
Item type Departmental Bulletin Paper
Date published 2011-11-08
Title Accelerate Learning Processes by Avoiding Inappropriate Rules in Transfer Learning for Actor-Critic
Title language en
Language eng
Keywords Reinforcement learning / actor-critic method / Transfer learning
Subject scheme Other
Resource type departmental bulletin paper
Resource type identifier http://purl.org/coar/resource_type/c_6501
Authors TAKANO, Toshiaki; TAKASE, Haruhiko; KAWANAKA, Hiroharu; TSURUOKA, Shinji
Abstract
This paper aims to accelerate the learning process of the actor-critic method, one of the major reinforcement learning algorithms, by means of transfer learning. In general, reinforcement learning is used to solve optimization problems: learning agents autonomously acquire a policy that accomplishes the target task. To do so, agents require long learning processes of trial and error. Transfer learning is an effective way to accelerate the learning processes of machine learning algorithms; it does so by using prior knowledge from a policy for a source task. We propose an effective transfer learning algorithm for the actor-critic method. The two basic issues in transfer learning are how to select an effective source policy and how to reuse it without negative transfer. In this paper, we mainly discuss the latter. We propose a reuse method based on a selection method that uses the forbidden rule set. The forbidden rule set is the set of rules that cause immediate failure of a task; it is used to predict the similarity between a source policy and the target policy. Agents should not transfer the inappropriate rules in the selected policy. In actor-critic, a policy is constructed from two parameter sets: action preferences and state values. To avoid inappropriate rules, agents reuse only reliable action preferences and state values that imply preferred actions. We perform simple experiments to show the effectiveness of the proposed method. In conclusion, the proposed method accelerates learning processes for the target tasks.
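
The selective reuse described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the function names `forbidden_overlap` and `transfer_policy`, the use of Jaccard overlap as the similarity between forbidden rule sets, and the rule that an action preference counts as "reliable" and "preferred" when it is at least a positive threshold `pref_threshold`. It is not the authors' implementation.

```python
from typing import List, Set, Tuple

Rule = Tuple[int, int]  # (state, action) pair; a hypothetical encoding of a rule


def forbidden_overlap(source: Set[Rule], target: Set[Rule]) -> float:
    """Jaccard overlap of two forbidden rule sets (assumed similarity measure)."""
    union = source | target
    if not union:
        return 0.0
    return len(source & target) / len(union)


def transfer_policy(
    source_prefs: List[List[float]],
    source_values: List[float],
    pref_threshold: float,
) -> Tuple[List[List[float]], List[float]]:
    """Copy only source parameters that pass the assumed reliability test.

    An action preference is copied when it is >= pref_threshold.
    A state value is copied only if at least one preference in that
    state was copied. All other parameters start at zero.
    """
    n_states = len(source_prefs)
    n_actions = len(source_prefs[0]) if n_states else 0
    target_prefs = [[0.0] * n_actions for _ in range(n_states)]
    target_values = [0.0] * n_states
    for s in range(n_states):
        kept_any = False
        for a in range(n_actions):
            p = source_prefs[s][a]
            if p >= pref_threshold:
                target_prefs[s][a] = p
                kept_any = True
        if kept_any:
            target_values[s] = source_values[s]
    return target_prefs, target_values


if __name__ == "__main__":
    # Toy source policy with 3 states and 2 actions (made-up numbers).
    prefs = [[2.0, -1.0], [0.1, 0.2], [-3.0, 1.5]]
    values = [0.8, 0.1, 0.5]
    print(transfer_policy(prefs, values, pref_threshold=1.0))
    print(forbidden_overlap({(0, 1), (2, 0)}, {(0, 1), (1, 1)}))
```

In this sketch, weak or negative preferences are left at their initial value so that the target task relearns them by trial and error instead of inheriting a possibly inappropriate rule.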
Bibliographic information Proceedings of the Second International Workshop on Regional Innovation Studies : (IWRIS2010)
Issue 2, p. 55-58, issued 2011-10-01
Format application/pdf
Publication type VoR
Publication type resource http://purl.org/coar/version/c_970fb48d4fbd8a85
Publisher Graduate School of Regional Innovation Studies, Mie University
Resource type (Mie University) Departmental Bulletin Paper