Article type: Research Article
Authors: Taherian, Nahid | Shiri, Mohammad Ebrahim [*]
Affiliations: Amirkabir University of Technology, Tehran, Iran
Correspondence: [*] Corresponding author: Mohammad Ebrahim Shiri, Amirkabir University of Technology, Tehran, Iran. Tel.: +98 21 6454 2548; E-mail: [email protected].
Abstract: State abstraction and value function approximation are effective methods for managing time and memory in reinforcement learning. Traditionally, these methods are applied to speed up learning of the current task; however, when multiple similar environments are learned in a general setting, they can also be applied to improve learning of other tasks. We propose a framework that aggregates the results of state abstraction and function approximation across several tasks of a domain and reuses them in future tasks of that domain. In the first part, we show theoretically how abstraction based on optimal value functions speeds up learning of that same task in the future. In many situations, fuzzy clustering is more natural than hard clustering, since it does not force each state to belong fully to a single class. In the second part, we therefore examine, theoretically and algorithmically, how knowledge extracted by fuzzy value approximation of a single task improves learning of that same task in the future. In both parts, we show that aggregating states (hard or fuzzy) preserves the optimal value function in the abstract space up to an error bound. Building on these two parts, we propose new ways to combine the results of abstraction and approximation from different tasks of the domain to infer similarity measures on the state space. Finally, we show empirically that batch learning based on these similarity measures can speed up learning in future tasks of the setting.
Keywords: Reinforcement learning, state abstraction, function approximation, transfer learning, fuzzy clustering
DOI: 10.3233/IDA-140689
Journal: Intelligent Data Analysis, vol. 18, no. 6, pp. 1153-1175, 2014
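The hard state aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy domain, the rule of grouping states whose optimal action values fall into the same bins of width `eps`, the uniform weighting inside each cluster, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict

def value_iteration(P, R, gamma=0.9, tol=1e-10):
    """P[s][a] is a list of (next_state, prob); R[s][a] is the immediate reward."""
    V = [0.0] * len(P)
    while True:
        Q = [[R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a])
              for a in range(len(P[s]))] for s in range(len(P))]
        V_new = [max(q) for q in Q]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new, Q
        V = V_new

def aggregate_by_q(Q, eps):
    """Hard aggregation (assumed rule): states share a cluster when
    every Q(s, a) falls into the same bin of width eps."""
    keys, phi = {}, []
    for q in Q:
        key = tuple(int(x // eps) for x in q)
        phi.append(keys.setdefault(key, len(keys)))
    return phi

def abstract_mdp(P, R, phi):
    """Build the abstract MDP, weighting the states of each cluster uniformly."""
    members = defaultdict(list)
    for s, c in enumerate(phi):
        members[c].append(s)
    P_abs, R_abs = [], []
    for c in range(len(members)):
        w = 1.0 / len(members[c])
        P_c, R_c = [], []
        for a in range(len(P[0])):
            probs = defaultdict(float)
            for s in members[c]:
                for t, p in P[s][a]:
                    probs[phi[t]] += w * p
            P_c.append(list(probs.items()))
            R_c.append(sum(w * R[s][a] for s in members[c]))
        P_abs.append(P_c)
        R_abs.append(R_c)
    return P_abs, R_abs

# Hypothetical toy domain: two identical corridors (0 -> 1 and 2 -> 3)
# that both end in the absorbing goal state 4.
# Action 0 stays in place, action 1 advances; entering the goal pays 1.
P = [
    [[(0, 1.0)], [(1, 1.0)]],
    [[(1, 1.0)], [(4, 1.0)]],
    [[(2, 1.0)], [(3, 1.0)]],
    [[(3, 1.0)], [(4, 1.0)]],
    [[(4, 1.0)], [(4, 1.0)]],
]
R = [[0.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

V, Q = value_iteration(P, R)
phi = aggregate_by_q(Q, eps=0.05)          # state -> abstract state
P_abs, R_abs = abstract_mdp(P, R, phi)
V_abs, _ = value_iteration(P_abs, R_abs)

# Largest difference between the original optimal values and the
# abstract values lifted back to the original states.
gap = max(abs(V[s] - V_abs[phi[s]]) for s in range(len(P)))
print(phi, gap)
```

In this toy domain the two corridors collapse into one, so the abstract problem has three states instead of five. With coarser bins, more states merge and `gap` can grow, which is the kind of trade-off an error bound on aggregation is meant to control.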