Affiliations: CGI Space, 64295 Darmstadt, Germany. E-mail: [email protected] | Department of Computing and Information Systems, University of Melbourne, 3010, Victoria, Australia. E-mail: {karims,mkirley,l.sonenberg}@unimelb.edu.au
Note: Corresponding author.
Abstract: In task environments with large state and action spaces, temporal and state abstraction can potentially improve an agent's decision-making performance. However, existing approaches within a reinforcement learning framework typically identify possible subgoal states and immediately learn stochastic subpolicies to reach them from other states. In these circumstances, the reinforcement learner's exploration is unfavorably biased towards local behavior around these subgoals; temporal abstractions are not exploited to reduce the deliberation required; and the benefit of employing temporal abstractions is conflated with the benefit of the additional learning done to define subpolicies. In this paper, we consider a cognitive agent architecture that allows temporal abstractions, in the form of experience trajectories, to be extracted from a bottom-level reinforcement learning module and reused by a top-level module based on the BDI (Belief-Desire-Intention) model. Here, the reuse of a trajectory depends on the situation in which its recording was started. We investigate the efficacy of our approach in two well-known domains: the pursuit domain and the taxi domain. Detailed simulation experiments demonstrate that using experience trajectories as plans acquired at runtime can significantly reduce the number of deliberations required without significantly affecting asymptotic performance, and that the combination of temporal and state abstraction improves performance during the reinforcement learner's initial learning phase.
Keywords: Agent decision making, reinforcement learning, plan learning, temporal abstraction, state abstraction
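To make the idea summarized in the abstract concrete, the following is a minimal Python sketch of one way a library of experience trajectories, keyed by the situation in which their recording was started, might be recorded from a bottom-level learner and replayed as plans by a top-level module. The names (TrajectoryLibrary, start_recording, record_step, stop_recording, lookup, act) are hypothetical illustrations, not the paper's implementation.

    from collections import defaultdict
    from typing import Callable, Dict, Hashable, List, Optional

    Action = Hashable
    AbstractState = Hashable


    class TrajectoryLibrary:
        """Stores action sequences (experience trajectories) keyed by the
        abstracted situation in which their recording was started."""

        def __init__(self) -> None:
            self._plans: Dict[AbstractState, List[Action]] = {}
            self._recording_start: Optional[AbstractState] = None
            self._buffer: List[Action] = []

        def start_recording(self, start_situation: AbstractState) -> None:
            # Remember the situation that keys this trajectory.
            self._recording_start = start_situation
            self._buffer = []

        def record_step(self, action: Action) -> None:
            if self._recording_start is not None:
                self._buffer.append(action)

        def stop_recording(self, successful: bool) -> None:
            # Keep only trajectories that ended successfully (e.g. a subgoal was reached).
            if successful and self._recording_start is not None and self._buffer:
                self._plans[self._recording_start] = list(self._buffer)
            self._recording_start = None
            self._buffer = []

        def lookup(self, situation: AbstractState) -> Optional[List[Action]]:
            # A stored trajectory is reusable only from the situation in
            # which its recording was started.
            return self._plans.get(situation)


    def act(situation: AbstractState,
            library: TrajectoryLibrary,
            rl_policy: Callable[[AbstractState], Action]) -> List[Action]:
        """Top-level choice: replay a stored trajectory as a plan when one
        matches the current situation; otherwise fall back to the RL policy
        for a single deliberated step (which continues to learn and explore)."""
        plan = library.lookup(situation)
        if plan is not None:
            return plan                   # temporal abstraction: no per-step deliberation
        return [rl_policy(situation)]     # one deliberated step from the learner

In this sketch, replaying a matching trajectory replaces a sequence of per-step deliberations with a single lookup, which is the source of the reduction in deliberation discussed in the abstract; the trade-offs of this design are examined in the body of the paper.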