Affiliations: Dept. of Computer Science and Engineering, Waseda University, Shinjuku, Tokyo 169-8555, Japan | National Institute of Informatics, Chiyoda, Tokyo 101-8430, Japan | NTT Network Innovation Laboratories, Musashino, Tokyo 180-8585, Japan | Faculty of Information Science, Hosei University, Koganei, Tokyo 184-8584, Japan | The Institute of Scientific and Industrial Research, Osaka University, Osaka 567-0047, Japan
Note: This paper is the revised and extended version of our conference papers [21] and [22].
Abstract: This paper describes how, in large-scale multi-agent systems, each agent's adaptive selection of peer agents for collaborative tasks affects the overall performance, and how this performance varies with the workload of the system and with fluctuations in the agents' peer selection policies (PSP). An intelligent agent in a multi-agent system (MAS) often has to select appropriate agents to which it assigns tasks that cannot be executed locally. These collaborating agents are usually chosen according to their skills. However, if multiple candidate peer agents remain, a more efficient agent is preferable. Of course, an agent's efficiency is affected by its workload, its CPU performance, and the available communication bandwidth. Unfortunately, because no agent in an open environment such as the Internet can obtain such data from any other agent, this selection must be done according to the locally available information about the other known agents. This information, however, is limited, usually uncertain, and often obsolete. Agents' states may also change over time, so the PSP must be adaptive to some extent. We investigated how the overall performance of a MAS changes under adaptive policies in which agents select peer agents using statistical/reinforcement learning. We particularly focused on mutual interference among selections under different workloads, that is, underloaded, near-critical, and overloaded situations. This paper presents simulation results showing that the overall performance of the MAS strongly depends on the workload. It is shown that when the agents' workloads are near the limit of the theoretical total capability, a greedy PSP degrades the overall performance even after a sufficient learning time, but that a PSP with a small amount of fluctuation, called the fluctuated PSP, can considerably improve it.
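To make the distinction between the greedy PSP and the fluctuated PSP concrete, the following is a minimal illustrative sketch, not the paper's actual algorithm: each agent keeps a learned efficiency estimate per known peer (updated from locally observed outcomes, since exact workload and bandwidth data are unavailable), and the fluctuated variant occasionally deviates from the greedy choice with a small probability `epsilon`. All class and parameter names here are assumptions introduced for illustration.

```python
import random


class PeerSelector:
    """Illustrative peer selection policy (PSP): greedy when epsilon == 0,
    'fluctuated' when epsilon > 0 (occasional random deviation)."""

    def __init__(self, peers, epsilon=0.0, alpha=0.1, seed=None):
        # Learned efficiency estimate per peer; higher means better.
        # Starts at 0.0 because local information about others is
        # initially absent, and later uncertain and possibly obsolete.
        self.estimates = {p: 0.0 for p in peers}
        self.epsilon = epsilon  # probability of a fluctuated (random) choice
        self.alpha = alpha      # learning rate for the moving average
        self.rng = random.Random(seed)

    def select(self):
        # Fluctuated PSP: with probability epsilon, pick a random peer.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.estimates))
        # Greedy PSP: pick the peer with the best current estimate.
        return max(self.estimates, key=self.estimates.get)

    def update(self, peer, observed_efficiency):
        # Exponential moving average so that newer observations gradually
        # override stale ones as the peer's state changes over time.
        e = self.estimates[peer]
        self.estimates[peer] = e + self.alpha * (observed_efficiency - e)
```

Under near-critical workloads, a purely greedy policy of this shape can cause all agents to converge on the same "best" peers and overload them; the small fluctuation spreads the load across otherwise-equivalent candidates.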