Affiliations: Information Sciences Institute, University of Southern
California, 4676 Admiralty Way, Suite 1001, Marina Del Rey, CA 90292, USA.
E-mail: {sdlin, knoblock}@isi.edu
Abstract: With the rapid growth of the World Wide Web, there is growing
interest in developing web agents that interact with online services to acquire
information. However, finding online services perfectly suited to a given
task is not always feasible. First, the agents might not be given sufficient
information to fill in the required input fields for querying an online
service. Second, the online service might generate only partial information.
Third, the agents might need to derive information B from an input set A, but
can find only online services that generate A from B. Fourth, most online
services do not tolerate errors in their inputs, so even a minor typo in an
input field can prevent them from generating any meaningful results. This
paper proposes SERGEANT, a framework for
building flexible web agents that handle these imperfect situations. In this
framework we exploit an information retrieval (IR) system as a general
discovery tool to assist in finding and pruning information. To demonstrate
SERGEANT, we implemented two web agents: the Internet inverse geocoder and the
address lookup module. Our experiments show that these agents are capable of
generating high-quality results under imperfect situations.