Source: http://agents.felk.cvut.cz/cgi-bin/docarc/public.pl/document/309/SMCsolver.pdf

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART C: APPLICATIONS AND REVIEWS, VOL. 41, NO. 1, JANUARY 2011

Abstract Architecture for Task-oriented Multi-agent Problem Solving

J. Vokřínek, A. Komenda, and M. Pěchouček

Abstract: Problem solving and planning in decentralized environments is a key technical challenge in numerous industrial applications, ranging from manufacturing, logistics, and virtual enterprises to multirobot systems. We present an abstract architecture of a multiagent solver and a respective algorithm providing decomposition, task allocation, and task delegation. Various features of the abstract architecture, such as the computational complexity and the admissibility of the underlying optimization heuristics, are analyzed in the paper. Four instances of abstract architecture implementations are given to demonstrate the applicability of the abstract solver in a wide variety of real-problem domains.

Index Terms: Algorithms, complexity, distributed planning and problem solving, multiagent applications, multiagent systems, task allocation, task delegation.

I. INTRODUCTION

THE PROBLEM of distributed decision making and of planning and controlling entities in heterogeneous distributed environments is crucial for many application domains [1]. Existing centralized methods depend on one central planning system that gathers all required data about the decentralized entities before the planning process starts. This approach faces various difficulties. One problem is the need for sharing private knowledge and information about the actual status of these entities. The other problem is the need for real-time replanning based on environments and conditions changing dynamically in time. We present an approach where each entity is in charge of suggesting its individual plans, and where cooperation, resource sharing, and deconfliction are resolved by methods of negotiation.

The problem of distributed planning and problem solving has recently been discussed often in the artificial intelligence planning and multiagent research communities (e.g., [1]-[4]). Distributed planning has been viewed as either 1) planning for activities and resources allocated among distributed agents; 2) distributed (parallel) computation aimed at plan construction; or 3) a plan-merging activity. In this paper, we solve the problem by algorithms based on task allocation and local resource planning in cooperative environments, and we use delegation for continual solution improvement, which is performed in noncritical time.

Manuscript received November 19, 2009; revised May 9, 2010; accepted July 26, 2010. Date of current version December 17, 2010. This work was supported by the Czech Ministry of Education, Youth and Sports under Grant 6840770038, by the Student Grant of Czech Technical University in Prague under Grant SGS10-801890, and by U.S. Army Grants W911NF-08-1-0521 and W15P7T-05-R-P209 under Contract BAA 8020902.A. This paper was recommended by Associate Editor R. W. Brennan.

The authors are with the Agent Technology Center, FEE, Czech Technical University in Prague, Prague, Czech Republic.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMCC.2010.2073465

The paper is divided into two parts, presenting

1) the abstract multiagent solver: the problem definition, the solver architecture, and the algorithms, including a discussion of their computational complexity and the conditions of strategy admissibility; and

2) examples of application areas and descriptions of the implemented multiagent systems.

II. MULTIAGENT SOLVER

Multiagent-planning approaches are used for solving a wide variety of planning problems. As analyzed by Brafman and Domshlak [5], multiagent-planning techniques can be beneficial for problems where the domain sizes of individual agents are considerably smaller (e.g., in logarithmic relation to each other) than the overall size of the problem (even if the planning complexity of an individual agent is exponential) and the number of dependencies between agents is low.

Distributed planning and problem solving has been analyzed by Durfee [1]. One of the related strategies discussed is the task-sharing approach. The principle is based on passing tasks from a busy agent to vacant agents. The process can be summarized in four basic steps (a minimal sketch in code follows the list).

1) Task decomposition: The tasks of the agents are decomposed into subtasks. Sharable subtasks are selected.

2) Task allocation: The selected tasks are assigned to vacant agents or to agents that ask for them.

3) Task accomplishment: Each agent tries to accomplish its (sub)tasks. Tasks that need further decomposition are recursively processed and passed to other agents.

4) Result synthesis: The results of the tasks are returned to the allocating agent, since it knows how to use them in the context of the higher level tasks.
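Read as code, the four steps form one recursive procedure. The following is a minimal sketch of that loop, not the paper's algorithm; all names (share_task, decompose, accomplish, insertion_cost) are our own illustration:

```python
# Minimal sketch of the task-sharing loop; every name here is illustrative,
# none is taken from the paper.

def share_task(task, agents, decompose, accomplish, insertion_cost):
    """Decompose a task, allocate the subtasks, and synthesize the results."""
    subtasks = decompose(task)                                   # step 1
    results = []
    for sub in subtasks:
        # Step 2: allocate to the agent with the lowest insertion cost.
        winner = min(agents, key=lambda a: insertion_cost(sub, a))
        # Step 3: the winner accomplishes the subtask, recursing if needed.
        results.append(accomplish(winner, sub))
    # Step 4: the allocating agent synthesizes the partial results.
    return results
```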

From the perspective of distributed problem solving, the task allocation and the result synthesis are the most crucial parts. However, from the planning perspective, the other two phases are more important. The allocation problem is usually solved by contracting and negotiation techniques, which imply problems related to the resource-allocation domain, e.g., cross booking, overbooking, backtracking, and others. In the allocation phase, a hierarchy of agents is established, which may not be fixed in heterogeneous multiagent systems.

The decomposition and delegation principle is widely used in agent-based approaches to problem solving and planning, and it shows great applicability to realistic problems. Taking into account Brafman and Domshlak's analysis, the efficiency of Durfee's task-sharing approach is tightly bound to the solver's ability to reduce the problem sizes for individual agents and to keep the dependencies between agents low.



Fig. 1. Abstract architecture of agent-based solver/planner.


In domains where the optimization/planning problem can be decomposed into independent tasks, the multiagent approach shows its benefits. Such tasks can be allocated and executed by different agents with little or no influence on each other. In this paper, we assume that such a problem decomposition exists. In the rest of the paper, we assume that tasks are independent unless stated otherwise.

We can define the abstract multiagent solver architecture as a composition of three types of agents (see Fig. 1).

1) Task agent: This agent is responsible for preprocessing the problem. It may use a domain-specific heuristic, a generic ordering strategy, or a randomized method.

2) Allocation agent: This agent is responsible for the decomposition of the problem into tasks and for the delegation of the tasks to the resource agents. It maintains the task allocation and the result synthesis. This agent's strategies and algorithms are domain-independent.

3) Resource agent: This agent is responsible for individual, case-specific resource planning. In the case of further decomposition, the task is handed over to another task agent.

The multiagent system built upon this architecture is composed of one task agent, one allocation agent, and a set of resource agents. The resource agents represent the distributed nature of the multiagent problem; a structural sketch of the three roles follows.
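The sketch below is our assumption of how the three roles of Fig. 1 could be laid out in code; the class and method names are illustrative, not code from the paper. The ResourceAgent interface is reused by the later sketches.

```python
# Skeleton of the three abstract agent types from Fig. 1 (illustrative only).

class TaskAgent:
    """Preprocesses the problem, e.g., orders tasks by a domain heuristic."""
    def preprocess(self, tasks):
        return sorted(tasks, key=self.ordering_key)

    def ordering_key(self, task):
        return 0  # domain-specific; a constant key keeps the input order

class ResourceAgent:
    """Plans its own resources and answers cost queries for single tasks."""
    def __init__(self):
        self.plan = []  # tasks currently allocated to this agent

    def cost_est_insert(self, task):
        raise NotImplementedError  # case-specific planning heuristic

    def insert(self, task):
        self.plan.append(task)

    def cost_est_remove(self, task):
        raise NotImplementedError

    def remove(self, task):
        self.plan.remove(task)

class AllocationAgent:
    """Domain-independent: decomposes, allocates via CNP, synthesizes results."""
    def __init__(self, task_agent, resource_agents):
        self.task_agent = task_agent
        self.resource_agents = resource_agents
```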

For complex hierarchical systems, this abstract architecture can be reflected in the multiagent system recursively, it can be reduced (i.e., one agent undertakes the role of more than one abstract agent type), or it can be parallelized (multiple abstract solvers are instantiated with potentially overlapping agents, e.g., several task agents or allocation agents handling various problems in parallel). In large systems, concurrent interactions may arise that need to be handled. The agents' interactions are guided by interaction protocols, which are mostly built on Smith's contract net protocol (CNP) [6].

A. Multiagent Problem

The multiagent solver uses the principles of problem decomposition and delegation to autonomous agents that solve parts of the problem individually. The overall solution is then obtained by merging the individual agents' results.

The optimization based on CNP interactions in cooperative environments is usually described as utilitarian social welfare maximization [7]. Therefore, the objective function of the abstract algorithm can be defined as the maximization of the social welfare, which is as follows:

sw = \sum_{a \in A} u_a \qquad (1)

where A = \{a_1, \dots, a_n\} is the population of agents and u_a is the utility of agent a. In our case, the social welfare can be computed as a sum of the utilities of the resource agents R \subseteq A, which can be defined as follows:

u_a = \sum_{t \in T_a} \left( rew(t) - cost(t, a) \right) = \sum_{t \in T_a} rew(t) - cost(T_a) \qquad (2)

where T_a is the set of tasks allocated to the agent a \in R, rew(t) is the reward for fulfilling task t, cost(t, a) is the cost for agent a to perform task t, and

cost(T_a) = \sum_{t \in T_a} cost(t, a) \qquad (3)

is the cost of the overall plan of an agent. The total reward for fulfilling the set of all tasks T is as follows:

rew(T) = \sum_{a \in R} rew(T_a) = \sum_{a \in R} \sum_{t \in T_a} rew(t) \qquad (4)

so the social welfare can be expressed as follows:

sw = rew(T) - \sum_{a \in R} cost(T_a) = rew(T) - \sum_{t \in T} cost(t, a) \qquad (5)

Since we assume the same quality of task fulfillment by any agent, the reward k = rew(T) is not influenced by the allocation of tasks to the agents. We can therefore rewrite the social welfare as follows:

sw = k - \sum_{t \in T} cost(t, a) \qquad (6)

As noted earlier, the goal of CNP-based multiagent optimization in cooperative environments is social welfare maximization. By (6), this is equivalent to the minimization of the solution cost, where cost(t, a) is evaluated by the resource agent a undertaking task t. The objective function of the abstract solver is then

\sum_{t \in T} cost(t, a) \qquad (7)

The task-allocation stage of the solver searches for the most suitable mapping of the tasks T to the resource agents R that minimizes the objective function given by (7). We can define the goal of the allocation as finding a partition P of the set of tasks T such that

\operatorname*{argmin}_{P} \sum_{i=1}^{v} cost(T_i) \qquad (8)

where v is the number of resource agents, T_i is the subset of tasks allocated to the resource agent a_i, cost(T_i) is the cost of the overall plan of agent a_i performing T_i as defined by (3), and

T_i \subseteq T, \qquad \bigcup_{i=1}^{v} T_i = T \qquad (9)

\forall i, j : T_i \cap T_j = \emptyset \quad \text{iff} \quad i \neq j \qquad (10)
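As a toy illustration of (7), (9), and (10), the following checks a hand-made partition against the constraints and evaluates its cost; all numbers are invented for the example, not data from the paper:

```python
# Toy instance of (7), (9), (10): three tasks, two resource agents;
# the costs are invented numbers.
tasks = {"t1", "t2", "t3"}
partition = {"a1": {"t1", "t3"}, "a2": {"t2"}}
cost = {("t1", "a1"): 4, ("t3", "a1"): 2, ("t2", "a2"): 5}

parts = list(partition.values())
assert set().union(*parts) == tasks                      # covering, (9)
assert all(p.isdisjoint(q) for i, p in enumerate(parts)
           for q in parts[i + 1:])                       # disjointness, (10)

# Objective (7): total cost of the allocation, to be minimized.
total = sum(cost[(t, a)] for a, Ta in partition.items() for t in Ta)
print(total)  # 11
```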

B. Abstract Algorithm

The abstract algorithm representing the presented multiagent solver, which attempts to minimize the objective function defined by (7), is captured by Algorithm 1. According to the abstract architecture depicted in Fig. 1, it contains three phases, as follows.

1) The first phase of the function solve is task preprocessing, provided by the task agent. The ordering heuristic represents case-specific sorting of the tasks to increase the solver's efficiency in the particular domain. In some cases, the ordering has no influence, but in others, it may provide a significant improvement, especially in domains with stronger task dependencies.

2) The second phase is an iteration over all tasks with allocation performed by the allocation agent, minimizing the insertion cost computed by the resource agents (the allocateCNP function). As part of this iteration, dynamic improvement based on the cooperation of the allocation agent and all resource agents takes place: the improvement strategy is applied to every resource agent after the allocation of each task (see below for the description of the improvement strategies).

3) The third phase of the solve function is the final improvement of the solution. After the allocation of all tasks, the improvement strategy is executed by all resource agents.

The algorithm is based on local optimization of a single-task insertion and subsequent improvement. Each iteration of the algorithm provides a greedy (order-dependent) task allocation supported by a locally optimized solution of resource utilization (which can be seen, from a global point of view, as hill-climbing search). The algorithm does not use any backtracking mechanism or exhaustive search of the state space. This has a significant impact on the algorithm's computational complexity (see Section II-C), but makes the algorithm susceptible to finding only locally efficient solutions. The global solution quality is improved by the execution of the improvement strategies.
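Since the pseudocode of Algorithm 1 is only referenced in this text, the following is our hedged Python reading of its three phases, built on the ResourceAgent interface assumed above; improve stands for any of the improvement strategies described next:

```python
# Hedged reconstruction of the three phases of Algorithm 1 as described in
# the text; this is our sketch, not the authors' pseudocode.

def solve(tasks, task_agent, resource_agents, improve):
    # Phase 1: case-specific preprocessing (ordering heuristic).
    ordered = task_agent.preprocess(tasks)

    # Phase 2: greedy CNP allocation with dynamic improvement.
    for t in ordered:
        allocate_cnp(t, resource_agents)
        for a in resource_agents:       # improvement after each allocation
            improve(a, resource_agents)

    # Phase 3: final improvement executed by all resource agents.
    for a in resource_agents:
        improve(a, resource_agents)

def allocate_cnp(t, resource_agents):
    # Winner determination per (11): lowest estimated insertion cost.
    winner = min(resource_agents, key=lambda a: a.cost_est_insert(t))
    winner.insert(t)                    # allocation per (12)
```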

The resource agent uses a case-dependent resource-planning heuristic for these computations. The functions for allocation are as follows.

1) Insertion estimation cost_estI(t, a): The estimation of the cost of the task insertion. It represents the increase of agent a's cost function caused by undertaking the task t.

2) Insertion cost_insert(t, a): The real cost of the task insertion. This value is determined by adding a new task t to the plan of the agent a in the current state. It is the result of the particular resource-planning algorithm of the resource agent.

The opposite functions, used by the improvement strategies, are as follows.

1) Removal estimation cost_estR(t, a): The estimation of the cost of the task removal. It represents the decrease of agent a's cost function caused by dropping the task t.

2) Removal cost_remove(t, a): The real cost of the task removal. This value is determined by removing the task t from the plan of agent a in the current state. It is the result of the particular resource-planning algorithm of the resource agent.

The allocation in the CNP part of Algorithm 1 is based on the determination of the winner agent. The winner for task t is the resource agent a with the lowest insertion cost; therefore,

winner = \operatorname*{argmin}_{a \in R} cost_{estI}(t, a) \qquad (11)

The allocation agent allocates an unallocated task t \in T, where \forall a_i \in R : t \notin T_{a_i}, to the winner agent a:

allocate(T_a, t) \Rightarrow t \in T_a \qquad (12)

provided that a local plan of agent a exists and the agent a is able to fulfill this task using the plan underlying the cost estimation cost_estI(t, a) used in (11).


One of the improvement strategies (see below) is based on the identification of the most resource-consuming task of a resource agent. We define the worst task t_w of agent a as follows:

t_w = \operatorname*{argmax}_{t \in T_a} cost_{estR}(t, a) \qquad (13)

i.e., the savings (the difference between the total plan cost with and without this task) are maximized.

The improvement strategies basically swap tasks between agents: we say an agent a_i delegates task t \in T_i to an agent a_j

delegate(T_i, T_j, t) \Rightarrow t \notin T_i \wedge t \in T_j \qquad (14)

An admissible delegation of task t from agent a_i to agent a_j, where i \neq j, is a delegation that satisfies the improvement condition

cost_{estR}(t, a_i) - cost_{estI}(t, a_j) > 0 \qquad (15)

The improvement strategies used by the abstract algorithm are as follows (see Algorithm 2 for more details; a sketch in code follows the list).

1) Delegate worst (DW): Each resource agent a_i identifies its worst task t_w according to (13) and tries to delegate it to another agent a_j if the improvement condition defined by (15) holds and a_j is the allocation winner according to (11).

2) Delegate all (DA): Each resource agent a_i delegates all its tasks T_i to the winner of each task if the improvement condition is satisfied.

3) Reallocate all (RA): Each resource agent successively removes all its tasks from the plan and allocates them again using the CNP strategy. The result of the allocation can be the same as before the removal, a change of the position of the task in the current agent's plan, or a delegation to another agent. To ensure the proper function of the RA improvement strategy, we require, for the successive removal and insertion of task t from/to agent a, that

cost_{insert}(t, a) = cost_{remove}(t, a) \qquad (16)

i.e., when removing and reinserting task t \in T_a, cost(T_a) does not increase.
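The sketch below is our reading of the three strategies of Algorithm 2, expressed against the ResourceAgent interface assumed earlier; every delegation is guarded by the improvement condition (15):

```python
# Sketch of the DW, DA, and RA improvement strategies (our reading of
# Algorithm 2); helper names are ours, condition (15) guards delegations.

def delegate_worst(agent, all_agents):
    """DW: try to delegate the most expensive task, per (13), (15), (11)."""
    candidates = [a for a in all_agents if a is not agent]
    if not agent.plan or not candidates:
        return
    tw = max(agent.plan, key=agent.cost_est_remove)            # worst task (13)
    winner = min(candidates, key=lambda a: a.cost_est_insert(tw))       # (11)
    if agent.cost_est_remove(tw) - winner.cost_est_insert(tw) > 0:      # (15)
        agent.remove(tw)
        winner.insert(tw)

def delegate_all(agent, all_agents):
    """DA: try to delegate every task of the agent."""
    candidates = [a for a in all_agents if a is not agent]
    if not candidates:
        return
    for t in list(agent.plan):
        winner = min(candidates, key=lambda a: a.cost_est_insert(t))
        if agent.cost_est_remove(t) - winner.cost_est_insert(t) > 0:    # (15)
            agent.remove(t)
            winner.insert(t)

def reallocate_all(agent, all_agents):
    """RA: remove each task and re-run the CNP allocation over all agents,
    including the original owner; (16) guarantees no cost increase."""
    for t in list(agent.plan):
        agent.remove(t)
        winner = min(all_agents, key=lambda a: a.cost_est_insert(t))
        winner.insert(t)
```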

C. Complexity Analysis

The general computational complexity of the multiagent solver is introduced in [5]. Using a transformation of the multiagent-planning problem to the distributed constraint-satisfaction problem, the worst-case time complexity of multiagent planning is upper bounded by

f(I) \times \exp(comm) + \exp(int) \qquad (17)

where f(\cdot) is the factor induced by requesting each agent to plan while committing to a certain sequence of actions, I is the complexity of an individual agent's planning, \exp(comm) represents a factor exponential in the min-max number of per-agent commitments, and the additive factor \exp(int) represents the interactions of agents.

The consequences of (17) lead to interesting features of the multiagent solver: there is 1) no direct exponential dependence on the number of agents; 2) no direct exponential dependence on the size of the planning problem or the size of the joint plan; and 3) no direct exponential dependence on the length of individual agent plans [5].

In our case, feature 2) resulting from (17) does not have a strong impact if the decomposition algorithm of the allocation agent is exponential, because the size of its problem is the same as the size of the overall problem; for the resource agents (and other subordinate agents in the case of a complex hierarchical structure), however, this feature holds. The exponential factors are usually reduced by the polynomial heuristics: the decomposition, allocation, optimization, and resource-planning strategies implemented in real applications. The ordering strategy of the task agent does not have a strong influence on the worst-case complexity because of its additive nature and low complexity (provided that tasks can be compared in constant time). The multiagent solver benefits in domains where the problem can be easily decomposed into independent tasks and/or where polynomial heuristics for the resource planning exist.

For the abstract algorithm (see Algorithm 1), (17) can be represented as follows:

n \log(n) + n \times O_{alloc} + n \times m \times O_{impr} \qquad (18)

where n denotes the number of tasks and m is the number of resource agents. The first part represents the ordering heuristic and its complexity corresponds to the complexity of standard sorting algorithms. The middle part is the complexity of the allocation part of the algorithm, and the last part is the improvement part of the algorithm.

The allocation and improvement time complexities of the algorithm are defined as follows:

O_{alloc} = m \times O(cost_{estI}(t, a)) + O(cost_{insert}(t, a))

O_{impr} = fi(n')

where n' = n/m is the average number of tasks allocated to a particular resource agent and fi(n') is the factor representing the complexity of the implemented agents' improvement strategy. We assume

O(cost_{estI}(t, a)) = O(cost_{insert}(t, a)) = fr(n')

O(cost_{estR}(t, a)) = O(cost_{remove}(t, a)) = fr'(n')

where fr(n') is the factor representing the complexity of the implemented agents' resource-planning strategy for the task insertion and fr'(n') is the corresponding factor for the task removal. Therefore, the general worst-case time complexity of the presented abstract algorithm is as follows:

n \times \left( \log(n) + m \times (fr(n') + fi(n')) \right) \qquad (19)

The complexities of the improvement strategies defined in Section II-B are as follows:

fi_{DW}(n') = O_{worst} + fr'(n') + m \times fr(n')

fi_{DA}(n') = fi_{RA}(n') = n' \times (fr'(n') + m \times fr(n'))

where O_worst is the complexity of the identification of the worst task in the plan, i.e., finding the task with the greatest cost_estR(t, a). It can be upper bounded by iterating through all n' tasks and computing the removal cost of each one, so its worst-case complexity is upper bounded by O_worst = O(n' \times fr'(n')). For the combination of all described improvement strategies, the improvement complexity is upper bounded by

fi(n') = n' \times fr'(n') + m \times n' \times fr(n') \qquad (20)

The factor m \times n' = m \times (n/m) = n, so the improvement complexity has no relation to the number of agents. Combining (19) and (20) and taking n' = n, the worst-case time complexity of the abstract algorithm is as follows:

n \log(n) + n^2 \times fr'(n) + m \times n^2 \times fr(n) \qquad (21)

The presented complexity analysis shows the polynomial impact of the decomposition and delegation principles used by the multiagent abstract solver. The impact of the two factors introduced by (17) is the following:

1) the complexity of the operations of a resource agent is multiplied by n^2;

2) the influence of the number of resource agents is linear.

The complexity analysis shows an important feature of the agent-based solver: when polynomial heuristics are used for task insertion and removal, the implemented multiagent solver provides polynomial worst-case complexity. Together with the linear computational scaling with the number of agents, this makes the presented abstract architecture suitable for many application areas.

In real application areas, ordering heuristics can often be found that result in an allocation with no need for improvement strategies (e.g., the production-planning system described in Section III-D). In other cases, the planning of the resource agents is implemented with low complexity (e.g., linear) and the improvement strategy has greater importance. In Section III, we present several applications developed using the described abstract multiagent solver.

D. Incremental Improvement Strategy

The basic multiagent solver can be enhanced by the incremental improvement strategy. Algorithm 1 allocates the tasks and runs the improvement strategies (both dynamic and final), keeping the computational complexity low (i.e., guaranteeing a good response time of the algorithm). In many cases, it is beneficial to perform incremental improvement of the solution when there is enough time to wait for better results or when the environment changes dynamically. Changes of the task constraints, resource availability, and execution uncertainty affect the efficiency of the static plan during execution. Algorithm 1 optimizes (7) in a single run under conditions that are valid at the time of algorithm execution. When some conditions change, the solution quality degrades. In such a case, the incremental improvement strategy should be used to keep the solution up to date with the dynamic environment changes and/or to improve the solution over time.

The nature of the task-allocation process of the multiagent solver enables us to run the improvement strategies continuously and to interrupt them during any run without losing the correctness of the solution (assuming delegation is an atomic process). The incremental improvement strategy described in Algorithm 3 is an anytime algorithm that monotonically improves the quality of the solution. The improvement strategies (see Algorithm 2) are executed until the solution is no longer being improved (i.e., the overall cost of the solution defined by (7) does not change between iterations).


The inner loop iterates over all resource agents and provides the same features (complexity and convergence) as the improvement part of Algorithm 1. The outer loop terminates when the quality of two consecutive solutions is the same; a sketch of this anytime loop follows.
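The sketch below is our reading of Algorithm 3; plan_cost is an assumed ResourceAgent method returning cost(T_a) for that agent, not a name from the paper:

```python
# Sketch of the incremental improvement strategy (Algorithm 3, our reading).
# plan_cost is an assumed method returning the agent's plan cost cost(T_a).

def solution_cost(resource_agents):
    return sum(a.plan_cost() for a in resource_agents)   # objective (7)

def incremental_improvement(resource_agents, improve):
    previous = float("inf")
    current = solution_cost(resource_agents)
    while current < previous:            # stop when no iteration improves
        for a in resource_agents:        # inner loop, as in Algorithm 1
            improve(a, resource_agents)
        previous, current = current, solution_cost(resource_agents)
```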


E. Resource Agent Strategy Admissibility

The features of the presented multiagent solver strongly depend on the functions provided by the resource agents. We investigate the correctness of the allocation process and the stability of delegation, and we define the constraints on the functions provided by an admissible resource agent strategy.

When using the standard CNP, we require the estimation functions of resource agent a to provide accurate estimations of the insertion/removal of task t, i.e.,

cost_{estI}(a, t) = cost_{insert}(a, t) \qquad (22)

cost_{estR}(a, t) = cost_{remove}(a, t) \qquad (23)

In the case of more advanced allocation protocols with backtracking (e.g., the extended CNP [8]), this accuracy can be relaxed. In all cases, these constraints defined on the resource agent functions ensure that Algorithm 1 is able to find a local minimum of the cost function (7) and guarantee the proper behavior of the improvement strategies.

The estimation functions provided by a resource agent strategy are admissible only if

cost_{estI}(a, t) \approx cost_{insert}(a, t) \qquad (24)

cost_{estR}(a, t) \approx cost_{remove}(a, t) \qquad (25)

where for the upper bound of the estimation error it holds that

|cost_{insert}(a, t) - cost_{estI}(a, t)| < e_I \qquad (26)

|cost_{remove}(a, t) - cost_{estR}(a, t)| < e_R \qquad (27)

and the delegation improvement condition according to (14) and (15) is modified to

cost_{estR}(t, a_i) - cost_{estI}(t, a_j) > e_I + e_R \qquad (28)

Keeping the defined constraints (the improvement condition, the estimation error, and the admissibility of the estimation functions), the execution of any of the improvement strategies either reduces the solution cost or leaves the solution unaffected (i.e., the new solution is not worse than the previous one), and the termination of the incremental improvement strategy is ensured after a finite number of steps (if the environment does not change during the execution of the improvement algorithm).

Using the features discussed in the previous sections, we can state that an admissible resource agent strategy (its internal heuristics and its estimation and allocation functions) has to (a sketch of the guarded delegation check follows the list):

1) use admissible estimation functions according to (24) and (25);

2) bound the estimation error according to (26) and (27);

3) fulfill the improvement condition defined by (28) (the delegation is admissible only if the solution cost improves);

4) when using the reallocate-all improvement strategy, ensure that (16) holds (i.e., when reallocating, the new solution has to be better than or at least the same as the previous one);

5) keep the algorithm execution time low compared to the frequency of the environment changes (the environment is considered static during the execution of the algorithm).
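As a compact illustration of conditions (26)-(28), a guarded delegation test might look as follows; this is a sketch under the interface assumed in Section II, with e_insert and e_remove standing for the bounds e_I and e_R:

```python
# Sketch of an admissible delegation test under bounded estimation errors
# (26)-(28); e_insert and e_remove stand for the bounds e_I and e_R.

def admissible_delegation(t, agent_i, agent_j, e_insert, e_remove):
    """True iff delegating t from agent_i to agent_j cannot worsen (7)."""
    saving = agent_i.cost_est_remove(t)    # estimate, error below e_remove
    expense = agent_j.cost_est_insert(t)   # estimate, error below e_insert
    return saving - expense > e_insert + e_remove      # condition (28)
```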

III. APPLICATIONS

Due to their high flexibility, robustness, and scalability, agent technologies promise wide industrial applicability [9], even in real-time environments with a need for dynamic reconfiguration and replanning [10]. This section presents four examples of multiagent systems implemented using the presented abstract architecture. They demonstrate the applicability of the concept in a wide variety of real-problem domains: vehicle routing problems (VRPs), strategic mission planning, multirobot frontiers exploration, and production planning. All presented multiagent systems share the same abstract architecture, algorithm, and improvement strategies. For each application domain, a particular resource-optimization heuristic has been designed and implemented by the resource agents. All the presented systems' implementations have low computational complexity and provide high-quality solutions.

A. VRP Solver

The VRP is a well-known optimization problem. The problem is NP-hard and was originally defined as the routing of a fleet of gasoline delivery trucks between a terminal and a number of service stations. The trucks have load-capacity limitations, and the deliveries have to be accomplished at minimum total cost (distance traveled).

An agent-based approach to a variant of the VRP has been presented, for example, in [11]. Zeddini et al. use three types of agents: client, bidder, and vehicle agents. The approach is based on CNP allocation, and the optimization is based on the exchange of tasks between the vehicle agents. The vehicle agents use an insertion heuristic and an improvement strategy for task swapping between them. The error of the solution (compared to the optimal solution) presented in the paper is 4%-29% for standard benchmark problems.

A similar approach has been used in [12] for a dynamic variant of the k-VRP (new tasks are added during the execution), where the initial allocation is generated using a centralized algorithm. The dynamic task allocation is made by the CNP protocol. Then, two improvement phases are applied: intraroute optimization is applied to each agent's route, and interroute optimization is performed between the vehicle agents.

A VRP-family solver built upon the abstract multiagent solver introduced in this paper is presented in [13]. The solver uses three ordering strategies, a combination of all improvement strategies with dynamic improvement enabled or disabled, and a resource agent algorithm implemented using a standard cheapest-insertion heuristic for the traveling salesman problem [14].

The objective function is based on the distance traveled by the vehicles (represented by resource agents) and fully corresponds to (7). The implemented RA strategy fulfills the improvement conditions defined by (15) and (16) and the admissibility conditions defined by (22) and (23).

Fig. 2. Example of the VRP solver result.

The complexity of the resource agent cost-computation functions is kept low, i.e., fr(n') = O(n') and fr'(n') = O(1). The worst-case time complexity defined by (21) (using n' = n/m) is then O(n^3), since n log(n) + n^2 x O(1) + m x n^2 x O(n/m) = O(n^3); this bound has also been confirmed experimentally on benchmark instances.

An example of a solution provided by the solver is shown in Fig. 2. There are 52 nodes served by seven vehicles (the optimal number of vehicles has been obtained). The circles at the nodes represent the size of the transportation demand. The quality of the solution to the problem presented in Fig. 2 is 93.4% compared to the optimal path length. We measure the quality as

\frac{cost_{optim}}{cost_{solver}} \times 100 \, [\%]

where cost_solver is the solution cost provided by the solver and cost_optim is the cost of the best known solution. The optimal vehicle paths are depicted by thin green lines, and the solution produced by the multiagent solver is represented by thick red lines.
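A toy check of this measure follows; the two costs are invented numbers, chosen only to reproduce a 93.4% quality:

```python
# Toy check of the quality measure; the costs below are invented numbers.
def solution_quality(cost_solver, cost_optim):
    """Quality in percent; 100% means the solver matched the best known cost."""
    return cost_optim / cost_solver * 100.0

print(round(solution_quality(cost_solver=590.0, cost_optim=551.1), 1))  # 93.4
```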

For the 115 evaluated benchmark instances (standard VRP benchmark problems), the multiagent solver provides solutions with a quality of at least 81% compared to the optimal solution and with an average quality better than 91% (corresponding to solution errors of 19% and 9%, respectively). The self-organizing capability of the system successfully minimizes the number of vehicles used. The results show that applying the dynamic improvement strategy provides better results than batch processing with the final improvement strategy only. The best performance has been reached using the DA and RA improvement strategies. The implemented solver demonstrates very good applicability of the abstract multiagent solver to the family of routing problems and easy adaptation to problem variants.

B. Strategic Mission Planning Using Social Commitments

The multiagent abstract solver presented in this paper has been used in a system for distributed planning and coordination in a dynamic, nondeterministic, multiactor, mixed-initiative environment [15]. The system provides flexible planning, replanning, and task allocation. The system addresses several issues that have to be solved in order to fulfill the requirements on a planning system operating in dynamic nondeterministic environments. An overview of the problems is as follows.

1) Distributed planning: Planning in such an environment is practically realizable only as a distributed process.

2) Distributed resource allocation: An integral part of the planning process is resource allocation, both of the acting entities in the world and of the static resources; as the planning process is distributed, the resource allocation has to be distributed as well.

3) Distributed plan execution and synchronization: The constituted distributed plan, consisting of several personal plans, has to be executed by the entities. The personal plans need to be coordinated in a distributed manner, as the entities do not know each other's plans.

From the perspective of the allocation agent, the planning process can be divided into three phases: 1) the hierarchical task network I-Plan planner [16] is used to create an abstract plan for a long time horizon; 2) the plan-instantiation process uses distributed resource allocation based on the CNP [6]; and 3) with the help of this protocol, the appropriate subordinate agents (resource agents) are found and the responsibilities for the plan actions are fixed. The internal resource agent optimization uses an as-early-as-possible scheduling heuristic. The heuristic causes the earliest possible execution of the plan actions, which affects the length of the whole plan in the nondeterministic environment. The effect is directly proportional to the amount of nondeterminism in the world (which has been simulated as random prolonging of the action durations). The agents' hierarchy in the system is captured by Fig. 4: there is a commander creating tasks for builders, builders implementing the abstract task agent and allocation agent roles in parallel, and finally a set of trucks implementing the abstract resource agent role.

All plans are described in the form of social commitments substituting plan actions. A commitment is a knowledge-base structure describing an agent's obligation to change the world state and a set of rules defining what the agent should do if the obligation is not satisfiable [17]. The commitments enable an expressive description of the decommitment rules and thus of the replanning process; they capture the improvement strategies executed at the moment the environment changes in a way that collides with the plan.

In other words, the replanning process (i.e., the plan improvement process) by means of social commitments can be described as successive recommitting [17]. For decommitting purposes, three basic decommitment rules were used: full decommitment, delegation, and relaxation [18].

The deployment scenario of the system is a disaster relief operation, where the resources have to be transported to the impacted zone [19]. There are material resources, emergency units, and transport units in the scenario. The emergency units create plans and request transportation for themselves and for the required material resources. The problem is to find a global plan such that all units and material are transported to the demanded area as soon as possible using the limited transportation resources (see a screenshot of the scenario island in Fig. 3).


Fig. 5. Cooperative frontiers explorationóscenario screenshot.

Fig. 3. Strategic mission planning systemóscenario island screenshot.

Fig. 4. Strategic mission planning systemóhierarchy of agents.


The entire system provides fast convergence to an efficient solution with a time complexity of O(n^3). The heuristic of the resource agent strategy is admissible in the terms defined in Section II-B and provides fr(n') = O(1) and fr'(n') = O(n'). The plan-execution stability in the dynamic environment is enhanced using the improvement strategies captured by the decommitment rules, and thus by the incrementally invoked abstract algorithm presented in Section II-D.

C. Cooperative Frontiers Exploration

The presented multiagent solver has also been utilized for the cooperative frontiers exploration problem [20]. The problem is to find the shortest path for a convoy of vehicles through a partially known urban area. The street map of the area is known a priori, but the actual condition of the routes is not. The convoy moves through the city using a path-planning algorithm that incorporates the information obtained by a set of small autonomous vehicles. These vehicles explore the area ahead of the convoy (see the scenario screenshot in Fig. 5).

The goal of the multiagent system is to find the shortest path through the area while ensuring that 1) the convoy will not stop or U-turn because of a street blockade and 2) the total traveled path of all vehicles is minimized. The multiagent system is composed of one convoy agent (the task agent and allocation agent roles) and a set of unmanned ground vehicle (UGV) agents (the resource agent role). Besides path planning and navigating through the city, the convoy agent generates a set of frontiers for exploration [21]. In this case, it is not a classical mapping task in an unknown area; the frontiers represent points of interest in the known street map that should be investigated to ensure the convoy may freely pass along the desired route. A similar robotic problem has also been solved using a multiagent system for distributed robotic planning, e.g., in [22]. The first goal (nonstop convoy movement toward a target) is secured using an incremental replanning algorithm for convoy path planning based on the anytime planning algorithm D* [23]. The second goal (minimization of the traveled path) is handled by a multiagent solver using the presented architecture and algorithms. It is shown in [20] that the multiagent solver provides almost real-time response and significantly reduces the convoy's traveled path even with a small number of UGVs, keeping the overall traveled-distance overhead low.

The convoy agent dynamically creates points of interest (exploration frontiers) and allocates them to the UGV agents using Algorithm 1. When any new information is discovered, Algorithm 3 is used, and new frontiers are generated and allocated (or some frontiers can be removed). The UGV agents use a route-optimization heuristic that attempts to minimize the traveled distance, similar to the VRP described earlier. It fulfills the improvement condition defined by (15) and the admissibility conditions defined by (24) and (25). The computational complexities of the strategy are fr(n') = O(n') and fr'(n') = O(1). The experimentally evaluated worst-case time complexity of the multiagent solver has been upper bounded by O(n^3).


Fig. 6. Screenshot of multiagent production-planning system.

D. Production Planning

Classical planning systems (using scheduling algorithms with various heuristics, constraint logic programming, genetic algorithms, and simulated annealing [24], [25]) work centrally and usually allocate resources in one run for every product (order) present in the system. These methods mostly use stochastic algorithms and generate near-optimal solutions minimizing defined criteria (e.g., a sum of weighted tardiness and inventory costs). Such a solution is fully sufficient; however, the required replanning and rescheduling affects the entire plan. The plan is usually completely rebuilt, and a random aspect of the algorithms can cause major (unwanted) changes in plans after replanning. This might not be suitable for many manufacturing areas. With physical distribution of the production units, it is advantageous to decompose and distribute the planning problem [1].

Multiagent technology addresses all the phases of manufacturing decision-making support, although there are few implementations of multiagent systems that cover more than a single stage. There are solutions ranging from low-level scheduling or control systems, through the product configuration and quotation phases, to short-term and long-term production planning and supply-chain management [26].

An example of a system that uses the same conceptual fundamentals as the multiagent abstract solver presented in this paper is given in [27] and [28]. One of the goals of the multiagent system is to create a production plan for a middle- and long-term horizon to give an overview of resource utilization (see Fig. 6 for a screenshot of the multiagent production-planning system). The production resources are represented by the resource agents maintaining the constraints and capabilities of the individual production workshops. The task agent uses a classical task-ordering heuristic based on weighted earliest deadline first, which significantly improves the solution, so there is no need for improvement-strategy execution after the allocation phase of a batch of tasks. The production-order decomposition and planning is provided by the planning agent, which allocates the parts of the production order to the resource agents using the CNP according to Algorithm 1. In the case of environment changes (e.g., an update of production-time estimations, machine breakdowns, delays in production, etc.), Algorithm 3 is executed. The solution of the solver is optimal according to the cost computed as a weighted delay penalty.

The resource agents use an admissible strategy [according to (22), (23), and (15)] that minimizes the weighted average delay of the production orders (see [29] for more details). The computational complexities of the resource agent algorithms are fr(n') = fr'(n') = O(n'); therefore, the overall solver complexity is upper bounded by O(n^4).

IV. CONCLUSION

This paper describes an abstract multiagent solver architecture and an algorithm for implementing a wide variety of practical multiagent planning and problem-solving systems. An algorithm maximizing the social welfare of a cooperative agent community is introduced and analyzed. CNP-based task allocation and solution improvement using task delegation provide a powerful tool for problem solving while keeping the computational complexity within reasonable limits. We also discuss the limitations and admissibility constraints of the resource-optimization heuristic that has to be designed to implement the multiagent solver for a particular problem domain, and we define the resource agent strategy admissibility.

The solver benefits in domains where the decomposition of a problem into independent tasks is possible. Finding independent subsets of tasks significantly reduces the need for interactions between agents. The tasks are planned and executed by individual agents independently, and the overall solution is then merged from the partial ones. The system is able to balance the allocation of the tasks to the agents and to provide an anytime, monotonically improving solution.

The applicability of the presented abstract multiagent solver is demonstrated on several real systems operating in the domains of VRPs, strategic mission planning, multirobot frontiers exploration, and production planning. In all application areas, the implemented system provides low computational complexity (O(n^3) or O(n^4)) with good solution quality.

ACKNOWLEDGMENT

The authors' organizations and research sponsors are authorized to reproduce and distribute reprints and online copies for their purposes notwithstanding any copyright annotation hereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of other parties.

REFERENCES

[1] E. H. Durfee, "Distributed problem solving and planning," in A Modern Approach to Distributed Artificial Intelligence, G. Weiss, Ed. San Francisco, CA: The MIT Press, 1999, ch. 3.


[2] M. E. DesJardins and M. J. Wolverton, "Coordinating a distributed planning system," AI Mag., vol. 20, no. 4, pp. 45-53, 1999.

[3] E. Ephrati and J. S. Rosenschein, "A heuristic technique for multiagent planning," Ann. Math. Artif. Intell., vol. 20, no. 1-4, pp. 13-67, 1997.

[4] M. M. de Weerdt, A. Bos, J. Tonino, and C. Witteveen, "A resource logic for multi-agent plan merging," Ann. Math. Artif. Intell., Special Issue on Comput. Logic in Multi-Agent Syst., vol. 37, no. 1-2, pp. 93-130, Jan. 2003.

[5] R. I. Brafman and C. Domshlak, "From one to many: Planning for loosely coupled multi-agent systems," in Proc. ICAPS, 2008, pp. 28-35.

[6] R. G. Smith, "The contract net protocol: High-level communication and control in a distributed problem solver," IEEE Trans. Comput., vol. C-29, no. 12, pp. 1104-1113, Dec. 1980.

[7] K. J. Arrow, A. K. Sen, and K. Suzumura, Eds., Handbook of Social Choice and Welfare (Handbooks in Economics). Amsterdam, The Netherlands: North Holland, Aug. 2002.

[8] K. Fischer, J. P. Müller, M. Pischel, and D. Schier, "A model for cooperative transportation scheduling," in Proc. 1st Int. Conf. Multiagent Syst., Menlo Park, CA: AAAI Press/MIT Press, Jun. 1995, pp. 109-116.

[9] V. Mařík and J. Lažanský, "Industrial applications of agent technologies," Control Eng. Practice, vol. 15, no. 11, pp. 1364-1380, 2007.

[10] P. Vrba and V. Mařík, "Capabilities of dynamic reconfiguration of multiagent-based industrial control systems," IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 40, no. 2, pp. 213-223, Mar. 2010.

[11] B. Zeddini, M. Temani, A. Yassine, and K. Ghedira, "An agent-oriented approach for the dynamic vehicle routing problem," in Proc. 2008 Int. Workshop Adv. Inf. Syst. Enterprises, 2008, pp. 70-76.

[12] D. Barbucha and P. Jedrzejowicz, "Agent-based approach to the dynamic vehicle routing problem," in Proc. 7th Int. Conf. Pract. Appl. Agents Multi-Agent Syst., ser. Advances in Soft Computing, vol. 55. Berlin/Heidelberg, Germany: Springer, 2009, pp. 169-178.

[13] J. Vokřínek, A. Komenda, and M. Pěchouček, "Agents towards vehicle routing problems," in Proc. Int. Joint Conf. Autonomous Agents Multiagent Syst., pp. 773-780.

[14] D. J. Rosenkrantz, R. E. Stearns, and P. M. Lewis II, "An analysis of several heuristics for the traveling salesman problem," SIAM J. Comput., vol. 6, no. 3, pp. 563-581, 1977.

[15] A. Komenda, J. Vokřínek, M. Pěchouček, G. Wickler, J. Dalton, and A. Tate, "I-Globe: Distributed planning and coordination of mixed-initiative activities," in Proc. Knowledge Systems for Coalition Operations 2009, Chilworth Manor, Southampton, U.K., Mar.-Apr. 2009.

[16] A. Tate, "Intelligible AI planning," in Proc. 20th ES Res. Develop. Intell. Syst. XVII, M. Bramer, A. Preece, and F. Coenen, Eds. Springer, 2000, pp. 3-16.

[17] A. Komenda, M. Pěchouček, J. Bíba, and J. Vokřínek, "Planning and re-planning in multi-actors scenarios by means of social commitments," in Proc. Int. Multiconf. Comput. Sci. Inf. Technol., vol. 3, IEEE, Oct. 2008, pp. 39-45.

[18] J. Vokřínek, A. Komenda, and M. Pěchouček, "Decommitting in multi-agent execution in non-deterministic environment: Experimental approach," in Proc. 8th Int. Joint Conf. Autonomous Agents Multiagent Syst., 2009, pp. 977-984.

[19] A. Komenda, J. Vokřínek, M. Pěchouček, G. Wickler, J. Dalton, and A. Tate, "Distributed planning and coordination in non-deterministic environments (demo)," in Proc. 8th Int. Joint Conf. Autonomous Agents Multiagent Syst., 2009, pp. 1401-1402.

[20] J. Vokřínek, A. Komenda, and M. Pěchouček, "Cooperative agent navigation in partially unknown urban environments," in Proc. 3rd Int. Symp. Practical Cognitive Agents Robots, 2010, pp. 46-53.

[21] A. Visser, Xingrui-Ji, M. van Ittersum, L. A. González Jaime, and L. A. Stancu, "Beyond frontier exploration," in RoboCup 2007: Robot Soccer World Cup XI, ser. Lecture Notes in Computer Science, vol. 5001. Berlin/Heidelberg, Germany: Springer, 2008, pp. 113-123.

[22] S. Thayer, B. Digney, M. B. Dias, A. T. Stentz, B. Nabbe, and M. Hebert, "Distributed robotic mapping of extreme environments," in Proc. SPIE: Mobile Robots XV and Telemanipulator and Telepresence Technologies VII, vol. 4195, Nov. 2000, pp. 84-95.

[23] M. Likhachev, D. Ferguson, G. Gordon, A. T. Stentz, and S. Thrun, "Anytime Dynamic A*: An anytime, replanning algorithm," in Proc. Int. Conf. Automated Planning Scheduling, Jun. 2005, pp. 262-271.

[24] N. Sadeh and M. S. Fox, "Variable and value ordering heuristics for the job shop scheduling constraint satisfaction problem," Artif. Intell., vol. 86, no. 1, pp. 1-41, 1996.

[25] J. Kubalík, L. Rothkrantz, and J. Lažanský, "Genetic algorithms with limited convergence," in Information Processing with Evolutionary Algorithms, ser. Advanced Information and Knowledge Processing. London, U.K.: Springer, 2005, pp. 233-253.

[26] M. Pěchouček, J. Vokřínek, and P. Bečvář, "ExPlanTech: Multiagent support for manufacturing decision making," IEEE Intell. Syst., vol. 20, no. 1, pp. 67-74, Jan./Feb. 2005.

[27] J. Vokřínek, J. Hodík, M. Pěchouček, P. Bečvář, and J. Pospíšil, "ExtraPlanT as a multi-agent system for extra-enterprise collaboration," Information Science Reference, 2008, pp. 593-600.

[28] J. Hodík, P. Bečvář, M. Pěchouček, J. Vokřínek, and J. Pospíšil, "ExPlanTech and ExtraPlanT: Multi-agent technology for production planning, simulation and extra-enterprise collaboration," Int. J. Comput. Syst. Sci. Eng., vol. 20, no. 5, pp. 357-367, 2005.

[29] P. Bečvář, P. Charvát, M. Pěchouček, J. Pospíšil, A. Říha, and J. Vokřínek, "ExPlanTech/ExtraPlanT: Production planning and supply-chain management multi-agent solution," EXP - In Search of Innovation, vol. 3, no. 3, pp. 116-125, 2003.

Jiří Vokřínek received the Master's degree in technical cybernetics from Czech Technical University in Prague, Prague, Czech Republic, where he is working toward the Ph.D. degree at the Agent Technology Center, Department of Cybernetics.

He is a Research Scientist at the Agent Technology Center. His research interests include agent-based planning and simulation, artificial intelligence, multiagent systems, planning and replanning in manufacturing, virtual organizations, supply-chain management, and logistics.

Antonín Komenda received the Master's degree in technical cybernetics from Czech Technical University in Prague, Prague, Czech Republic, where he is currently working toward the Ph.D. degree at the Agent Technology Center, Department of Cybernetics. His studies focus on multiagent systems.

He is also a Researcher at the Agent Technology Center. His current research interests include multiagent systems, distributed planning, and plan repairing.

Michal Pěchouček received the Master's degree in IT: knowledge-based systems from the University of Edinburgh, Edinburgh, U.K., and the Ph.D. degree in artificial intelligence and biocybernetics from Czech Technical University in Prague, Prague, Czech Republic.

He is a Professor in artificial intelligence at the Department of Cybernetics, Czech Technical University in Prague, where he is currently the Head of the Agent Technology Center. His research interests include problems related to multiagent systems, especially topics related to social knowledge, metareasoning, acting in communication inaccessibility, coalition formation, agent reflection, and multiagent planning.