Distributed constraint optimization (DCOP or DisCOP) is the distributed analogue of constraint optimization. A DCOP is a problem in which a group of agents must choose, in a distributed manner, values for a set of variables such that the cost of a set of constraints over the variables is minimized.
Distributed Constraint Satisfaction is a framework for describing a problem in terms of constraints that are known and enforced by distinct participants (agents). The constraints are defined over variables with predefined domains, and the different agents must assign the same values to these variables.
Problems defined with this framework can be solved by any of the algorithms that are designed for it.
The framework was used under different names in the 1980s. The first known usage with the current name is in 1990.[citation needed]
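The main ingredients of a DCOP are agents and variables. Importantly, each variable is owned by an agent; this is what makes the problem distributed. Formally, a DCOP is a tuple $\langle A, V, \mathfrak{D}, f, \alpha, \eta \rangle$, where:
$A$ is the set of agents, $\{a_1, \dots, a_m\}$.
$V$ is the set of variables, $\{v_1, \dots, v_n\}$.
$\mathfrak{D}$ is the set of variable domains, $\{D_1, \dots, D_n\}$, where each $D_j$ is a finite set containing the possible values of variable $v_j$.
$f$ is the cost function, which maps every possible partial assignment to a cost.
$\alpha : V \to A$ is the ownership function, which maps each variable to the agent that owns it.
$\eta$ is the objective function, an operator that aggregates the individual costs given by $f$, usually by summation.
The objective of a DCOP is to have each agent assign values to its associated variables in order to either minimize or maximize $\eta(f)$ for a given assignment of the variables.
A value assignment is a pair $(v_j, d_j)$ where $d_j$ is an element of the domain $D_j$.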
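A partial assignment is a set of value assignments in which each variable appears at most once. It is also called a context. This can be thought of as a function $t$ mapping variables in the DCOP to their current values:
$$t(v_i) = \begin{cases} d \in D_i, & \text{if } v_i \text{ is assigned} \\ \emptyset, & \text{otherwise} \end{cases}$$
Given this representation, the domain (that is, the set of input values) of the cost function $f$ can be thought of as the set of all possible contexts for the DCOP. Therefore, in the remainder of this article we may use the notion of a context (i.e., the $t$ function) as an input to the $f$ function.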
A full assignment is an assignment in which each variable appears exactly once, that is, all variables are assigned. It is also called a solution to the DCOP.
An optimal solution is a full assignment in which the objective function is optimized (i.e., maximized or minimized, depending on the type of problem).
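The definitions above can be made concrete with a small sketch. The fragment below is illustrative only: the names `domains`, `owner`, `constraints`, `cost` and `optimal_solution` are assumptions introduced here, the objective is assumed to be the sum of constraint costs (minimized), and the exhaustive search is centralized rather than distributed.

```python
from itertools import product

# Hypothetical toy DCOP: three binary variables, each owned by a different agent.
domains = {"v1": [0, 1], "v2": [0, 1], "v3": [0, 1]}
owner = {"v1": "a1", "v2": "a2", "v3": "a3"}  # ownership function (informational here)

# Each constraint is a pair (scope, cost function over the values of the scope).
constraints = [
    (("v1", "v2"), lambda x, y: 2 if x == y else 0),
    (("v2", "v3"), lambda y, z: 1 if y != z else 0),
    (("v1",), lambda x: 1 if x == 1 else 0),
]

def cost(context):
    """Sum the costs of all constraints whose variables are all assigned in the context."""
    total = 0
    for scope, fn in constraints:
        if all(v in context for v in scope):
            total += fn(*(context[v] for v in scope))
    return total

def optimal_solution():
    """Enumerate every full assignment and return one with minimum cost."""
    names = list(domains)
    best = None
    for values in product(*(domains[n] for n in names)):
        full = dict(zip(names, values))
        if best is None or cost(full) < cost(best):
            best = full
    return best
```

Here a context is a plain dictionary from variable names to values; a partial assignment simply omits some keys, and a full assignment contains all of them.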
Various problems from different domains can be presented as DCOPs.
The graph coloring problem is as follows: given a graph $G = (N, E)$ and a set of colors $C$, assign each vertex $n \in N$ a color $c_n \in C$ such that the number of pairs of adjacent vertices with the same color is minimized.
As a DCOP, there is one agent per vertex that is assigned to decide the associated color. Each agent has a single variable whose associated domain is of cardinality $|C|$ (there is one domain value for each possible color). For each vertex $n_i \in N$, there is a variable $v_i \in V$ with domain $D_i = C$. For each pair of adjacent vertices $(n_i, n_j) \in E$, there is a constraint of cost 1 if both of the associated variables are assigned the same color:
$$\forall c \in C : f((v_i, c), (v_j, c)) = 1$$
The objective, then, is to minimize the sum of these costs.
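The encoding described above can be sketched in a few lines of Python. The example graph, the color list, and the function names `coloring_cost` and `best_coloring` are assumptions made for illustration; the search is a centralized brute force, not a distributed algorithm.

```python
from itertools import product

# Hypothetical example graph: a triangle (n1, n2, n3) plus a pendant vertex n4.
edges = [("n1", "n2"), ("n2", "n3"), ("n1", "n3"), ("n3", "n4")]
colors = ["red", "green"]
vertices = sorted({v for edge in edges for v in edge})

def coloring_cost(assignment):
    """Cost 1 for every edge whose endpoints are both assigned the same color."""
    return sum(
        1
        for u, v in edges
        if u in assignment and v in assignment and assignment[u] == assignment[v]
    )

def best_coloring():
    """Exhaustively try every full coloring and keep one of minimum cost."""
    candidates = (
        dict(zip(vertices, combo))
        for combo in product(colors, repeat=len(vertices))
    )
    return min(candidates, key=coloring_cost)
```

With only two colors the triangle cannot be colored without a conflict, so the minimum cost for this particular graph is 1.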
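The distributed multiple-knapsack variant of the knapsack problem is as follows: given a set of items of varying volume and a set of knapsacks of varying capacity, assign each item to a knapsack such that the amount of overflow is minimized. Let $I$ be the set of items, $K$ be the set of knapsacks, $s : I \to \mathbb{N}$ be a function mapping items to their volume, and $c : K \to \mathbb{N}$ be a function mapping knapsacks to their capacities.
To encode this problem as a DCOP, for each $i \in I$ create one variable $v_i \in V$ with associated domain $D_i = K$. Then for all possible contexts $t$:
$$f(t) = \sum_{k \in K} \begin{cases} 0, & r(t, k) \le c(k) \\ r(t, k) - c(k), & \text{otherwise} \end{cases}$$
where $r(t, k)$ is the total volume assigned by context $t$ to knapsack $k$:
$$r(t, k) = \sum_{v_i \in t^{-1}(k)} s(i)$$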
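The overflow cost described above can be sketched as follows. The item volumes, knapsack capacities, and the names `load` and `overflow_cost` are hypothetical values and helpers chosen for illustration; a context is represented as a dictionary from items to knapsacks.

```python
# Hypothetical instance: three items and two knapsacks.
volume = {"i1": 4, "i2": 3, "i3": 2}   # volume of each item
capacity = {"k1": 5, "k2": 3}          # capacity of each knapsack

def load(context, knapsack):
    """Total volume of the items that the context assigns to the given knapsack."""
    return sum(volume[item] for item, k in context.items() if k == knapsack)

def overflow_cost(context):
    """Sum, over all knapsacks, of the volume exceeding the knapsack's capacity."""
    return sum(max(0, load(context, k) - capacity[k]) for k in capacity)
```

For example, `overflow_cost({"i1": "k1", "i2": "k2", "i3": "k1"})` evaluates to 1, because knapsack `k1` holds a volume of 6 against a capacity of 5 while `k2` is exactly full.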
The item allocation problem is as follows. There are several items that have to be divided among several agents. Each agent has a different valuation for the items. The goal is to optimize some global goal, such as maximizing the sum of utilities or minimizing the envy. The item allocation problem can be formulated as a DCOP as follows.[2]
DCOP was applied to other problems, such as:
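DCOP algorithms can be classified in several ways:[3] by search strategy (for example, best-first search or depth-first branch-and-bound search), by synchronization among agents (synchronous or asynchronous), by communication among agents (for example, point-to-point with neighbors in the constraint graph), and by main communication topology (for example, a constraint tree).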
ADOPT, for example, uses best-first search, asynchronous synchronization, point-to-point communication between neighboring agents in the constraint graph, and a constraint tree as its main communication topology.
Algorithm Name | Year Introduced | Memory Complexity | Number of Messages | Correctness / Completeness | Implementations |
---|---|---|---|---|---|
ABT[citation needed] Asynchronous Backtracking | 1992 | [citation needed] | [citation needed] | Note: static ordering, complete | [citation needed] |
AWC[citation needed] Asynchronous Weak-Commitment | 1994 | [citation needed] | [citation needed] | Note: reordering, fast, complete (only with exponential space) | [citation needed] |
DBA Distributed Breakout Algorithm | 1995 | [citation needed] | [citation needed] | Note: incomplete but fast | FRODO version 1 |
SyncBB[4] Synchronous Branch and Bound | 1997 | [citation needed] | [citation needed] | Complete but slow | |
IDB Iterative Distributed Breakout | 1997 | [citation needed] | [citation needed] | Note: incomplete but fast | |
AAS[citation needed] Asynchronous Aggregation Search | 2000 | [citation needed] | [citation needed] | Aggregation of values in ABT | [citation needed] |
DFC[citation needed] Distributed Forward Chaining | 2000 | [citation needed] | [citation needed] | Note: low, comparable to ABT | [citation needed] |
ABTR[citation needed] Asynchronous Backtracking with Reordering | 2001 | [citation needed] | [citation needed] | Note: reordering in ABT with bounded nogoods | [citation needed] |
DMAC[citation needed] Maintaining Asynchronously Consistencies | 2001 | [citation needed] | [citation needed] | Note: the fastest algorithm | [citation needed] |
Secure Computation with Semi-Trusted Servers[citation needed] | 2002 | [citation needed] | [citation needed] | Note: security increases with the number of trustworthy servers | [citation needed] |
Secure Multiparty Computation For Solving DisCSPs (MPC-DisCSP1-MPC-DisCSP4)[citation needed] | 2003 | [citation needed] | [citation needed] | Note: secure if 1/2 of the participants are trustworthy | [citation needed] |
Adopt Asynchronous Distributed Optimization[5] | 2003 | Polynomial (or any-space[6]) | Exponential | Proven | Reference implementation: Adopt (archived 2006-09-16 at the Wayback Machine) |
OptAPO Asynchronous Partial Overlay[7] | 2004 | Polynomial | Exponential | Proven, but proof of completeness has been challenged[8] | Reference implementation: OptAPO, Artificial Intelligence Center, SRI International (archived from the original on 2007-07-15) |
DPOP Distributed Pseudotree Optimization Procedure[9] | 2005 | Exponential | Linear | Proven | Reference implementation: FRODO (AGPL) |
NCBB No-Commitment Branch and Bound[10] | 2006 | Polynomial (or any-space[11]) | Exponential | Proven | Reference implementation: not publicly released |
CFL Communication-Free Learning[12] | 2013 | Linear | None. Note: no messages are sent, but assumes knowledge about satisfaction of local constraint | Incomplete | |
Hybrids of these DCOP algorithms also exist. BnB-Adopt,[3] for example, changes the search strategy of Adopt from best-first search to depth-first branch-and-bound search.
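To illustrate the depth-first branch-and-bound search strategy mentioned above, the following is a minimal centralized sketch. It is not BnB-Adopt or any of the distributed algorithms in the table, which exchange messages between agents; the function name `branch_and_bound`, the constraint representation, and the toy problem are assumptions introduced here. The pruning rule assumes that all constraint costs are non-negative.

```python
def branch_and_bound(domains, constraints):
    """Depth-first branch-and-bound over a fixed variable order (minimization).

    Assumes non-negative costs, so the cost of a partial assignment is a
    lower bound on the cost of any of its extensions.
    """
    order = list(domains)
    best = {"cost": float("inf"), "assignment": None}

    def partial_cost(context):
        # Only constraints whose whole scope is assigned contribute.
        return sum(
            fn(*(context[v] for v in scope))
            for scope, fn in constraints
            if all(v in context for v in scope)
        )

    def search(index, context):
        bound = partial_cost(context)
        if bound >= best["cost"]:
            return  # prune: this branch cannot beat the best solution found so far
        if index == len(order):
            best["cost"], best["assignment"] = bound, dict(context)
            return
        var = order[index]
        for value in domains[var]:
            context[var] = value
            search(index + 1, context)
            del context[var]

    search(0, {})
    return best["assignment"], best["cost"]


# Hypothetical toy problem used only to exercise the sketch.
toy_domains = {"v1": [0, 1], "v2": [0, 1], "v3": [0, 1]}
toy_constraints = [
    (("v1", "v2"), lambda x, y: 2 if x == y else 0),
    (("v2", "v3"), lambda y, z: 1 if y != z else 0),
    (("v1",), lambda x: 1 if x == 1 else 0),
]
```

A best-first strategy would instead always expand the partial assignment with the lowest bound, which is the kind of change in search order the text describes.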
An asymmetric DCOP is an extension of DCOP in which the cost of each constraint may be different for different agents. Some example applications are:[13]
One way to represent an ADCOP is to represent the constraints as functions:
$$f_C : D_1 \times \cdots \times D_k \to \mathbb{R}^k$$
Here, for each constraint there is not a single cost but a vector of costs, one for each agent involved in the constraint. The vector of costs is of length $k$ if each variable belongs to a different agent; if two or more variables belong to the same agent, then the vector of costs is shorter, since there is a single cost for each involved agent, not for each variable.
A simple way to solve an ADCOP is to replace each constraint $f_C : D_1 \times \cdots \times D_k \to \mathbb{R}^k$ with a constraint $f_C' : D_1 \times \cdots \times D_k \to \mathbb{R}$, which equals the sum of the functions $f_C^1 + \cdots + f_C^k$. However, this solution requires the agents to reveal their cost functions. Often, this is not desired due to privacy considerations.[14][15][16]
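The summation approach described above can be sketched as a small wrapper. The example constraint `asymmetric_cost` and the helper `symmetrize` are hypothetical names used for illustration; the sketch assumes a constraint is a function returning one cost per involved agent.

```python
def asymmetric_cost(x, y):
    """Hypothetical asymmetric constraint: returns (cost to agent 1, cost to agent 2)."""
    return (3, 0) if x == y else (1, 1)

def symmetrize(vector_constraint):
    """Turn a vector-valued constraint into a single-cost constraint by summing the per-agent costs."""
    def summed(*values):
        return sum(vector_constraint(*values))
    return summed

symmetric_cost = symmetrize(asymmetric_cost)
```

Note that whoever evaluates `symmetric_cost` needs access to every agent's cost, which is exactly the privacy concern raised above.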
Another approach is called Private Events as Variables (PEAV).[17] In this approach, each agent owns, in addition to its own variables, "mirror variables" of all the variables owned by its neighbors in the constraint network. There are additional constraints (with a cost of infinity) that guarantee that the mirror variables equal the original variables. The disadvantage of this method is that the number of variables and constraints is much larger than in the original problem, which leads to a higher run-time.
A third approach is to adapt existing algorithms, developed for DCOPs, to the ADCOP framework. This has been done for both complete-search algorithms and local-search algorithms.[13]
The structure of an ADCOP problem is similar to the game-theoretic concept of a simultaneous game. In both cases, there are agents who control variables (in game theory, the variables are the agents' possible actions or strategies). In both cases, each choice of variables by the different agents results in a different payoff to each agent. However, there is a fundamental difference:[13]
There are some intermediate models in which the agents are partially cooperative: they are willing to decrease their utility to help the global goal, but only if their own cost is not too high. Employees in a firm are an example of partially cooperative agents. On one hand, each employee wants to maximize their own utility; on the other hand, they also want to contribute to the success of the firm. Therefore, they are willing to help others or do some other time-consuming tasks that help the firm, as long as it is not too burdensome on them. Some models for partially cooperative agents are:[18]
Solving such partial-cooperation ADCOPs requires adaptations of ADCOP algorithms.[18]