Thinking in terms of constraints

Lion Ralfs — Posted on

Anna, Bob and Carl cut a pizza into 8 slices. Anna wants to eat at least 1 slice, but no more than 5. Bob wants to eat less than Anna, and Carl can't eat more than 2 slices. There can't be any leftover pizza.

Think about how you would solve this by writing a program for it.

Being good at your job often means picking the right tool for it, which means it is good to have a wide variety of tools to choose from. I'm going to introduce a new tool to your repertoire in case you haven't heard about CSPs. I recently learned about CSPs and I wanted to write about them since they taught me to think about expressing a certain set of problems in a specific way. In my opinion, CSPs are just another way of modeling problems in software, the same way we think about selecting an architecture or picking the right algorithms and data structures.

The idea

CSP stands for Constraint Satisfaction Problem, which means we're thinking about expressing a problem in terms of constraints which have to be satisfied. A CSP consists of:

Variables are usually labelled using uppercase letters (such as A,B,CA, B, C). Each variable can take certain values, the set of values a variable can take is called its domain. If variable AA can take the values {1,2,3}{1, 2, 3}, we write dom(A)={1,2,3}dom(A) = {1, 2, 3}.

In our pizza example, the variables would be AA, BB and CC (for Anna, Bob and Carl). Without taking into account the specific rules, each of them can theoretically eat between 0 and 8 slices. Hence, their domains are:

Constraints are rules about the assignments of values to the variables, imagine them as a relationship between one or more variables. They could be unary relations, meaning that a variable can be in relation to itself. Each constraint also has a scope, a set which includes all variables that are involved in the constraint. Constraints can be evaluated using assignments of values to the variables in the constaint's scope. They always result in either true\text{true} or false\text{false}.[2]

In short, constraints are the rules about how our variables behave. In the pizza example, Anna wants to eat between 1 and 5 slices. We can express this using the following constraint: (1A5)(1 \leq A \leq 5). Its scope is the set of variables that are involved, so just AA. This is called a unary constraint, because there is only one variable involved. We could now try to assign values to AA from its domain and see if the constraint still holds. Obviously for such a simple example, we can see that assigning 0 or any number higher than 5 to AA results in the constraint evaluating to false\text{false}.

Modelling the rest of the constraints leads to the following constraints:

Notice how all three variables are involved in the last constraint, which we need to make sure that exactly 8 slices of pizza are consumed.

Constraint networks

CSPs can be visualized using constraint networks, which are graphs with a node for each variable and each constraint. They also have arcs (edges) between each variable and the constraints it is involved in. The variable identifiers are drawn using circles while constraints are represented using rectangles.[3]

If we were to turn our pizza problem into a constraint network, it would look like this:

AABBCC(B<A)(B \lt A)(C2)(C \leq 2)(1A5)(1 \leq A \leq 5)(A+B+C=8)(A+B+C=8)

Anna, Bob and Carl (our variables) are represented as circles, and all of the constraints we defined above are drawn using rectangles. There is a line between each variable and each constraint they participate in.

Finding solutions

So far, we are no closer to a solution of the problem, but we've modelled and visualized a constraint satisfaction problem. As it turns out, solving CSPs is not so easy. Luckily, some really smart people figured out algorithms that arrive at a solution reasonably fast. In 1977, Alan Mackworth developed the arc consistency (AC-3) algorithm[4], which operates on a constraint network.

The algorithm looks at arcs, which are the edges between the variables and constraints, and introduces the concept of arc consistency. An arc is consistent if every value in the domain of the variable still allows the constraint to be satisfied. The algorithm aims to make the entire network consistent by making every arc consistent.

The way that works is best explained with our pizza example. Take for example the arc A,(1A5)\langle A, (1 \leq A \leq 5) \rangle with dom(A)={0,,8}dom(A) = {0, \dots, 8}. It is not consistent, since some values violate the constraint (namely: 0, 6, 7, 8). The arc we're currently looking at is highlighted in red:

AABBCC(B<A)(B \lt A)(C2)(C \leq 2)(1A5)(1 \leq A \leq 5)(A+B+C=8)(A+B+C=8)

The algorithm then prunes these values from the domain of AA. However, this can cause other arcs that might have been arc consistent before to lose the consistency property. The algorithm mitigates that by re-checking all other constraints that AA is involved in, and marking all arcs between that constraint and all other variables other than AA in its scope as potentially "non consistent". They are highlighted in red:

AABBCC(B<A)(B \lt A)(C2)(C \leq 2)(1A5)(1 \leq A \leq 5)(A+B+C=8)(A+B+C=8)

It starts off with all arcs marked as "not consistent". Internally, it keeps a todo list of arcs to process and it terminates when all arcs are consistent. Feel free to check out the full algorithm yourself.

There are three different scenarios that might occur when the algorithm terminates:

Especially the last case might make it seem pointless to use the arc consistency algorithm in the first place, but consider the consistency of the constraint network as a tremendous advantage, due to how many values could have potentially been pruned by a lot. There are certainly other algorithms which can be applied afterwards that can lead to a definite answer.

Conclusion

Even without answering the pizza question, I hope you learned a bit about using constraints to model and solve certain problems that you encounter. There are a whole lot of other algorithms that are built on this idea of constraint satisfaction, but that's more than I wanted to write about. I can strongly recommend Poole and Mackworth's Artificial Intelligence: Foundations of Computational Agents, 2nd Edition if I've sparked your interest. Their entire book is available online.

References