To help make this discussion concrete, we consider simple examples of phase transitions in search. Perhaps the simplest consists of the size of the search tree arising during a depth-first backtrack search as a function of how well the local search heuristic is able to prune unproductive search choices [27]. Most of the cost in such searches is often due to attempts to recover from some early incorrect choices that preclude any possibility of a solution. In these situations, the time spent backtracking among unproductive additional choices before returning to the early choices made in the search is determined by how well the search heuristic is able to prune unproductive choices from consideration.

This problem can be formalized by supposing that the problem
consists of a series of *d* decisions
to be made and, at each decision point in the search, there are
*b* choices to consider. This gives a
search tree of depth *d* and branching
ratio *b*. For simplicity, we consider
the case where prior search choices eliminate all possible solutions to
the problem, or the problem itself has no solutions. In addition, we
suppose there is some way to identify unproductive sets of choices,
perhaps because they violate some constraints or through some domain
knowledge incorporated into a search heuristic. A simple model is to
assume that this identification eliminates each unproductive choice
independently and with probability . At the extremes, corresponds to perfect knowledge and
to a completely ineffective heuristic. This gives an
effective average branching ratio for the search of . This specification of a heuristic search problem
amounts to defining a simple ensemble of random problems, whose
properties can be easily evaluated exactly.

Completing the analysis of this problem requires identifying some
global property of interest and then relating it to the local problem
parameters (*d*,
*b* and
*p* in this case). An important global
property is the search cost given by the number of nodes visited during
the search before concluding there is no solution. For a particular
search instance, the cost to search at and below a given node
*n* in the search tree is given by

where is the k child of node
*n* and if this child is not pruned and 0 otherwise. A node
at depth *d* has no children and hence
a cost of 1. The overall cost is just the value starting from the root
of the search tree.

The value of the search cost will vary among the instances in the
ensemble. One way to characterize the typical behavior is through the
average value of the search cost. In this model, the behavior depends
only on the depth of the node in the tree. Combined with the independent
pruning choices in this model, we have where is the expected cost of a node at depth
*j*. The expected cost of the entire
search, starting from the root, is then

This simple analytic function relates the local heuristic effectiveness
(through the value of *z*) to the
global cost, on average. However, in the limit of infinitely large
problems (i.e., ) we find very different asymptotic behaviors:

indicating an abrupt change in character at the transition point .

Because the individual search instances have different costs, another important aspect of this analysis is to determine how representative the average value is. This can be done by comparing the variance in costs to the average. Starting with

we again get dependence only on depth in the tree so that

and hence (because )

with . Thus for the variance we have

Now so we obtain

Asymptotically, the relative variance is

The relative deviation is just the square root of this quantity. This exhibits a peak at the transition point, as illustrated in Fig. 1. Large fluctuations at the transition point are a typical characteristic of phase transitions. As a consequence, attempts to evaluate behaviors near transitions with simulation experiments often require many more samples to accurately sample the typical range of behaviors than for parameter values far from transition points. This figure also illustrates how the singular nature of the transition in the limit of infinitely large problems arises out of the relatively smooth behavior for small sizes.

**Figure 1:** Relative deviation *r* in the search cost as a function of the effective pruning *z* for b=3. The curves show the behavior for problems of size 20 (gray), 50 (dashed) and 100 (solid), and are indistinguishable except near the transition. Note the increasingly large fluctuations near the transition point at z=1.

In terms of constraint satisfaction problems, *z*;*SPMlt*;1 corresponds to very highly overconstrained problems
so that incorrect choices are pruned almost immediately. This simple
example of a phase transition in a search problem can be extended to
more complex situations where the number of solutions varies or the
heuristic pruning effectiveness varies with depth [55]. As long as the various search choices
are assumed to occur independently, exact results are relatively
straightforward to obtain.

Thu May 16 15:45:43 PDT 1996