State Model \(\mathcal{S}(P)\):
A solution is a sequence of applicable actions that maps \(s_0\) into \(S_G\), and it is optimal if it minimizes the sum of action costs (e.g., the number of steps)
Different models and controllers obtained by relaxing assumptions in blue …
Search algorithms for planning exploit the correspondence between classical state models \(\mathcal{S}(P)\) and directed graphs:
In the planning as heuristic search formulation, the problem \(P\) is solved by path-finding algorithms over the graph associated with model \(\mathcal{S}(P)\)
Blind search vs. heuristic (or informed) search:
Systematic search vs. local search:
Systematic search algorithms: Consider a large number of search nodes simultaneously; they maintain an explicit frontier or open list of states still to explore, and often also a closed list of visited states.
Local search algorithms: Usually keep just one current state (or a small number), and repeatedly move to a neighbouring state that seems better according to an evaluation function.
This is not a black-and-white distinction; there are crossbreeds (e.g., enforced hill-climbing).
Blind search vs. heuristic search:
Systematic search vs. local search:
We cover a subset of search algorithms most successful in planning. Only some blind search algorithms are covered (refer to Russell & Norvig, Chapters 3 and 4).
Search Space for Classical Search
A classical search space is defined by the following three operations:
Search states \(\neq\) world states?
Yes, search states = world states.
No, search states \(\neq\) world states, in fact search states = sets of world states represented as conjunctive sub-goals.
We consider progression in the entire course, unless explicitly stated otherwise.
We use ‘\(s\)’ to denote world and search states interchangeably
What is in a search node?
Different search algorithms store different information in a search node \(\sigma\), but typical information includes:
For the root node, \(\text{parent}(\sigma)\) and \(\text{action}(\sigma)\) are undefined.
Guarantees:
Computational Complexity:
Typical state space features governing complexity:
Blind search does not require any input beyond the problem.
Pros and Cons?
Pros: No additional work for the programmer.
Cons: It’s not called “blind” for nothing … it uses the same expansion order regardless of what the problem actually is; it is rarely effective in practice.
Compare with informed search, which requires as additional input a heuristic function \(h\) (covered in the next module) that maps states to estimates of their goal distance.
Pros and Cons?
Pro: Typically more effective in practice.
Con: Requires coming up with, and implementing, \({\color{blue}h}\).
In classical planning, \(h\) is generated automatically from the declarative problem description.
Blind search strategies we’ll cover:
Breadth-first search. Advantage: time complexity.
Variant: Uniform cost search.
Depth-first search. Advantage: space complexity.
Iterative deepening search. Combines advantages of breadth-first search and depth-first search. Uses depth-limited search as a sub-procedure.
Width-based search, in particular Iterated Width (IW)
Blind search strategy we won’t cover:
Strategy: Expand nodes in the order they were produced (FIFO frontier).
Guarantees?: A) Complete and optimal B) Complete but may not be optimal C) Optimal but may not be complete D) Neither complete nor optimal
Say that \(b\) is the maximal branching factor, and \(d\) is the goal depth (depth of the shallowest goal state).
What is the upper bound on the number of generated nodes?
Generated nodes across all layers: \(b + b^{2} + b^{3} + \cdots + b^{d}\); in the worst case, the algorithm generates all nodes in the first \(d\) layers.
So the time complexity is \(O(b^{d})\).
And what if we were to apply the goal test at node-expansion time, rather than node-generation time?
Space Complexity: Same as time complexity, since all generated nodes are kept in memory.
Settings: \(b = 10\); \(10{,}000\) nodes/second; \(1{,}000\) bytes/node.
Inserting these values into the complexity bounds above yields:
| Depth | Nodes | Time | Memory |
|---|---|---|---|
| 2 | 110 | 0.11 ms | 107 KB |
| 4 | 11,110 | 11 ms | 10.6 MB |
| 6 | \(10^{6}\) | 1.1 s | 1 GB |
| 8 | \(10^{8}\) | 2 min | 103 GB |
| 10 | \(10^{10}\) | 3 h | 10 TB |
| 12 | \(10^{12}\) | 13 days | 1 PB |
| 14 | \(10^{14}\) | 3.5 years | 99 PB |
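The figures in this table can be reproduced with a short script; it is a minimal sketch using the settings stated above (\(b = 10\), 10,000 nodes/second, 1,000 bytes/node) and the node-count sum \(b + b^{2} + \cdots + b^{d}\).

```python
# Reproduce the breadth-first search resource table, assuming the
# settings above: b = 10, 10,000 nodes/second, 1,000 bytes/node.
b, nodes_per_sec, bytes_per_node = 10, 10_000, 1_000

for d in range(2, 15, 2):
    nodes = sum(b**i for i in range(1, d + 1))  # b + b^2 + ... + b^d
    seconds = nodes / nodes_per_sec
    memory = nodes * bytes_per_node
    print(f"depth {d:2d}: {nodes:>16,d} nodes, {seconds:14.2f} s, {memory:,d} bytes")
```

The printed values match the table (up to rounding and unit conversion).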
Which is the worse problem - time or memory?
Strategy: Expand the most recent nodes, LIFO frontier (left to right, top to bottom)
Illustration: Nodes at depth 3 are assumed to have no successors
Guarantees?
A) Complete and optimal
B) Complete but may not be optimal
C) Optimal but may not be complete
D) Neither complete nor optimal
Optimality? No. After all, the algorithm just chooses some direction and hopes for the best (Depth-first search is a way of “hoping to get lucky”).
Completeness? No, because search branches may be infinitely long — there is no cycle check along a branch.
Depth-first search is complete when the state space is acyclic, e.g. in constraint satisfaction problems. If we add cycle checking, it becomes complete for finite state spaces.*
Complexity?
Space:
Time:
If there are paths of length \(m\) in the state space, up to \(O(b^{m})\) nodes can be generated; this can happen even if a solution exists at depth \(1\).
“Iterative Deepening Search = Keep doing the same work over again until you find a solution”
Guarantees:
Optimality? - Yes, for uniform costs.
Completeness? - Yes.
Time Complexity?
| Breadth-First Search | \(b + b^{2} + \cdots + b^{d-1} + b^{d} \in {\color{blue}O(b^d)}\) |
| Iterative Deepening Search | \((d)b + (d-1)b^{2} + \cdots + 3b^{d-2} + 2b^{d-1} + 1b^{d} \in {\color{blue}O(b^d)}\) |
Example: \(b = 10,\ d = 5\)
| Breadth-First Search | \(10 + 100 + 1{,}000 + 10{,}000 + 100{,}000 = 111{,}110\) |
| Iterative Deepening Search | \(50 + 400 + 3{,}000 + 20{,}000 + 100{,}000 = 123{,}450\) |
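Both sums can be checked mechanically; here is a small sketch (the function names are ours, for illustration):

```python
# Worst-case generated-node counts for breadth-first search and
# iterative deepening search, as in the example above (b = 10, d = 5).
def bfs_nodes(b, d):
    return sum(b**i for i in range(1, d + 1))

def ids_nodes(b, d):
    # A node at depth i is regenerated in every iteration whose depth
    # limit is >= i, i.e. (d - i + 1) times.
    return sum((d - i + 1) * b**i for i in range(1, d + 1))

print(bfs_nodes(10, 5))  # 111110
print(ids_nodes(10, 5))  # 123450
```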
IDS combines the advantages of breadth-first and depth-first search. It is the preferred blind search method in large state spaces with unknown solution depth.
Planning is computationally complex in the worst case
Question:
Can we explain why planners perform well?
The complexity of planning being exponential in problem width goes a long way towards explaining problem difficulty:
Limitations of serialisation?
Definition: Novelty
The novelty \(w(s)\) of a state \(s\) is the size of the smallest subset of atoms (boolean variables or facts) in \(s\) that is true for the first time in the search.
Algorithm
Properties
\(IW(k)\) expands at most \(O(n^k)\) states, where \(n\) is the number of atoms.
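The novelty test and the resulting pruned breadth-first search can be sketched in a few lines. This is a minimal illustration, assuming states are represented as frozensets of atoms and that `succ` and `is_goal` are supplied by the caller; it is not an optimised implementation.

```python
from collections import deque
from itertools import combinations

def iw(k, init, succ, is_goal):
    """Sketch of IW(k): breadth-first search that prunes every state
    whose novelty exceeds k."""
    seen = set()  # atom tuples of size <= k seen so far in the search

    def novel(state):
        # A state is novel iff it makes some tuple of at most k atoms
        # true for the first time in the search.
        new = [t for i in range(1, k + 1)
               for t in combinations(sorted(state), i) if t not in seen]
        seen.update(new)
        return bool(new)

    queue = deque([init])
    novel(init)  # register the initial state's tuples
    while queue:
        state = queue.popleft()
        if is_goal(state):
            return state
        for s2 in succ(state):
            if novel(s2):  # prune states with novelty > k
                queue.append(s2)
    return None
```

Since at most \(O(n^k)\) tuples exist, at most \(O(n^k)\) states pass the novelty test, which gives the bound above.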
Our research group tested benchmark problems from domains of previous International Planning Competitions (IPCs).
For each instance with \(N\) goal atoms, we created \(N\) instances with a single goal.
IPC results are remarkably good:
| # Instances | \(IW\) | \(ID\) | \(BRFS\) | \(GBFS + h_{add}\) |
|---|---|---|---|---|
| 37921 | 91% | 24% | 23% | 91% |
Properties
For problems \(\Pi \in \mathcal{P}\) where \(width(\Pi)=k\):
Theorem
Blocks, Logistics, Gripper, and \(n\)-puzzle have a bounded width independent of problem size and initial situation, provided that goals are single atoms.
In practice, \(IW(k \le 2)\) solves 88.3% of IPC problems with single goals:
| # Instances | \(k=1\) | \(k=2\) | \(k>2\) | Total |
|---|---|---|---|---|
| 37921 | 37.0% | 51.3% | 11.7% | 88.3% |
Primary question: \(IW\) solves atomic (single atom) goals — how do we extend the blind procedure to multiple atomic goals?
A simple way to use \(IW\) for solving real benchmarks \(P\) with joint goals is a form of hill climbing over the goal set \(G\), with \(|G| = n\)
\(SIW\) uses \(IW\) for both decomposing a problem into subproblems and solving subproblems.
It’s a blind search procedure, as \(IW\) does not even know the next goal \(G_i\) to achieve.
Blind \(SIW\) is better than \(GBFS\)
\(IW\): is essentially a sequence of novelty-based pruned breadth-first searches
\(SIW\): is essentially \(IW\) serialised, used to attain top goals one-by-one
Heuristic search algorithms are the most common and overall most successful algorithms for classical planning.
Greedy best-first search.
Weighted A*.
A*.
IDA*, depth-first branch-and-bound search, breadth-first heuristic search, …
Hill-climbing.
Enforced hill-climbing.
Other algorithms include beam search, tabu search, genetic algorithms, simulated annealing, etc.
A heuristic function \(h\) estimates the cost of an optimal path to the goal
Search gives a preference to explore states with small \(h\).
Heuristic searches require a heuristic function to estimate remaining cost
Definition: Heuristic Function
Let \(\Pi\) be a planning problem with state space \(\Theta_\Pi\).
A heuristic function, or heuristic, for \(\Pi\) is a function \(h : S \to \mathbb{R}^+_0 \cup \{\infty\}\)
Its value \(h(s)\) for state \(s\) is referred to as the state’s heuristic value, or just \(h\)-value
Definition: Remaining Cost, \(h^{*}\)
Let \(\Pi\) be a planning problem with state space \(\Theta_\Pi\).
For a state \(s \in S\), the state’s remaining cost is the cost of an optimal plan for \({\color{blue}s}\), or \({\color{blue}\infty}\) if there exists no plan for \(s\).
The perfect heuristic for \(\Pi\), written \(\color{blue}{h^{*}}\), assigns every \(s \in S\) its remaining cost as the heuristic value.
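For small, explicitly given state spaces, \(h^{*}\) can be computed exactly by running Dijkstra's algorithm backwards from the goal states over the reversed transitions. A sketch under that assumption (the interface, with `succ` yielding cost-state pairs, is ours for illustration):

```python
import heapq

def perfect_heuristic(states, succ, goal_states):
    """Compute h*(s) for every state by backward Dijkstra from the
    goal states. Unreachable states get h* = infinity."""
    # Build reversed edges: s' -> list of (cost, s)
    rev = {s: [] for s in states}
    for s in states:
        for c, s2 in succ(s):
            rev[s2].append((c, s))
    hstar = {s: float("inf") for s in states}
    heap = [(0.0, g) for g in goal_states]
    for g in goal_states:
        hstar[g] = 0.0
    while heap:
        d, s = heapq.heappop(heap)
        if d > hstar[s]:            # stale queue entry
            continue
        for c, s2 in rev[s]:
            if d + c < hstar[s2]:
                hstar[s2] = d + c
                heapq.heappush(heap, (d + c, s2))
    return hstar
```

In planning, of course, the state space is exponentially large, so \(h^{*}\) is not computed this way in practice; heuristics approximate it instead.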
What does it mean to estimate remaining cost?
For many heuristic search algorithms, \(h\) does not need to have any particular properties for the algorithm to work (i.e., to be correct and complete).
Search performance depends crucially on how well \(h\) reflects \(h^{*}\)
For some search algorithms, such as \(A^{*}\), the relationship between formal quality properties of \(h\) and search efficiency (number of expanded nodes) can be proven.
For other search algorithms, “it works well in practice” is often as good an analysis as one gets.
In a later module, we will analyse in detail approximations to one particularly important heuristic function in planning: \(h^+\).
Are there other properties of \(h\) that search performance crucially depends on?
What about edge cases?
Trade-off: informedness versus computational overhead
A successful heuristic search requires a good trade-off between \(h\)’s informedness and the computational overhead of computing it.
Definition: Heuristic Function Properties
Let \(\Pi\) be a planning problem with state space \(\Theta_\Pi=(S,L,c,T,I,S^G)\), and let \(h\) be a heuristic for \(\Pi\).
What is the relationship between these properties?
Greedy Best-First Search (with duplicate detection)
\(open :=\) new priority queue ordered by ascending \(h(state(\sigma))\)
\(open.\text{insert(make-root-node}(init()))\)
\(closed := \emptyset\)
while not \(open.empty()\):
\(\sigma := open.\text{pop-min}()\) /* get best state */
if \(state(\sigma) \notin closed\): /* check for duplicates */
\(closed := closed \cup \{state(\sigma)\}\) /* add state to closed set */
if \(is\text{-}goal(state(\sigma))\): return \(\text{extract-solution}(\sigma)\)
for each \((a,s') \in succ(state(\sigma))\): /* expand state */
\(\sigma' := \text{make-node}(\sigma,a,s')\)
if \(h(state(\sigma')) < \infty\): \(open.\text{insert}(\sigma')\)
return unsolvable
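The pseudocode above translates almost line by line into Python. This is an illustrative sketch, not a fixed API: `succ` and `is_goal` are assumed to be supplied by the caller, and nodes carry the plan so far instead of parent pointers.

```python
import heapq
from itertools import count

def greedy_best_first(init, succ, is_goal, h):
    """Sketch of greedy best-first search with duplicate detection.
    succ(s) yields (action, successor-state) pairs."""
    tie = count()                                 # FIFO tie-breaking on equal h
    open_list = [(h(init), next(tie), init, [])]  # (h-value, tie, state, plan)
    closed = set()
    while open_list:
        _, _, s, plan = heapq.heappop(open_list)  # get best state
        if s in closed:                           # check for duplicates
            continue
        closed.add(s)
        if is_goal(s):
            return plan
        for a, s2 in succ(s):                     # expand state
            if h(s2) < float("inf"):              # prune recognised dead ends
                heapq.heappush(open_list, (h(s2), next(tie), s2, plan + [a]))
    return None                                   # unsolvable
```

Note that the ordering uses \(h\) only; replacing the priority with \(g + h\) turns this into \(A^{*}\) (modulo re-opening).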
Completeness?
Optimality?
No. Even for perfect heuristics!
Invariant under all strictly monotonic transformations of \(h\) (e.g., scaling by a positive constant or adding a constant).
Depending on where you do duplicate detection in the greedy best-first search (GBFS) loop, GBFS can appear as an \(A^{*}\) variant, as follows:
\(A^*\) (with duplicate detection and re-opening)
\(\textit{open} :=\) new priority queue ordered by ascending \(g(state(\sigma)) + h(state(\sigma))\)
\(\textit{open}\text{.insert(make-root-node}({\color{blue}\text{init}()}))\)
\(closed := \emptyset\)
\(best\text{-}g := \emptyset\) /* maps states to non-negative real numbers */
while not \(\textit{open}.\text{empty}()\):
\(\sigma := open.\text{pop-min}()\)
if \(state(\sigma) \notin closed\) or \(g(\sigma) < best\text{-}g(state(\sigma))\):
/* Check duplicates: re-open if better \(g\) (note that all \(\sigma'\) with same state but worse \(g\)
are behind \(\sigma\) in \(open\), and will be skipped when their turn comes) */
\(closed := closed \cup \{state(\sigma)\}\)
\(best\text{-}g(state(\sigma)) := g(\sigma)\)
if \({\color{blue}\text{is-goal}}(state(\sigma))\): return \(extract\text{-}solution(\sigma)\)
for each \((a,s') \in {\color{blue}\text{succ}}(state(\sigma))\):
\(\sigma' := \text{make-node}(\sigma,a,s')\)
if \(h(state(\sigma')) < \infty\): \(open.insert(\sigma')\)
return unsolvable
\(f\)-value of a state: defined by \(f(s) = g(s) + h(s)\).
Generated nodes: Nodes inserted into \(open\) priority queue at some point.
Expanded nodes: Nodes \(\sigma\) popped from the \(open\) priority queue, for which the test against the \(closed\) set and \(best\text{-}g\) succeeds.
Re-expanded nodes: Expanded nodes for which \(state(\sigma) \in closed\) upon expansion (also called re-opened nodes).
Completeness?
Optimal?
Question
If we set \(h(s) := 0\) for all \(s\), what does \(A^{*}\) become?
(A) Breadth-first search
(B) Depth-first search
(C) Uniform-cost search
(D) Depth-limited search
Recall that uniform-cost search is essentially Dijkstra (a best-first search that always expands the frontier node with the lowest path cost so far).
Weighted \(A^*\) (with duplicate detection and re-opening)
\(open :=\) new priority queue ordered by ascending \(g(state(\sigma)) + {\color{blue}W}\, h(state(\sigma))\)
\(open.\text{insert(make-root-node}({\color{blue}init()}))\)
\(closed := \emptyset\)
\(best\text{-}g := \emptyset\)
while not \(open.\text{empty}()\):
\(\sigma := open.\text{pop-min}()\)
if \(state(\sigma) \notin closed\) or \(g(\sigma) < best\text{-}g(state(\sigma))\):
\(closed := closed \cup \{state(\sigma)\}\)
\(best\text{-}g(state(\sigma)) := g(\sigma)\)
if \(is\text{-}goal(state(\sigma))\): return \(\text{extract-solution}(\sigma)\)
for each \((a,s') \in succ(state(\sigma))\):
\(\sigma' := \text{make-node}(\sigma,a,s')\)
if \(h(state(\sigma')) < \infty\): \(open.\text{insert}(\sigma')\)
return unsolvable
The weight \(W \in \mathbb{R}^+_0\) is an algorithm parameter:
For \(W > 1\), weighted \(A^*\) is bounded suboptimal\(^*\)
\({\color{blue}^*}\)Bounded suboptimal means that the algorithm may return a non-optimal solution, and there’s a proven bound on how much worse it can be than optimal.
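A single routine can cover uniform-cost search (\(W = 0\)), \(A^{*}\) (\(W = 1\)), and weighted \(A^{*}\) (\(W > 1\)), following the pseudocode above. A sketch under assumed interfaces (`succ` yielding action, cost, successor triples is our convention for illustration):

```python
import heapq
from itertools import count

def weighted_astar(init, succ, is_goal, h, W=1.0):
    """Sketch of weighted A* with duplicate detection and re-opening.
    succ(s) yields (action, cost, successor-state) triples."""
    tie = count()
    open_list = [(W * h(init), next(tie), 0.0, init, [])]  # (f, tie, g, s, plan)
    best_g = {}
    while open_list:
        _, _, g, s, plan = heapq.heappop(open_list)
        if s in best_g and best_g[s] <= g:   # duplicate check; re-open on better g
            continue
        best_g[s] = g
        if is_goal(s):
            return g, plan
        for a, c, s2 in succ(s):
            if h(s2) < float("inf"):
                g2 = g + c
                heapq.heappush(open_list,
                               (g2 + W * h(s2), next(tie), g2, s2, plan + [a]))
    return None  # unsolvable
```

With an admissible \(h\) and \(W > 1\), the returned cost can exceed the optimum, but never by more than a factor of \(W\), which is the bounded suboptimality described above.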
Hill-Climbing
\(\sigma := make\text{-}root\text{-}node(init())\)
forever:
if \(is\text{-}goal(state(\sigma))\):
return \(extract\text{-}solution(\sigma)\)
\(\Sigma' := \{\, make\text{-}node(\sigma,a,s') \mid (a,s') \in {\color{blue}succ}(state(\sigma)) \,\}\)
\(\sigma :=\) choose an element of \(\Sigma'\) minimising \(h\) \({\color{blue}\text{/* random tie breaking */}}\)
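The loop above can be sketched directly in Python. One assumption is added that the pseudocode leaves implicit: a step bound, since the plain `forever` loop need not terminate.

```python
import random

def hill_climbing(init, succ, is_goal, h, max_steps=1000):
    """Sketch of hill-climbing: always move to a successor minimising h,
    breaking ties randomly. The step bound is our addition, since the
    loop may otherwise cycle forever."""
    s, plan = init, []
    for _ in range(max_steps):
        if is_goal(s):
            return plan
        successors = list(succ(s))               # (action, state) pairs
        if not successors:
            return None                          # dead end
        best = min(h(s2) for _, s2 in successors)
        a, s2 = random.choice(                   # random tie breaking
            [(a, s2) for a, s2 in successors if h(s2) == best])
        s, plan = s2, plan + [a]
    return None  # gave up: may have been stuck on a plateau or in a cycle
```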
Is this complete or optimal?
Enforced Hill-Climbing: Procedure \(improve\)
def \(improve(\sigma_0):\)
\(queue :=\) new FIFO queue
\(queue.\text{push-back}(\sigma_0)\)
\(closed := \emptyset\)
while not \(queue.\text{empty}()\):
\(\sigma := queue.\text{pop-front}()\)
if \(state(\sigma) \notin closed\):
\(closed := closed \cup \{state(\sigma)\}\)
if \(h(state(\sigma)) < h(state(\sigma_0))\): return \(\sigma\) /* If better state is found return it */
for each \((a,s') \in {\color{blue}succ}(state(\sigma))\):
\(\sigma' := \text{make-node}(\sigma,a,s')\)
\(queue.\text{push-back}(\sigma')\)
fail
Is essentially breadth-first search for a state with strictly smaller \(h\)-value
Enforced Hill-Climbing
\(\sigma := make\text{-}root\text{-}node(init())\)
while not \(is\text{-}goal(state(\sigma))\):
\(\sigma := improve(\sigma)\)
return \(extract\text{-}solution(\sigma)\)
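The two procedures combine into a short sketch: from the current state, run a breadth-first search for any state with strictly smaller \(h\)-value, then repeat from there. Names and the plan-carrying queue entries are illustrative assumptions.

```python
from collections import deque

def enforced_hill_climbing(init, succ, is_goal, h):
    """Sketch of enforced hill-climbing. succ(s) yields (action, s')
    pairs; improve is breadth-first search for a strictly better state."""
    def improve(s0, plan0):
        queue, closed = deque([(s0, plan0)]), set()
        while queue:
            s, plan = queue.popleft()
            if s in closed:
                continue
            closed.add(s)
            if h(s) < h(s0):
                return s, plan        # better state found: return it
            for a, s2 in succ(s):
                queue.append((s2, plan + [a]))
        return None                   # improve fails

    s, plan = init, []
    while not is_goal(s):
        step = improve(s, plan)
        if step is None:
            return None               # incomplete in general
        s, plan = step
    return plan
```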
Is enforced hill-climbing optimal?
Is enforced hill-climbing complete?
In general, no. Under particular circumstances, yes: this assumes that \(h\) is goal-aware.
Procedure \(improve\) fails: no state with strictly smaller \(h\)-value is reachable from \(s\), thus (under this assumption) the goal is not reachable from \(s\).
This cannot happen, for example, if the state space is undirected, i.e., if for all transitions \(s \rightarrow s'\) in \(\Theta_{\Pi}\) there is a transition \(s' \rightarrow s\).
| | DFS | BrFS | ID | \(A^*\) | HC | IDA* | IW |
|---|---|---|---|---|---|---|---|
| Complete | No | Yes | Yes | Yes | No | Yes | No |
| Optimal | No | Yes\(^*\) | Yes | Yes | No | Yes | No |
| Time | \(\infty\) | \(b^d\) | \(b^d\) | \(b^d\) | \(\infty\) | \(b^d\) | \(b \cdot n^k\) |
| Space | \(b \cdot d\) | \(b^d\) | \(b \cdot d\) | \(b^d\) | \(b\) | \(b \cdot d\) | \(n^k\) |
Question (Revisited)
If we set \(h(s) := 0\) for all \(s\), what does \(A^{*}\) become?
(A) Breadth-first search
(B) Depth-first search
(C) Uniform-cost search
(D) Depth-limited search
Answer: (C): Same expansion order (details in book-keeping of open/closed states may differ)
Question
If we set \(h(s) := 0\) for all \(s\), what does greedy best-first search become?
(A) Breadth-first search
(B) Depth-first search
(C) Uniform-cost search
(D) A), B) and C)
\(h\) implies no ordering of nodes at all, so this fully depends on how we break ties in the open list. (A): FIFO, (B): LIFO, (C): Order on \(g\). (Details in bookkeeping of open/closed states may differ.)
Question
Is informed search always better than blind search?
(A): Yes.
(B): No.
Answer: (A): Yes and (B): No.
In greedy best-first search, the heuristic may yield larger search spaces than uniform-cost search. E.g., in path planning, say you want to go from Melbourne to Sydney, but \(h(\)Perth\() < h(\)Canberra\()\).
In \(A^{*}\) with an admissible heuristic and duplicate checking, we cannot do worse than uniform-cost search: \(h(s) > 0\) can only reduce the number of states we must consider to prove optimality.
Also, in the above example, \(A^{*}\) doesn’t expand Perth with any admissible heuristic, because \(g(\)Perth\() > g(\)Sydney\()\)!
“Trusting the heuristic” has its dangers! Sometimes \(g\) helps to reduce search.
Distinguish: World states, search states, search nodes.
World state: Situation in the world modelled by the planning problem.
Search state: Subproblem remaining to be solved.
Search node: Search state + information on “how we got there”.
Search algorithms mainly differ in the order in which they expand search nodes, and in the way they use duplicate elimination.
Criteria for evaluating them include completeness, optimality, time complexity, and space complexity.
Breadth-first search is optimal but uses exponential space;
Depth-first search uses linear space but is not optimal;
Iterative deepening search combines the virtues of both.
Heuristic Functions are estimators for remaining cost.
Heuristic Search Algorithms: