State Model \(\mathcal{S}(P)\):
A solution is a sequence of applicable actions that maps \(s_0\) into \(S_G\); it is optimal if it minimizes the sum of action costs (e.g., the number of steps)
Different models and controllers are obtained by relaxing the assumptions shown in blue …
Search algorithms for planning exploit the correspondence between classical state models \(\mathcal{S}(P)\) and directed graphs:
In the planning as heuristic search formulation, the problem \(P\) is solved by path-finding algorithms over the graph associated with model \(\mathcal{S}(P)\)
Blind search vs. heuristic (or informed) search:
Systematic search vs. local search:
Systematic search algorithms: Consider a large number of search nodes simultaneously; they maintain an explicit frontier (open list) of states still to explore, and often also a closed list of visited states.
Local search algorithms: Usually keep just one current state (or a small number), and repeatedly move to a neighbouring state that seems better according to an evaluation function.
This is not a black-and-white distinction; there are crossbreeds (e.g., enforced hill-climbing).
Blind search vs. heuristic search:
Systematic search vs. local search:
We cover a subset of the search algorithms most successful in planning. Only some blind search algorithms are covered (refer to Russell & Norvig, Chapters 3 and 4).
Search Space for Classical Search
A classical search space is defined by the following three operations:
Search states \(\neq\) world states?
We consider progression in the entire course, unless explicitly stated otherwise.
We use ‘\(s\)’ to denote world and search states interchangeably
What is in a search node?
Different search algorithms store different information in a search node \(\sigma\), but typical information includes:
For the root node, \(\text{parent}(\sigma)\) and \(\text{action}(\sigma)\) are undefined.
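The bookkeeping described above can be sketched in Python. This is an illustrative sketch, not a prescribed interface; the field names (`state`, `parent`, `action`, `g`) are the typical ones:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SearchNode:
    state: object                   # the (world or search) state s
    parent: Optional["SearchNode"]  # node we came from (None at the root)
    action: Optional[str]           # action that produced this node (None at the root)
    g: float                        # cost of the path from the root

def make_root_node(init_state):
    return SearchNode(init_state, None, None, 0.0)

def make_node(parent, action, state, cost=1.0):
    return SearchNode(state, parent, action, parent.g + cost)

def extract_solution(node):
    """Follow parent pointers back to the root, collecting actions."""
    plan = []
    while node.parent is not None:
        plan.append(node.action)
        node = node.parent
    return list(reversed(plan))
```

Storing the parent pointer is what lets `extract-solution` recover the plan once a goal node is found, without storing whole paths in the frontier.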
Guarantees:
Computational Complexity:
Typical state space features governing complexity:
Blind search does not require any input beyond the problem.
Pros and Cons?
Compare with informed search, which requires as additional input a heuristic function \(h\) (covered in the next module) that maps states to estimates of their goal distance.
Pros and Cons?
In classical planning, \(h\) is generated automatically from the declarative problem description.
Blind search strategies we’ll cover:
Breadth-first search. Advantage: time complexity.
Variant: Uniform cost search.
Depth-first search. Advantage: space complexity.
Iterative deepening search. Combines advantages of breadth-first search and depth-first search. Uses depth-limited search as a sub-procedure.
Width-based search, in particular Iterated Width (IW)
Blind search strategy we won’t cover:
Strategy: Expand nodes in the order they were produced (FIFO frontier).
Guarantees?: A) Complete and optimal B) Complete but may not be optimal C) Optimal but may not be complete D) Neither complete nor optimal
Say that \(b\) is the maximal branching factor, and \(d\) is the goal depth (depth of the shallowest goal state).
What is the upper bound on the number of generated nodes?
And what if we were to apply the goal test at node-expansion time, rather than node-generation time?
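For reference, the two variants differ by one factor of \(b\). Testing at generation time, the number of generated nodes is at most

\[
b + b^2 + \dots + b^d = O(b^d),
\]

whereas testing only at expansion time also generates the children of the depth-\(d\) layer, for up to \(b + b^2 + \dots + b^{d+1} = O(b^{d+1})\) nodes.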
Settings: \(b = 10\); \(1{,}000{,}000\) nodes/second; \(1{,}000\) bytes/node.
Inserting these values into the bounds above yields the following data:
| Depth | Nodes | Time | Memory |
|---|---|---|---|
| 2 | 110 | 0.11 ms | 107 KB |
| 4 | 11,110 | 11 ms | 10.6 MB |
| 6 | \(10^{6}\) | 1.1 s | 1 GB |
| 8 | \(10^{8}\) | 2 min | 103 GB |
| 10 | \(10^{10}\) | 3 h | 10 TB |
| 12 | \(10^{12}\) | 13 days | 1 PB |
| 14 | \(10^{14}\) | 3.5 years | 99 PB |
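The table can be reproduced mechanically. This sketch assumes generated nodes are counted as \(b + b^2 + \dots + b^d\), with the rates \(10^6\) nodes/second and \(1{,}000\) bytes/node:

```python
# Reproduce the breadth-first search resource table:
# nodes generated up to the goal depth, wall-clock time, and memory.
def generated_nodes(b, d):
    # b + b^2 + ... + b^d generated nodes for branching factor b, depth d
    return sum(b ** i for i in range(1, d + 1))

def time_seconds(nodes, nodes_per_second=10**6):
    return nodes / nodes_per_second

def memory_bytes(nodes, bytes_per_node=1000):
    return nodes * bytes_per_node
```

For depth 2 this gives 110 nodes, 0.11 ms, and about 107 KB, matching the first row of the table.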
Which is the worse problem: time or memory?
Strategy: Expand the most recently generated node, LIFO frontier (in the illustration: left to right, top to bottom)
Illustration: Nodes at depth 3 are assumed to have no successors
Guarantees?
A) Complete and optimal
B) Complete but may not be optimal
C) Optimal but may not be complete
D) Neither complete nor optimal
Optimality?
Completeness?
Complexity?
Space:
Time:
“Iterative Deepening Search = Keep doing the same work over again until you find a solution”
Guarantees:
Optimality?
Completeness?
Space complexity?
Time Complexity?
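The "keep doing the same work over again" idea is compact in code. A minimal sketch, assuming `is_goal` and `successors` are callables supplied by the problem (not part of any fixed API):

```python
def depth_limited(state, is_goal, successors, limit):
    """Return a plan (list of actions) if a goal lies within `limit` steps, else None."""
    if is_goal(state):
        return []
    if limit == 0:
        return None
    for action, succ in successors(state):
        plan = depth_limited(succ, is_goal, successors, limit - 1)
        if plan is not None:
            return [action] + plan
    return None

def iterative_deepening(init, is_goal, successors, max_depth=100):
    # Re-run depth-limited search with limits 0, 1, 2, ...; the repeated
    # work on shallow levels is dominated by the last iteration, so time
    # stays O(b^d) while space stays O(b*d).
    for limit in range(max_depth + 1):
        plan = depth_limited(init, is_goal, successors, limit)
        if plan is not None:
            return plan
    return None
```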
Planning is computationally complex in the worst case
Question:
Can we explain why planners perform well?
Planning complexity that is exponential in problem width goes a long way toward explaining problem difficulty:
Limitations of serialisation?
Definition: Novelty
The novelty \(w(s)\) of a state \(s\) is the size of the smallest subset of atoms (boolean variables or facts) in \(s\) that is true for the first time in the search.
Algorithm
Properties
\(IW(k)\) expands at most \(O(n^k)\) states, where \(n\) is the number of atoms.
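Putting the definition and the algorithm together: \(IW(k)\) is breadth-first search that prunes every generated state whose novelty exceeds \(k\). A minimal sketch, assuming states are frozensets of atoms and `successors` yields `(action, state)` pairs (an illustration, not a production planner):

```python
from itertools import combinations
from collections import deque

def iw(init, is_goal, successors, k=1):
    seen = set()  # k-tuples of atoms observed so far in the search

    def novel(state):
        # A state is novel iff it makes some size-k tuple of atoms true
        # for the first time; registering the tuples as a side effect.
        new = False
        for t in combinations(sorted(state), k):
            if t not in seen:
                seen.add(t)
                new = True
        return new

    novel(init)  # register the initial state's tuples
    queue = deque([(init, [])])
    while queue:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        for action, succ in successors(state):
            if novel(succ):  # prune states with novelty > k
                queue.append((succ, plan + [action]))
    return None
```

Since at most \(\binom{n}{k}\) distinct tuples exist over \(n\) atoms, at most \(O(n^k)\) states can pass the novelty test, which is exactly the bound stated above.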
Our research group tested \(IW\) on benchmark problems from previous International Planning Competitions (IPCs).
For each instance with \(N\) goal atoms, we created \(N\) instances with a single goal.
IPC results are remarkably good:
| # Instances | \(IW\) | \(ID\) | \(BRFS\) | \(GBFS + h_{add}\) |
|---|---|---|---|---|
| 37921 | 91% | 24% | 23% | 91% |
Properties
For problems \(\Pi \in \mathcal{P}\) where \(width(\Pi)=k\):
Theorem
Blocks, Logistics, Gripper, and \(n\)-puzzle have a bounded width independent of problem size and initial situation, provided that goals are single atoms.
In practice, \(IW(k \le 2)\) solves 88.3% of IPC problems with single goals:
| # Instances | \(k=1\) | \(k=2\) | \(k>2\) | Total |
|---|---|---|---|---|
| 37921 | 37.0% | 51.3% | 11.7% | 88.3% |
Primary question: \(IW\) solves atomic (single atom) goals — how do we extend the blind procedure to multiple atomic goals?
A simple way to use \(IW\) for solving real benchmarks \(P\) with joint goals is a form of hill climbing over the goal set \(G\), with \(|G|=n\)
\(SIW\) uses \(IW\) for both decomposing a problem into subproblems and solving subproblems.
It’s a blind search procedure, as \(IW\) does not even know the next goal \(G_i\) to achieve.
Blind \(SIW\) is better than \(GBFS\)
\(IW\): is essentially a sequence of novelty-based pruned breadth-first searches
\(SIW\): is essentially \(IW\) serialised, used to attain top goals one-by-one
Heuristic search algorithms are the most common and overall most successful algorithms for classical planning.
Greedy best-first search.
Weighted A*.
A*.
IDA*, depth-first branch-and-bound search, breadth-first heuristic search, …
Hill-climbing.
Enforced hill-climbing.
Other algorithms include beam search, tabu search, genetic algorithms, simulated annealing, etc.
A heuristic function \(h\) estimates the cost of an optimal path to the goal
Search gives a preference to explore states with small \(h\).
Heuristic searches require a heuristic function to estimate remaining cost
Definition: Heuristic Function
Let \(\Pi\) be a planning problem with state space \(\Theta_\Pi\).
A heuristic function, or heuristic, for \(\Pi\) is a function \(h : S \to \mathbb{R}^+_0 \cup \{\infty\}\)
Its value \(h(s)\) for state \(s\) is referred to as the state’s heuristic value, or just \(h\)-value
Definition: Remaining Cost, \(h^{*}\)
Let \(\Pi\) be a planning problem with state space \(\Theta_\Pi\).
For a state \(s \in S\), the state’s remaining cost is the cost of an optimal plan for \({\color{blue}s}\), or \({\color{blue}\infty}\) if there exists no plan for \(s\).
The perfect heuristic for \(\Pi\), written \(\color{blue}{h^{*}}\), assigns every \(s \in S\) its remaining cost as the heuristic value.
What does it mean to estimate remaining cost?
For many heuristic search algorithms, \(h\) need not have any particular properties for the algorithm to work, i.e., to be correct and complete.
Search performance depends crucially on how well \(h\) reflects \(h^{*}\)
For some search algorithms, such as \(A^{*}\), the relationship between formal quality properties of \(h\) and search efficiency can be proven (number of expanded nodes).
For other search algorithms, “it works well in practice” is often as good an analysis as one gets.
We will analyse in detail, in a later module, approximations to one particularly important heuristic function in planning: \(h^+\).
Are there other properties of \(h\) that search performance crucially depends on?
What about edge cases?
Definition: Heuristic Function Properties
Let \(\Pi\) be a planning problem with state space \(\Theta_\Pi=(S,L,c,T,I,S^G)\), and let \(h\) be a heuristic for \(\Pi\).
What is the relationship between these properties?
Greedy Best-First Search (with duplicate detection)
\(open :=\) new priority queue ordered by ascending \(h(state(\sigma))\)
\(open.\text{insert(make-root-node}(init()))\)
\(closed := \emptyset\)
while not \(open.empty()\):
\(\sigma := open.\text{pop-min}()\) /* get best state */
if \(state(\sigma) \notin closed\): /* check for duplicates */
\(closed := closed \cup \{state(\sigma)\}\) /* add state to closed set */
if \(is\text{-}goal(state(\sigma))\): return \(\text{extract-solution}(\sigma)\)
for each \((a,s') \in succ(state(\sigma))\): /* expand state */
\(\sigma' := \text{make-node}(\sigma,a,s')\)
if \(h(state(\sigma')) < \infty\): \(open.\text{insert}(\sigma')\)
return unsolvable
Completeness?
Optimality?
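The pseudocode above translates almost line for line into Python. A sketch, assuming `h`, `is_goal`, and `successors` are callables; `heapq` stands in for the priority queue, with a tie-breaking counter so states are never compared directly:

```python
import heapq
from itertools import count

def gbfs(init, h, is_goal, successors):
    tie = count()
    open_list = [(h(init), next(tie), init, [])]  # (h-value, tie, state, plan)
    closed = set()
    while open_list:
        _, _, state, plan = heapq.heappop(open_list)
        if state in closed:
            continue                 # duplicate detection
        closed.add(state)
        if is_goal(state):
            return plan
        for action, succ in successors(state):
            hs = h(succ)
            if hs < float("inf"):    # prune recognised dead ends
                heapq.heappush(open_list, (hs, next(tie), succ, plan + [action]))
    return None                      # unsolvable
```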
Depending on where duplicate detection happens in the Greedy Best-First Search (GBFS) loop, GBFS can be viewed as an A\(^*\) variant, as follows:
\(A^*\) (with duplicate detection and re-opening)
\(\textit{open} :=\) new priority queue ordered by ascending \(g(\sigma) + h(state(\sigma))\)
\(\textit{open}\text{.insert(make-root-node}({\color{blue}\text{init}()}))\)
\(closed := \emptyset\)
\(best\text{-}g := \emptyset\) /* maps states to non-negative real numbers */
while not \(\textit{open}.\text{empty}()\):
\(\sigma := open.\text{pop-min}()\)
if \(state(\sigma) \notin closed\) or \(g(\sigma) < best\text{-}g(state(\sigma))\):
/* Check duplicates: re-open if better \(g\) (note that all \(\sigma'\) with same state but worse \(g\)
are behind \(\sigma\) in \(open\), and will be skipped when their turn comes) */
\(closed := closed \cup \{state(\sigma)\}\)
\(best\text{-}g(state(\sigma)) := g(\sigma)\)
if \({\color{blue}\text{is-goal}}(state(\sigma))\): return \(extract\text{-}solution(\sigma)\)
for each \((a,s') \in {\color{blue}\text{succ}}(state(\sigma))\):
\(\sigma' := \text{make-node}(\sigma,a,s')\)
if \(h(state(\sigma')) < \infty\): \(open.insert(\sigma')\)
return unsolvable
\(f\)-value of a state: defined by \(f(s) = g(s) + h(s)\).
Generated nodes: Nodes inserted into \(open\) priority queue at some point.
Expanded nodes: Nodes \(\sigma\) popped from the \(open\) priority queue that pass the duplicate test against the \(closed\) set and the \(best\text{-}g\) check.
Re-expanded nodes: Expanded nodes for which \(state(\sigma) \in closed\) upon expansion (also called re-opened nodes).
Completeness?
Optimality?
Question
If we set \(h(s) := 0\) for all \(s\), what does \(A^{*}\) become?
(A) Breadth-first search
(B) Depth-first search
(C) Uniform-cost search
(D) Depth-limited search
Recall that uniform-cost search is essentially Dijkstra (a best-first search that always expands the frontier node with the lowest path cost so far).
Weighted \(A^*\) (with duplicate detection and re-opening)
\(open :=\) new priority queue ordered by ascending \(g(\sigma) + {\color{blue}W}\, h(state(\sigma))\)
\(open.\text{insert(make-root-node}({\color{blue}init()}))\)
\(closed := \emptyset\)
\(best\text{-}g := \emptyset\)
while not \(open.\text{empty}()\):
\(\sigma := open.\text{pop-min}()\)
if \(state(\sigma) \notin closed\) or \(g(\sigma) < best\text{-}g(state(\sigma))\):
\(closed := closed \cup \{state(\sigma)\}\)
\(best\text{-}g(state(\sigma)) := g(\sigma)\)
if \(is\text{-}goal(state(\sigma))\): return \(\text{extract-solution}(\sigma)\)
for each \((a,s') \in succ(state(\sigma))\):
\(\sigma' := \text{make-node}(\sigma,a,s')\)
if \(h(state(\sigma')) < \infty\): \(open.\text{insert}(\sigma')\)
return unsolvable
The weight \(W \in \mathbb{R}^+_0\) is an algorithm parameter:
For \(W > 1\), weighted \(A^*\) is bounded suboptimal\(^*\)
\({\color{blue}^*}\)Bounded suboptimal means that the algorithm may return a non-optimal solution, and there’s a proven bound on how much worse it can be than optimal.
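The whole \(f = g + W\,h\) family fits in one sketch: \(W = 1\) gives the A\(^*\) pseudocode above, and \(W = 0\) degrades to uniform-cost search. A sketch assuming `successors` yields `(action, cost, state)` triples:

```python
import heapq
from itertools import count

def weighted_astar(init, h, is_goal, successors, W=1.0):
    tie = count()
    open_list = [(W * h(init), next(tie), 0.0, init, [])]  # (f, tie, g, state, plan)
    best_g = {}
    while open_list:
        _, _, g, state, plan = heapq.heappop(open_list)
        if state in best_g and best_g[state] <= g:
            continue               # duplicate with no better g: skip
        best_g[state] = g          # (re-)open the state with the better g
        if is_goal(state):
            return g, plan
        for action, cost, succ in successors(state):
            if h(succ) < float("inf"):
                g2 = g + cost
                heapq.heappush(
                    open_list,
                    (g2 + W * h(succ), next(tie), g2, succ, plan + [action]),
                )
    return None
```

With an admissible \(h\) and \(W > 1\), the returned cost is at most \(W\) times optimal, which is the bounded-suboptimality guarantee stated above.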
Hill-Climbing
\(\sigma := make\text{-}root\text{-}node(init())\)
forever:
if \(is\text{-}goal(state(\sigma))\):
return \(extract\text{-}solution(\sigma)\)
\(\Sigma' := \{\, make\text{-}node(\sigma,a,s') \mid (a,s') \in {\color{blue}succ}(state(\sigma)) \,\}\)
\(\sigma :=\) choose an element of \(\Sigma'\) minimising \(h\) \({\color{blue}\text{/* random tie breaking */}}\)
Is this complete or optimal?
Enforced Hill-Climbing: Procedure \(improve\)
def \(improve(\sigma_0):\)
\(queue :=\) new FIFO queue
\(queue.\text{push-back}(\sigma_0)\)
\(closed := \emptyset\)
while not \(queue.\text{empty}()\):
\(\sigma := queue.\text{pop-front}()\)
if \(state(\sigma) \notin closed\):
\(closed := closed \cup \{state(\sigma)\}\)
if \(h(state(\sigma)) < h(state(\sigma_0))\): return \(\sigma\) /* If better state is found return it */
for each \((a,s') \in {\color{blue}succ}(state(\sigma))\):
\(\sigma' := \text{make-node}(\sigma,a,s')\)
\(queue.\text{push-back}(\sigma')\)
fail
Is essentially breadth-first search for a state with strictly smaller \(h\)-value
Enforced Hill-Climbing
\(\sigma := make\text{-}root\text{-}node(init())\)
while not \(is\text{-}goal(state(\sigma))\):
\(\sigma := improve(\sigma)\)
return \(extract\text{-}solution(\sigma)\)
Is enforced hill-climbing optimal?
Is enforced hill-climbing complete?
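The two procedures above combine into a few lines of Python. A sketch under the usual assumptions (`h`, `is_goal`, `successors` supplied by the problem); note that `improve` may loop forever on an infinite state space where no better state exists:

```python
from collections import deque

def improve(state, plan, h, successors):
    """Breadth-first search for any state with strictly smaller h-value."""
    h0, closed = h(state), set()
    queue = deque([(state, plan)])
    while queue:
        s, p = queue.popleft()
        if s in closed:
            continue
        closed.add(s)
        if h(s) < h0:
            return s, p            # better state found: return it
        for a, s2 in successors(s):
            queue.append((s2, p + [a]))
    return None, None              # fail: no improving state reachable

def enforced_hill_climbing(init, h, is_goal, successors):
    state, plan = init, []
    while not is_goal(state):
        state, plan = improve(state, plan, h, successors)
        if state is None:
            return None            # improve() failed
    return plan
```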
| | DFS | BrFS | ID | \(A^*\) | HC | IDA* | IW |
|---|---|---|---|---|---|---|---|
| Complete | No | Yes | Yes | Yes | No | Yes | No |
| Optimal | No | Yes\(^*\) | Yes | Yes | No | Yes | No |
| Time | \(\infty\) | \(b^d\) | \(b^d\) | \(b^d\) | \(\infty\) | \(b^d\) | \(b \cdot n^k\) |
| Space | \(b \cdot d\) | \(b^d\) | \(b \cdot d\) | \(b^d\) | \(b\) | \(b \cdot d\) | \(n^k\) |
Question (Revisited)
If we set \(h(s) := 0\) for all \(s\), what does \(A^{*}\) become?
(A) Breadth-first search
(B) Depth-first search
(C) Uniform-cost search
(D) Depth-limited search
Question
If we set \(h(s) := 0\) for all \(s\), what does greedy best-first search become?
(A) Breadth-first search
(B) Depth-first search
(C) Uniform-cost search
(D) A), B) and C)
Question
Is informed search always better than blind search?
(A): Yes.
(B): No.
Distinguish: World states, search states, search nodes.
World state: Situation in the world modelled by the planning problem.
Search state: Subproblem remaining to be solved.
Search node: Search state + information on “how we got there”.
Search algorithms mainly differ in order of node expansion:
Search strategies differ in the order in which they expand search nodes, and in the way they use duplicate elimination.
Criteria for evaluating them include completeness, optimality, time complexity, and space complexity.
Breadth-first search is optimal but uses exponential space;
Depth-first search uses linear space but is not optimal;
Iterative deepening search combines the virtues of both.
Heuristic Functions are estimators for remaining cost.
Heuristic Search Algorithms: