Delete relaxation is a method to relax planning tasks and automatically compute a heuristic function \(h\)
It is widely used and highly successful, particularly for satisficing planning.
Note that any given \(h\) function typically yields good performance in only some domains (balancing search reduction versus computational overhead).
Four known types of relaxation that generate heuristics include
“What was once true remains true forever”
Definition: The Delete Relaxation
For a STRIPS action \(a\), we denote the corresponding delete-relaxed action (or just relaxed action) by \(a^+\), defined by \(pre_{a^+}:= pre_a\), \(\;add_{a^+}:= add_a\), and \({\color{blue}\;del_{a^+}:=}\) \({\color{blue}\emptyset}\).
For a set of actions \(A\), we denote the corresponding set of relaxed actions by \(A^+ := \{a^+ \mid a \in A\}\); for a sequence of actions \(\vec{a} = \langle a_1, \dots, a_n\rangle\), we denote the corresponding sequence of relaxed actions by \(\vec{a}^+ := \langle a_1^+, \dots, a_n^+\rangle\).
For a STRIPS planning task \(\Pi=(F,A,c,I,G)\), we denote the relaxed planning task by \(\Pi^+ := (F,A^+,c,I,G)\).
The “\(^+\)” superscript also marks states encountered in the relaxed version of the problem, e.g. \(s^+\) is a fact set just like \(s\).
Definition: The Relaxed Plan
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task, and let \(s\) be a state.
An optimal relaxed plan for \({\color{red}s}\) is an optimal plan for \({\color{red}\Pi_s^+}\).
A relaxed plan for \(I\) is also called a relaxed plan for \(\Pi\).
Question: Do you recall what \(\Pi_s\) is? It is \(\Pi\) with the initial state replaced by \(s\), i.e. \(\Pi_s = (F,A,c,s,G)\).
Initial state: \(\{\mathit{at}(\mathit{Sy}), \mathit{v}(\mathit{Sy})\}\)
Apply \(\mathit{drive}(\mathit{Sy},\mathit{Br})^+\): \(\{\mathit{at}(\mathit{Br}), \mathit{v}(\mathit{Br}),\) \(\mathit{at}(\mathit{Sy}), \mathit{v}(\mathit{Sy})\}\)
Apply \(\mathit{drive}(\mathit{Sy},\mathit{Ad})^+\): \(\{\mathit{at}(\mathit{Ad}), \mathit{v}(\mathit{Ad}),\) \(\mathit{at}(\mathit{Br}), \mathit{v}(\mathit{Br}),\) \(\mathit{at}(\mathit{Sy}), \mathit{v}(\mathit{Sy})\}\)
Apply \(\mathit{drive}(\mathit{Ad},\mathit{Pe})^+\): \(\{\mathit{at}(\mathit{Pe}), \mathit{v}(\mathit{Pe}),\) \(\mathit{at}(\mathit{Ad}), \mathit{v}(\mathit{Ad}),\) \(\mathit{at}(\mathit{Br}), \mathit{v}(\mathit{Br}),\) \(\mathit{at}(\mathit{Sy}), \mathit{v}(\mathit{Sy})\}\)
Apply \(\mathit{drive}(\mathit{Ad},\mathit{Da})^+\): \(\{\mathit{at}(\mathit{Da}), \mathit{v}(\mathit{Da}),\) \(\mathit{at}(\mathit{Pe}), \mathit{v}(\mathit{Pe}),\) \(\mathit{at}(\mathit{Ad}), \mathit{v}(\mathit{Ad}),\) \(\mathit{at}(\mathit{Br}), \mathit{v}(\mathit{Br}),\) \(\mathit{at}(\mathit{Sy}), \mathit{v}(\mathit{Sy})\}\)
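The mechanics of this trace can be sketched in a few lines of Python (a hypothetical encoding of my own, not from the slides): facts are strings, a relaxed action is a (precondition, add) pair, and relaxed application is just set union.

```python
def apply_relaxed(state, action):
    """Apply a relaxed action (pre, add) to a state (a set of facts)."""
    pre, add = action
    assert pre <= state, "action not applicable"
    return state | add  # no delete list: facts only ever accumulate

# drive(X,Y)+ : pre {at(X)}, add {at(Y), v(Y)}, deletes dropped
def drive(x, y):
    return ({f"at({x})"}, {f"at({y})", f"v({y})"})

s = {"at(Sy)", "v(Sy)"}
s = apply_relaxed(s, drive("Sy", "Br"))  # note: at(Sy) is NOT deleted
```

Each application yields a superset of the previous state, which is exactly the dominance observation made next.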
Definition: State Dominance
Let \(\Pi^+=(F,A^+,c,I,G)\) be a delete-relaxed STRIPS planning task, and let \(s^+\) and \(s^{\prime +}\) be states. We say that \(s^{\prime +}\) dominates \(s^+\) if \(s^{\prime +} \supseteq s^+\) (every fact true in \(s^+\) is also true in \(s^{\prime +}\)).
For example, on the previous slide, what state dominates what? Each state along the relaxed plan dominates the previous one, because the actions don’t delete any facts.
Proposition: State Dominance Properties
Let \(\Pi^+=(F,A^+,c,I,G)\) be a delete relaxed STRIPS planning task, and let \(s^+\) and \(s^{\prime +}\) be states such that \(s^{\prime +}\) dominates \(s^+\).
If \(s^+\) is a goal state, then \(s^{\prime +}\) is also a goal state.
If an action sequence \(\vec a^+\) is applicable in \(s^+\) (i.e. \(pre(a^+) \subseteq s^+\) for all \(a^+ \in \vec a^+\)), then it is also applicable in \(s^{\prime +}\), and the state resulting from applying \(\vec a^+\) in \(s^{\prime +}\), denoted \(appl(s^{\prime +}, \vec a^+)\), dominates the state resulting from applying \(\vec a^+\) in \(s^+\), \(appl(s^+, \vec a^+)\).
Proof: (i) is trivial. (ii) follows by induction over the length \(n\) of \(\vec{a}^+\). The base case \(n=0\) is trivial. The inductive case \(n \rightarrow n+1\) follows directly from the induction hypothesis and the definition of \(appl(\cdot,\cdot)\). In a delete-relaxed setting it is always better to have more facts true.
Proposition: Action Dominance
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task, let \(s\) be a state, and let \(a \in A\). Then \(appl(s,a^+)\) dominates both (i) \(s\) and (ii) \(appl(s,a)\).
Proof. Trivial from the definitions of \(appl(s,a)\) and \(a^+\).
Proposition: Delete Relaxation is Admissible
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task, let \(s\) be a state, and let \(\vec{a}\) be a plan for \(\Pi_s\). Then \(\vec{a}^+\) is a relaxed plan for \(\Pi_s^+\).
Optimal relaxed plans admissibly estimate the cost of optimal plans
Applying a relaxed action can only ever make more facts true ((i) from Action Dominance proposition above).
This can only be good, i.e. it cannot render a task unsolvable (dominance proposition).
It also means the optimal relaxed plan can only be cheaper, and hence its cost is an admissible lower bound on the cost of an optimal plan (it never over-estimates).
Proof: Prove by induction over the length of \(\vec{a}\) that \(appl(s,\vec{a}^+)\) dominates \(appl(s,\vec{a})\). Base case is trivial, inductive case follows from (ii) above.
Can you think of a method (greedy is OK) to find a relaxed plan? We could keep applying relaxed actions, stopping once the goal is true.
Greedy Relaxed Planning for \(\Pi^+_s\)
\(s^+ := s\); \(\vec{a}^+ := \langle \rangle\)
while \(G \not \subseteq s^+\) do:
if \(\exists a \in A\) such that \(pre_a \subseteq s^+\) and \(appl(s^+,a^+) \neq s^+\) then
select one such \(a\)
\(s^+ := appl(s^+,a^+)\); \(\vec{a}^+ := \vec{a}^+ \circ \langle a^+ \rangle\)
else return “\(\Pi_s^+\) is unsolvable” endif
endwhile
return \(\vec{a}^+\)
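The loop above can be transcribed directly into Python (a sketch; the encoding of actions as (name, pre, add) triples is my own, and costs are omitted since greedy relaxed planning ignores them):

```python
def greedy_relaxed_plan(s, actions, goal):
    """Greedy relaxed planning: keep applying any relaxed action that adds
    something new, until the goal holds or no action makes progress."""
    state = set(s)
    plan = []
    while not goal <= state:
        for name, pre, add in actions:
            if pre <= state and not add <= state:  # applicable, adds a new fact
                state |= add
                plan.append(name)
                break
        else:                                      # no action made progress
            return None                            # relaxed task unsolvable
    return plan

# Australia example: drive along the roads Sy-Br, Sy-Ad, Ad-Pe, Ad-Da.
roads = [("Sy", "Br"), ("Sy", "Ad"), ("Ad", "Pe"), ("Ad", "Da")]
actions = []
for x, y in roads:
    for a, b in ((x, y), (y, x)):
        actions.append((f"drive({a},{b})", {f"at({a})"}, {f"at({b})", f"v({b})"}))

goal = {"at(Sy)"} | {f"v({c})" for c in ("Sy", "Ad", "Br", "Pe", "Da")}
plan = greedy_relaxed_plan({"at(Sy)", "v(Sy)"}, actions, goal)
```

Each iteration adds at least one new fact, so the loop runs at most \(|F|\) times, matching the polynomial-time claim below.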
Proposition: Greedy relaxed planning is sound, complete, and terminates in time polynomial in the size of \(\Pi\).
Proof: Soundness: If \(\vec{a}^+\) is returned then, by construction, \(G \subseteq appl(s,\vec{a}^+)\). Completeness: If “\(\Pi_s^+\) is unsolvable” is returned, then no relaxed plan exists for \(s^+\) at that point; since \(s^+\) dominates \(s\), by the dominance proposition this implies that no relaxed plan can exist for \(s\). Termination: Every \(a \in A\) can be selected at most once because afterwards \(appl(s^+,a^+) = s^+\).
Using greedy relaxed planning to generate \(h\)
In search state \(s\) during forward search, run greedy relaxed planning on \(\Pi_s^+\).
Set \(h(s)\) to the cost of \(\vec{a}^+\), or \(\infty\) if “\(\Pi_s^+\) is unsolvable” is returned.
Is this heuristic safe? Yes: “\(\Pi_s^+\) is unsolvable” is returned only if no relaxed plan exists, in which case no real plan exists either.
Is this heuristic goal-aware? Yes: if \(s\) is a goal state, the empty plan is returned and \(h(s)=0\).
Is this heuristic admissible? No: the greedy plan may be far more costly than an optimal plan.
To be informed (accurately estimate \(h^*\)), a heuristic must approximate the minimum effort needed to reach the goal. Greedy relaxed planning cannot be guaranteed to use minimum effort because it may select arbitrary actions that aren’t relevant at all.
Definition: \(h^+\)
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task with state space \(\Theta_\Pi=(S,A,c,T,I,G)\). The optimal delete relaxation heuristic \(h^+\) for \(\Pi\) is the function \(h^+ : S \mapsto \mathbb{R}_0^+ \cup \{\infty\}\) where \(h^+(s)\) is defined as the cost of an optimal relaxed plan for \(s\).
Corollary: \(h^+\) is Admissible
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task. Then \(h^+\) is admissible, and thus safe and goal-aware.
\(h^+\) naturally approximates the minimum effort by asking for the cheapest possible relaxed plans.
You might rightfully ask: But won’t optimal relaxed plans usually under-estimate \(h^*\)?
Yes, but that is just the definition of admissibility in the relaxed problem: it must never over-estimate the true cost.
Greedily adding actions that are useless within the relaxation does nothing to address this.

\(P\): \(\mathit{at}(X)\) for \(X \in \{\mathit{Sy}, \mathit{Ad}, \mathit{Br}, \mathit{Pe}, \mathit{Da}\}\); \(\mathit{v}(X)\) for \(X \in \{\mathit{Sy}, \mathit{Ad}, \mathit{Br}, \mathit{Pe}, \mathit{Da}\}\).
\(A\): \(\mathit{drive}(X,Y)\) where \(X,Y\) have a road. \[ c(\mathit{drive}(X,Y)) = \left \{ \begin{array}[]{ll} 1 & \{X,Y\} = \{\mathit{Sy}, \mathit{Br}\}\\ 1.5 & \{X,Y\} = \{\mathit{Sy}, \mathit{Ad}\}\\ 3.5 & \{X,Y\} = \{\mathit{Ad}, \mathit{Pe}\}\\ 4 & \{X,Y\} = \{\mathit{Ad}, \mathit{Da}\} \end{array} \right. \]
\(I\): \(\mathit{at}(\mathit{Sy}), \mathit{v}(\mathit{Sy})\); \(G\): \(\mathit{at}(\mathit{Sy}), \mathit{v}(X) \text{ for all } X\).
Planning versus Relaxed Planning:
Optimal plan: \(\langle \mathit{drive}(\mathit{Sy},\mathit{Br})\), \(\mathit{drive}(\mathit{Br},\mathit{Sy})\), \(\mathit{drive}(\mathit{Sy},\mathit{Ad})\), \(\mathit{drive}(\mathit{Ad},\mathit{Pe})\), \(\mathit{drive}(\mathit{Pe},\mathit{Ad})\), \(\mathit{drive}(\mathit{Ad},\mathit{Da})\), \(\mathit{drive}(\mathit{Da},\mathit{Ad})\), \(\mathit{drive}(\mathit{Ad},\mathit{Sy}) \rangle\).
Optimal relaxed plan: \(\langle \mathit{drive}(\mathit{Sy},\mathit{Br})\), \(\mathit{drive}(\mathit{Sy},\mathit{Ad})\), \(\mathit{drive}(\mathit{Ad},\mathit{Pe})\), \(\mathit{drive}(\mathit{Ad},\mathit{Da}) \rangle\).
\({\color{blue}h^{*}(I)=}\) \(20\); \({\color{blue}h^+(I)=}\) \(10\).
\(h^+\)(Hanoi) \(= O(n)\), as opposed to a problem complexity of \(O(2^n)\): the shortest plan solving the problem is exponential in \(n\).
Definition: Optimal Relaxed Planning
By PlanOpt\(^+\), we denote the problem of deciding, given a STRIPS planning task \(\Pi = (F,A,c,I,G)\) and \(B \in \mathbb{R}^+_0\), whether there exists a relaxed plan for \(\Pi\) whose cost is at most \(B\).
Note. By computing \(h^+\), we would solve PlanOpt\(^+\).
Theorem: Optimal Relaxed Planning is Hard.
PlanOpt\(^+\) is NP-complete.
Proof. Membership: guess an action sequence of length at most \(|A|\) and verify it; in a relaxed plan, each action need be applied at most once!
Hardness: by reduction from SAT.
For each variable \(v_i \in \{v_1, \dots, v_m\}\) in the CNF, three facts \(\mathit{v}_i\), \(\mathit{notv}_i\), and \(\mathit{setv}_i\); for each clause \(c_j \in \{c_1, \dots, c_n\}\) in the CNF, one fact \(\mathit{satc}_j\).
Actions \(\mathit{setvtrue}_i : (\emptyset,\{\mathit{v}_i,\mathit{setv}_i\},\emptyset)\) and \(\mathit{setvfalse}_i : (\emptyset,\{\mathit{notv}_i,\mathit{setv}_i\},\emptyset)\).
Actions \(\mathit{makesatc}_j : (\{\mathit{v}_i\},\{\mathit{satc}_j\},\emptyset)\) where \(v_i\) appears positively in clause \(c_j\); and \(\mathit{makesatc}_j : (\{\mathit{notv}_i\},\{\mathit{satc}_j\},\emptyset)\) where \(v_i\) appears negatively in clause \(c_j\).
Initial state \(\emptyset\), goal \(\{\mathit{setv}_1, \dots, \mathit{setv}_m, \mathit{satc}_1, \dots, \mathit{satc}_n\}\); \(B := m+n\).
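The construction can be written out concretely. The following sketch (the clause encoding as (variable, polarity) pairs and the action names are my own, following the fact names above) builds the relaxed task for an example CNF \((v_1 \vee \neg v_2) \wedge (v_2)\):

```python
def sat_to_relaxed_task(num_vars, clauses):
    """Build the delete-free STRIPS task of the hardness reduction.
    clauses: list of clauses, each a list of (variable index, polarity)."""
    actions = []  # (name, pre, add) triples, all of unit cost
    for i in range(1, num_vars + 1):
        actions.append((f"setvtrue{i}", set(), {f"v{i}", f"setv{i}"}))
        actions.append((f"setvfalse{i}", set(), {f"notv{i}", f"setv{i}"}))
    for j, clause in enumerate(clauses, start=1):
        for var, positive in clause:
            pre = {f"v{var}"} if positive else {f"notv{var}"}
            actions.append((f"makesatc{j}_{'v' if positive else 'notv'}{var}",
                            pre, {f"satc{j}"}))
    goal = ({f"setv{i}" for i in range(1, num_vars + 1)}
            | {f"satc{j}" for j in range(1, len(clauses) + 1)})
    return actions, set(), goal, num_vars + len(clauses)  # A, I, G, B := m+n

# Example CNF (v1 or not v2) and (v2): m = 2 variables, n = 2 clauses.
actions, init, goal, B = sat_to_relaxed_task(2, [[(1, True), (2, False)], [(2, True)]])
```

A relaxed plan of cost at most \(B = m+n\) must pick exactly one \(\mathit{setv}\) action per variable, i.e. a consistent truth assignment satisfying every clause.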
Recall: a relaxation consists of a simpler problem \(\mathcal P'\), a transformation \(r: \mathcal P \mapsto \mathcal P'\), and a perfect heuristic \(h^{\prime *}\) for \(\mathcal P'\), where, for all \(\Pi \in \mathcal P\), \(h^*(r(\Pi)) \leq h^*(\Pi)\).
For \(h^+ = h^{\prime *} \circ r\):
Problem \(\mathcal P\): All STRIPS planning tasks.
Simpler problem \(\mathcal P'\): All STRIPS planning tasks with empty deletes.
Perfect heuristic \(h^{\prime *}\) for \(\mathcal P'\): Optimal plan cost \(= h^*\) on \(\mathcal P'\).
Transformation \(r\): Drop the deletes.
Is this a native relaxation? Yes.
Is this relaxation efficiently constructible? Yes.
Is this relaxation efficiently computable? No.
Reminder: What if \(\mathcal R\) is not efficiently computable?
Either (a) approximate \(h^{\prime *}\), or (b) design \(h^{\prime *}\) in a way so that it will typically be feasible, or (c) just live with it and hope for the best.
Many known relaxations (in planning) are efficiently computable, some aren’t (like \(h^+\)). The latter use (a); (b) and (c) are not used anywhere right now.
The delete relaxation heuristic we want is \(h^+\).
All implemented systems using the delete relaxation approximate \(h^+\) in one way or another.
Question
In this domain, what is the delete relaxed heuristic \(h^+\) equal to?
(A) Manhattan distance.
(B) \(h^*\).
(C) Horizontal distance.
(D) Vertical distance.
(A): No, relaxed plans can’t walk through walls.
(B): Yes, optimal plan = shortest path = relaxed plan (deletes do not matter because “shortest paths never walk back”).
(C), (D): No, relaxed plans must move both horizontally and vertically.
Definition: \(h^{add}\)
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task. The additive heuristic \({\color{blue}h^{add}}\) for \(\Pi\) is the function \(h^{add}(s) := h^{add}(s,G)\) where \(h^{add}(s,g)\) is the point-wise greatest function that satisfies
\[ \color{blue} h^{add}(s,g) = \left \{ \begin{array}[]{ll} 0 & g \subseteq s\\ \min_{a \in A, g \in add_a} c(a) + h^{add}(s,pre_a) & |g| = 1\\ {\color{red}\sum_{g' \in g}} h^{add}(s,\{g'\}) & |g| > 1 \end{array} \right. \]
Definition: \(h^{max}\)
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task. The max heuristic \({\color{blue}h^{max}}\) for \(\Pi\) is the function \(h^{max}(s) := h^{max}(s,G)\) where \(h^{max}(s,g)\) is the point-wise greatest function that satisfies
\[ \color{blue} h^{max}(s,g) = \left \{ \begin{array}[]{ll} 0 & g \subseteq s\\ \min_{a \in A, g \in add_a} c(a) + h^{max}(s,pre_a) & |g| = 1\\ {\color{red}\max_{g' \in g}} h^{max}(s,\{g'\}) & |g| > 1 \end{array} \right. \]
Proposition: \(h^{max}\) is Optimistic
\(h^{max} \leq h^+\), and thus \(h^{max} \leq h^*\).
Proposition: \(h^{add}\) is Pessimistic
For all STRIPS planning tasks \(\Pi\), \(h^{add} \geq h^+\); moreover, there exist \(\Pi\) and \(s\) such that \(h^{add}(s) > h^*(s)\).
Both \(h^{max}\) and \(h^{add}\) approximate \(h^+\) by assuming that singleton sub-goal facts are achieved independently.
\({\color{blue}h^{max}}\) estimates optimistically by the most costly singleton sub-goal
\({\color{blue}h^{add}}\) estimates pessimistically by summing over all singleton sub-goals
Proposition: \(h^{max}\) and \(h^{add}\) Agree with \(h^+\) on \(\infty\)
For all STRIPS planning tasks \(\Pi\) and states \(s\) in \(\Pi\), \(h^+(s) = \infty\) if and only if \(h^{max}(s) = \infty\) if and only if \(h^{add}(s) = \infty\).
States for which no relaxed plan exists are easy to recognise, and that is done by both \(h^{max}\) and \(h^{add}\).
Bellman-Ford variant computing \(h^{add}\) for state \(s\)
Create a new table \(T^{\text{\textup{add}}}_{0}(g)\), for \(g \in F\)
For all \(g \in F: T^{\text{\textup{add}}}_{0}(g) := \left \{ \begin{array}[]{ll} 0 & g \in s \\ \infty & \text{otherwise}\end{array} \right.\)
fn \(c_i(g) := \left \{ \begin{array}[]{ll} T^{\text{\textup{add}}}_{i}(g) & |g| = 1\\ {\color{blue}\sum}_{g' \in g} T^{\text{\textup{add}}}_{i}(g') & |g| > 1 \end{array}\right.\)
fn \(f_i(g) := \min[c_i(g), \min_{a \in A, g \in add_a} c(a) + c_i(pre_a)]\)
\(i := 0\); do forever:
new table \(T^{\text{\textup{add}}}_{i+1}(g)\), for \(g \in F\)
For all \(g \in F\): \(T^{\text{\textup{add}}}_{i+1}(g) := f_i(g)\)
if \(T^{\text{\textup{add}}}_{i+1} = T^{\text{\textup{add}}}_{i}\) then stop endif /* least fixed point achieved, so stop */
\(i\) := \(i+1\)
enddo
The same algorithm computes \(h^{max}\) as well; just change \({\color{blue}\sum}\) to \({\color{blue}\max}\).
Proposition: Convergence
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task, then the series \(\{T^{\text{\textup{add}}}_i(g)\}_{i=0, \dots}\) converges to \(h^{add}(s,g)\), for all \(g\). (Proof omitted.)
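A compact Python version of this fixpoint computation (a sketch with names of my own choosing), parameterised by the combination operator so it yields \(h^{add}\) with sum and \(h^{max}\) with max; on the Australia travel example it reproduces the values \(h^{add}(I)=13\) and \(h^{max}(I)=5.5\) derived on the example slides:

```python
def relaxed_cost(s, facts, actions, goal, combine):
    """Bellman-Ford-style fixpoint for h_add (combine=sum) / h_max (combine=max).
    actions: (cost, pre, add) triples over sets of facts."""
    INF = float("inf")
    T = {p: (0 if p in s else INF) for p in facts}  # T_0
    while True:
        def cost(g):  # c_i on a fact set: combine over the singleton entries
            return combine(T[p] for p in g) if g else 0
        T_new = {p: min([T[p]] + [c + cost(pre)          # f_i(p)
                                  for c, pre, add in actions if p in add])
                 for p in facts}
        if T_new == T:          # least fixed point reached
            return cost(goal)
        T = T_new

cities = ["Sy", "Ad", "Br", "Pe", "Da"]
facts = {f"at({x})" for x in cities} | {f"v({x})" for x in cities}
road_cost = {("Sy", "Br"): 1, ("Sy", "Ad"): 1.5, ("Ad", "Pe"): 3.5, ("Ad", "Da"): 4}
actions = []
for (x, y), c in road_cost.items():
    for a, b in ((x, y), (y, x)):
        actions.append((c, {f"at({a})"}, {f"at({b})", f"v({b})"}))

I = {"at(Sy)", "v(Sy)"}
G = {"at(Sy)"} | {f"v({x})" for x in cities}
h_add = relaxed_cost(I, facts, actions, G, sum)
h_max = relaxed_cost(I, facts, actions, G, max)
```

The tables shrink monotonically from \(\infty\), so the loop reaches its fixpoint after at most \(|F|\) iterations.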

Content of the tables \(T^{max}_i\), where \(T^{max}_i\) denotes the table after the \(i\)-th Bellman-Ford iteration (\(\max\) variant)?
\({\color{red}h^{max}(I)=}\) \(5.5 \ll 20 = h^*(I)\).

Content of the tables \(T^{add}_i\), where \(T^{add}_i\) denotes the table after the \(i\)-th Bellman-Ford iteration?
\({\color{red}h^{add}(I)=}\) \(1.5 + 1 + 5 + 5.5 = 13 > 10 = h^+(I)\). But \(< 20 = h^*(I)\).
\(h^{add}(I) > h^+(I)\) because it counts the cost of \(\mathit{drive}(\mathit{Sy},\mathit{Ad})\) 3 times:
As part of \(h^{add}(I,\{\mathit{v}(\mathit{Ad})\})\), \(h^{add}(I,\{\mathit{v}(\mathit{Pe})\})\), and \(h^{add}(I,\{\mathit{v}(\mathit{Da})\})\)!

Content of the tables? \(T^{add}_i\) entries are shown in black and \({\color{red}T^{max}_i}\) entries in red.
\({\color{red} h^{add}(I)=}\) \(7 > h^+(I) = 5\), but \(< 8 = h^*(I)\).
\(h^{add}(I) > h^+(I)\) because it counts the cost of \(\mathit{dr}(A,B), \mathit{dr}(B,C)\) twice: once for each of the two preconditions \(\mathit{p}(T)\) and \(\mathit{t}(D)\) of achieving \(\mathit{p}(D)\).
So, what if \(G = \{\mathit{t}(D), \mathit{p}(D)\}\)? \(h^{add}(I)=\) \(10 > 5 = h^*(I) = h^+(I)\) because now \(\mathit{dr}(A,B), \mathit{dr}(B,C), \mathit{dr}(C,D)\) is counted also as part of the goal \(\mathit{t}(D)\).
Summary of typical issues in practice with \(h^{add}\) and \(h^{max}\):
Both \(h^{add}\) and \(h^{max}\) can be used to approximate \(h^+\) and can be computed reasonably quickly.
\(h^{max}\) is admissible, but is typically far too optimistic.
\(h^{add}\) is not admissible as it is pessimistic, but is typically a lot more informed than \(h^{max}\).
\(h^{add}\) is sometimes better informed than \(h^+\), but can be for the wrong reasons: rather than accounting for deletes, it overcounts by ignoring positive interactions, i.e. sub-plans shared between sub-goals.
For the Logistics example above with goal \(\mathit{t}(d)\), if we have \(100\) packages at \(c\) that need to go to \(d\), what is \(h^{add}(I)\)?
\(703 \gg 203 = h^\ast(I) = h^+(I)\): For every package, a count of \(7\) which includes \(\mathit{dr}(a,b), \mathit{dr}(b,c)\) for getting the package into the truck, and \(\mathit{dr}(a,b), \mathit{dr}(b,c), \mathit{dr}(c,d)\) for getting the truck to \(d\).
First compute a best-supporter function \(bs\), which for every fact \(p \in F\) returns an action that is deemed to be the cheapest achiever of \(p\) (within the relaxation).
Then extract a relaxed plan from that function, by applying it to singleton sub-goals and collecting all the actions.
The best-supporter function can be based directly on \(h^{max}\) or \(h^{add}\), simply selecting an action \(a\) achieving \(p\) that minimizes the sum of \(c(a)\) and the cost estimate for \(pre_a\).
Definition: Best-Supporters from \(h^{max}\) and \(h^{add}\)
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task, and let \(s\) be a state.
The \(h^{max}\) supporter function \(bs^{max}_s : \{p \in F \mid 0 < h^{max}(s,\{p\}) < \infty\} \mapsto A\) is defined by \(bs^{max}_s(p) := \arg\min_{a \in A, p \in add_a} \Big( c(a) + h^{max}(s,pre_a)\Big).\)
The \({\color{blue}h^{add}}\) supporter function \(bs^{add}_s: \{p \in F \mid 0 < h^{add}(s,\{p\}) < \infty\} \mapsto A\) is defined by \({\color{red}bs^{add}_s(p) := \arg\min_{a \in A, p \in add_a} \Big( c(a) + h^{add}(s,pre_a)\Big)}\).
Example \(h^{add}\) in Logistics:
Relaxed Plan Extraction for state \(s\) and best-supporter function \(bs\)
\(\mathit{Open} := G \setminus s\); \(\mathit{Closed} := \emptyset\); \(\mathit{RPlan} := \emptyset\)
while \(Open \neq \emptyset\) do:
select \(g \in \mathit{Open}\)
\(\mathit{Open} := \mathit{Open} \setminus \{g\}\); \(\mathit{Closed} := \mathit{Closed} \cup \{g\}\);
\(\mathit{RPlan} := \mathit{RPlan} \cup \{{\color{red}bs(g)}\}\); \(\mathit{Open} := \mathit{Open} \cup (pre_{bs(g)} \setminus (s \cup \mathit{Closed}))\)
endwhile
return \(\mathit{RPlan}\)
Starting with the top-level goals, iteratively close open singleton sub-goals by selecting a best supporter. The number of iterations is bounded by \(|F|\). However, this requires the following:
What if \(g \not\in add_{bs(g)}\)? Doesn’t make sense; hence requirement (A): for all \(g\), \(g \in add_{bs(g)}\), i.e. the supporter’s add effects must include \(g\).
What if \(bs(g)\) is undefined? Runtime error; hence requirement (B): \(bs(g)\) must be defined wherever it is needed.
What if the support for \(g\) eventually requires \(g\) itself as a precondition? Then this does not actually yield a relaxed plan; hence requirement (C): the extraction must yield a relaxed plan.
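Putting the pieces together, here is a sketch (the encoding is my own) that computes singleton \(h^{add}\) values via the Bellman-Ford fixpoint, derives the best-supporter function \(bs^{add}_s\), and runs the extraction loop above; on the Australia travel example it recovers the optimal relaxed plan of cost \(10 = h^+(I)\):

```python
INF = float("inf")

def h_add_table(s, facts, actions):
    """Singleton h_add values via the Bellman-Ford fixpoint."""
    T = {p: (0 if p in s else INF) for p in facts}
    while True:
        T_new = {p: min([T[p]] + [c + sum(T[q] for q in pre)
                                  for _, c, pre, add in actions if p in add])
                 for p in facts}
        if T_new == T:
            return T
        T = T_new

def extract_relaxed_plan(s, facts, actions, goal):
    """Relaxed plan extraction using the h_add best-supporter function."""
    T = h_add_table(s, facts, actions)

    def bs(p):  # h_add best supporter of fact p: cheapest achiever
        return min((a for a in actions if p in a[3]),
                   key=lambda a: a[1] + sum(T[q] for q in a[2]))

    open_goals, closed, rplan = set(goal) - s, set(), set()
    while open_goals:
        g = open_goals.pop()
        closed.add(g)
        a = bs(g)
        rplan.add(a)                              # set union: no double-counting
        open_goals |= a[2] - (s | closed)         # open the supporter's preconds
    return rplan

cities = ["Sy", "Ad", "Br", "Pe", "Da"]
facts = {f"at({x})" for x in cities} | {f"v({x})" for x in cities}
road_cost = {("Sy", "Br"): 1, ("Sy", "Ad"): 1.5, ("Ad", "Pe"): 3.5, ("Ad", "Da"): 4}
actions = []  # (name, cost, pre, add); frozensets so actions are hashable
for (x, y), c in road_cost.items():
    for a, b in ((x, y), (y, x)):
        actions.append((f"drive({a},{b})", c, frozenset({f"at({a})"}),
                        frozenset({f"at({b})", f"v({b})"})))

I = {"at(Sy)", "v(Sy)"}
G = {"at(Sy)"} | {f"v({x})" for x in cities}
rplan = extract_relaxed_plan(I, facts, actions, G)
h_ff = sum(a[1] for a in rplan)
```

Because \(\mathit{RPlan}\) is a set, the shared action \(\mathit{drive}(\mathit{Sy},\mathit{Ad})\) is collected only once, which is precisely how \(h^{\mathit{FF}}\) avoids the over-counting of \(h^{add}\).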

Extracting a relaxed plan:
\(bs^{add}_s(\mathit{p}(D)) =\) \(\mathit{ul}(D)\); opens \(\mathit{t}(D), \mathit{p}(T)\).
\(bs^{add}_s(\mathit{t}(D)) =\) \(\mathit{dr}(C,D)\); opens\(\;\mathit{t}(C)\).
\(bs^{add}_s(\mathit{t}(C)) =\) \(\mathit{dr}(B,C)\); opens\(\;\mathit{t}(B)\).
\(bs^{add}_s(\mathit{t}(B)) =\) \(\mathit{dr}(A,B)\); opens nothing.
\(bs^{add}_s(\mathit{p}(T)) =\) \(\mathit{lo}(C)\); opens nothing.
Anything more? No, open goals empty at this point.
\({\color{red}\mathit{h^{\mathit{FF}}(I)=}}\) \(5 = h^+(I) < 7 = h^{add}(I) < 8 = h^*(I)\).
What if \(G = \{\mathit{t}(D), \mathit{p}(D)\}\)? \(h^{\mathit{FF}}(I)=\) \(5 = h^+(I) = h^*(I)\), because relaxed plan extraction selects the drive actions only once. By contrast, \(h^{add}(I) = 10\) overcounts these actions, cf. the Bellman-Ford computation of \(h^{add}\) in Logistics.
For relaxed plan extraction to make sense, it requires a suitable best-supporter function:
Definition: Best-Supporter Function
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task, and let \(s\) be a state. A best-supporter function for \(s\) is a partial function \(bs: (F \setminus s) \mapsto A\) such that \(p \in add_a\) whenever \(a = bs(p)\).
The support graph of \(bs\) is the directed graph with vertices \(F \cup A\) and arcs \(\{(p,a) \mid p \in pre_a\} \cup \{(a,p) \mid a = bs(p)\}\).
We say that \(bs\) is closed if \(bs(p)\) is defined for every \(p \in (F \setminus s)\) that has a path to a goal \(g \in G\) in the support graph.
We say that \(bs\) is well-founded if the support graph is acyclic.
“\(p \in add_a\) whenever \(a = bs(p)\)”: Requirement (A).
\(bs\) is closed: Requirement (B).
\(bs\) is well-founded: Requirement (C).
Intuition for (C): Relaxed plan extraction starts at the goals, and chains backwards in the support graph. If there are cycles, then this backchaining may not reach the currently true state \(s\), and thus not yield a relaxed plan.

How to do it (well-founded)
How NOT to do it (not well-founded)
Definition (Revisited): Best-Supporters from \(h^{max}\) and \(h^{add}\)
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task, and let \(s\) be a state.
The \(h^{max}\) supporter function \(bs^{max}_s : \{{\color{blue}p \in F \mid 0 < h^{max}(s,\{p\}) < \infty}\} \mapsto A\) is defined by \(bs^{max}_s(p) := \arg\min_{a \in A, p \in add_a} \Big( c(a) + h^{max}(s,pre_a)\Big).\)
The \(h^{add}\) supporter function \(bs^{add}_s: \{p \in F \mid 0 < h^{add}(s,\{p\}) < \infty\} \mapsto A\) is defined by \(bs^{add}_s(p) := \arg\min_{a \in A, p \in add_a} \Big( c(a) + h^{add}(s,pre_a)\Big)\).
Proposition
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task such that, for all \(a \in A\), \(c(a) > 0\). Let \(s\) be a state where \(h^+(s) < \infty\). Then both \(bs^{max}_s\) and \(bs^{add}_s\) are closed well-founded supporter functions for \(s\).
Proof. Since \(h^+(s) < \infty\) implies \(h^{max}(s) < \infty\), it is easy to see that \(bs^{max}_s\) is closed (details omitted).
If \(a = bs^{max}_s(p)\), then \(a\) is the action yielding \(0 < h^{max}(s,\{p\}) < \infty\) in the \(h^{max}\) equation.
Intuition: Since \(c(a) > 0\), we have \(h^{max}(s,pre_a) < h^{max}(s,\{p\})\) and thus, for all \(q \in pre_a\), \(h^{max}(s,\{q\}) < h^{max}(s,\{p\})\).
Transitively, if the support graph contains a path from fact vertex \(r\) to fact vertex \(t\), then \(h^{max}(s,\{r\}) < h^{max}(s,\{t\})\).
Thus there can’t be cycles in the support graph and \(bs^{max}_s\) is well-founded. Similar for \(bs^{add}_s\).
Proposition
Let \(\Pi=(F,A,c,I,G)\) be a STRIPS planning task, let \(s\) be a state, and let \(bs\) be a closed well-founded best-supporter function for \(s\).
Then the action set \(\mathit{RPlan}\) returned by relaxed plan extraction can be sequenced into a relaxed plan \(\vec{a}^+\) for \(s\).
Proof. Order \(a\) before \(a'\) whenever the support graph contains a path from \(a\) to \(a'\). Since the support graph is acyclic, such a sequencing \(\vec{a} := \langle a_1, \dots, a_n \rangle\) exists.
We have \(p \in s\) for all \(p \in pre_{a_1}\), because otherwise \(\mathit{RPlan}\) would contain the action \(bs(p)\), necessarily ordered before \(a_1\).
We have \(p \in s \cup add_{a_1}\) for all \(p \in pre_{a_2}\), because otherwise \(\mathit{RPlan}\) would contain the action \(bs(p)\), necessarily ordered before \(a_2\). Iterating the argument shows that \(\vec{a}^+\) is a relaxed plan for \(s\).
Definition: Relaxed Plan Heuristic
A heuristic function is called a relaxed plan heuristic, denoted \({\color{blue}h^{\mathit{FF}}}\), if, given a state \(s\),
it returns \(\infty\) if no relaxed plan exists, and
otherwise returns \(\sum_{a \in \mathit{RPlan}} c(a)\) where \(\mathit{RPlan}\) is the action set returned by relaxed plan extraction on a closed well-founded best-supporter function for \(s\).
Recall that if a relaxed plan exists, then a closed well-founded best-supporter function also exists (see above).
By comparison:
\(h^{max}\) estimates by taking the hardest sub-goal
\(h^{add}\) estimates by summing sub-goal costs
\(h^{FF}\) estimates by extracting an actual relaxed plan in the delete-relaxed problem and then taking its cost.
Proposition: \(h^{\mathit{FF}}\) is Pessimistic and Agrees with \(h^*\) on \(\infty\)
For all STRIPS planning tasks \(\Pi\), \(h^{\mathit{FF}} \geq h^+\); for all states \(s\), \(h^+(s) = \infty\) if and only if \(h^{\mathit{FF}}(s) = \infty\). There exist \(\Pi\) and \(s\) so that \(h^{\mathit{FF}}(s) > h^*(s)\).
Proof. \(h^{\mathit{FF}} \geq h^+\) follows directly from the previous proposition. Agreement with \(h^+\) on \(\infty\): direct from the definition. Inadmissibility: this can occur whenever \(bs\) makes sub-optimal choices (exercise).
Relaxed plan heuristics have the same theoretical properties as \(h^{add}\).
So what’s the point?
Can \(h^{\mathit{FF}}\) over-count, i.e., count sub-plans shared between sub-goals more than once?
No, due to the set union in “\(\mathit{RPlan} := \mathit{RPlan} \cup \{\mathit{bs}(g)\}\)”.
\(h^{\mathit{FF}}\) can be inadmissible, just like \(h^{add}\), but for more subtle reasons.
In practice, \(h^{\mathit{FF}}\) typically does not over-estimate \(h^*\) (or not by a large amount, anyway); example on previous “Logistics” slide.
Definition: Helpful Actions
Let \(h^{\mathit{FF}}\) be a relaxed plan heuristic, let \(s\) be a state, and let \(\mathit{RPlan}\) be the action set returned by relaxed plan extraction on the closed well-founded best-supporter function for \(s\) which underlies \(h^{\mathit{FF}}\). The helpful actions in \(s\) are then those \(a \in \mathit{RPlan}\) that are applicable in \(s\), i.e. \(pre_a \subseteq s\).
Remarks
Initially introduced in FF [Hoffmann and Nebel (2001)], restricting Enforced Hill-Climbing to use only the helpful actions.
Expanding only helpful actions does not guarantee completeness.
Other planners use helpful actions as preferred operators, first expanding nodes resulting from helpful actions.
Question
How does ignoring delete lists simplify the FreeCell card game?
(A) You can move all cards immediately to their goal.
(B) Free cells remain free.
(A): No, we don’t get any new moves in the relaxation.
(B): Yes, when putting a card into a free cell, it’s still free for another card.
Question
How does ignoring delete lists simplify the Sokoban game?
(A) Free positions remain free.
(B) You can walk through walls.
(C) You can push 2 stones to the same position.
(D) Nothing ever becomes blocked.
(A), (C), (D): Yes (similar to FreeCell)
(B): No, we don’t get any new moves
The delete relaxation simplifies STRIPS by removing all delete effects of the actions.
The cost of optimal relaxed plans yields the heuristic function \(h^+\), which is admissible but hard to compute.
We can approximate \(h^+\) optimistically by \(h^{max}\), and pessimistically by \(h^{add}\). \(h^{max}\) is admissible, \(h^{add}\) is not. \(h^{add}\) is typically much more informative, but can suffer from over-counting.
Either of \(h^{max}\) or \(h^{add}\) can be used to generate a closed well-founded best-supporter function, from which we can extract a relaxed plan. The resulting relaxed plan heuristic \(h^{\mathit{FF}}\) does not over-count, but otherwise has the same theoretical properties as \(h^{add}\); in practice, it typically does not over-estimate \(h^*\).
HSP [Bonet and Geffner, AI-01]
FF [Hoffmann and Nebel, JAIR-01]
LAMA [Richter and Westphal, JAIR-10]
BFWS [Lipovetzky and Geffner, AAAI-17]
The relaxation principle is generic and widely applicable in many different contexts, as are the other relaxation principles identified in this subject.
While the \(h^{max}\) heuristic is not very informative in practice, other lower-bounding approximations of \(h^+\) are very important for optimal planning, such as
Admissible landmarks heuristics [Karpas and Domshlak, IJCAI-09], and
LM-cut heuristic [Helmert and Domshlak, ICAPS-09].
It has always been a challenge to take some delete effects into account; work has been conducted to interpolate smoothly between \(h^+\) and \(h^*\) (see the Semi-Relaxed Plan Heuristics reference below).
Planning as Heuristic Search [Bonet and Geffner, AI-01].
A paper charting the origins of heuristic search in planning.
Available via Institutional Login: https://www.sciencedirect.com/science/article/pii/S0004370201001084
The FF Planning System: Fast Plan Generation Through Heuristic Search [Hoffmann and Nebel, JAIR-01].
The main reference for delete relaxation heuristics.
Available for download at https://fai.cs.uni-saarland.de/hoffmann/papers/jair01.pdf
Semi-Relaxed Plan Heuristics [Keyder, Hoffmann, and Haslum ICAPS-12].
Available at: http://fai.cs.uni-saarland.de/hoffmann/papers/icaps12a.pdf
Computes relaxed plan heuristics within a compiled planning task \(\Pi^C_{\mathit{ce}}\), in which a subset \(C\) of all fact conjunctions in the task is represented explicitly as suggested by [Haslum, ICAPS-12].
\(C\) can in principle always be chosen so that \(h^+_{\Pi^{C}_{\mathit{ce}}}\) is perfect (equals \(h^*\) in the original planning task), so the technique allows interpolation between \(h^+\) and \(h^*\).
In practice, small sets \(C\) sometimes suffice to obtain dramatically more informed relaxed plan heuristics.