
Unit4-Planning

We have concluded the previous lecture with a brief introduction to what a planning problem is,

and we have given a simple example of such a task. In this lecture, we will concentrate more on

planning. There are two main classes of planning, called respectively “planning with certainty”

and “planning under uncertainty”. The first class of planning works under very strong

assumptions for which the agent has access to the full state of the world and has complete

description of effects of actions. Planning under uncertainty, on the other hand, assumes that the

agent has partial knowledge of the state of the world and therefore does not have full knowledge

and certainty on effects of actions when performed. The agent should plan to react to its

environment, and decide what to do depending on its preferences.

In this lecture, we concentrate on planning with certainty. We will, in particular, define the main

components of a planning problem, highlight some of the existing languages used for specifying a

planning problem and introduce different approaches to solve planning problems.

Planning is closely related to temporal reasoning. So, we will then introduce a specific formalism

called Event Calculus for representing and reasoning about events and effects of events over time,

and we will look at a specific type of planner, called Abductive Event Calculus Planner and show

how the same abductive inference that we have covered in the previous lecture can be used to

solve a planning problem.

We will then conclude with some examples.

The content of this lecture is based on the textbooks: “Artificial Intelligence: Foundations of

Computational Agents”, by Poole & Mackworth, and “Artificial Intelligence: A Modern

Approach” by Russell & Norvig.

1

Planning is about how an agent achieves its goals. An agent does not usually achieve its goals in one

step. What the agent does depends on the state it is in, which, in turn, depends on what it has done in

the past. So, to solve a planning problem, an agent needs a representation of its actions and their effects.

It can then use these models to find a sequence of actions that allow it to achieve its goals.

Given a description of the possible initial states of the world, a description of the desired goals, and a

description of a set of possible actions, with their effects, a planning problem consists of synthesizing a

plan that is guaranteed (when applied to any of the initial states) to generate a state that contains the

desired goals. Such a state is called a goal state.

Let’s consider for instance a delivery robot example.

2

We consider here a simplified delivery robot example with four different locations: coffee shop, lab,

mail room and Simon’s office. The robot’s task is to deliver coffee to Simon’s office, collect mail
from the mail room and deliver it to Simon’s office.

Formal representation of a planning problem can be either propositional or first-order, depending on

the type of planning approach that is used. Propositional representations are computationally faster,

but they suffer from an exponential explosion of the state space that the planner has to

consider in order to compute valid plans. In this simplified example, we use a propositional

representation.

Each state is expressed as a Boolean combination of four main features: robot has coffee (rhc),

robot has mail (rhm), mail is waiting (mw) and Simon wants coffee (swc), together with the four

possible locations in which the robot might be: robot is at the coffee shop (cs), at Simon’s office

(off), at the mail room (mr) and in the lab (lab).

The robot can perform six actions: move clockwise (mc), move counterclockwise (mcc), pick up a

coffee (puc), deliver the coffee (dc), pick up mail (pum) and deliver mail (dm).

Not all actions can be carried out in every state. Actions have preconditions that specify when they
can be carried out. They also have post-conditions, or effects, which specify properties that the

resulting states should satisfy.

For example, the robot can pick up coffee if it is at the coffee shop and it does not have the coffee

already. Similarly, it can deliver coffee if it is holding a coffee and it is in Simon’s office, it can pick
up mail if there is mail waiting and the robot is in the mail room, and finally, it can deliver mail
if it holds mail and it is in Simon’s office.
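These preconditions and effects can be written down as a small action table. The following Python encoding is only an illustrative sketch of the four non-movement actions; the feature names follow the slide, but the dictionary format and the "~" convention for negative preconditions are our own.

```python
# Hypothetical encoding of the delivery robot's non-movement actions.
# A state is a set of the features currently true; a precondition prefixed
# with "~" means the feature must be false in the state.
ACTIONS = {
    "puc": {"pre": ["cs", "~rhc"], "add": ["rhc"], "delete": []},         # pick up coffee
    "dc":  {"pre": ["off", "rhc"], "add": [], "delete": ["rhc", "swc"]},  # deliver coffee
    "pum": {"pre": ["mr", "mw"],   "add": ["rhm"], "delete": ["mw"]},     # pick up mail
    "dm":  {"pre": ["off", "rhm"], "add": [], "delete": ["rhm"]},         # deliver mail
}

def applicable(state, action):
    """True when every precondition of the action holds in the state."""
    for p in ACTIONS[action]["pre"]:
        if p.startswith("~"):
            if p[1:] in state:      # feature required to be false is true
                return False
        elif p not in state:        # feature required to be true is missing
            return False
    return True
```

For example, `applicable({"cs"}, "puc")` holds, whereas `applicable({"cs", "rhc"}, "puc")` does not, because the robot already has coffee.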

Given action descriptions in terms of preconditions and effects, the planning state space could be

expressed by triples defining the current state, actions transitions and new states. But such an

explicit state space representation has limitations. In particular, the number of states grows

exponentially with the number of state features, and the addition of new features would require the

reformulation of the entire state space.

3

Two other approaches can be used to specify a planning problem: (i) a feature-centric approach and

(ii) an action-centric approach. We briefly introduce these two methods in this and the next slide,

respectively.

The feature-centric approach consists of defining, for each feature, two types of rules, causal rules

and frame rules.

A causal rule specifies under what conditions the performance of an action gives a feature a new value.

For example, in this slide the feature “robot holds a coffee” becomes true if the robot performs the
action of picking up a coffee.

A frame rule specifies under what conditions a feature maintains its value, i.e. does not change. For

example, in this slide, the feature of “the robot being at the coffee shop” persists if the robot is
currently at the coffee shop and does not move clockwise or counterclockwise.

In a feature-centric approach, specifying a planning problem means specifying, for each feature,

what makes the feature change its value, and what makes it keep its value. So, basically specifying

what changes as well as what does not change.

Note that neither causal rules nor frame rules define when an action can or cannot be
performed. This is still captured by the notion of the precondition of an action.

4

In the action-centric representation of a planning problem, the emphasis is on actions instead of

state-features. The description is more compact and consists of describing, for each action,

precondition and effects. An example of action-centric specification of planning is STRIPS

(Stanford Research Institute Problem Solver), which you have already covered in the first part of

this course. As you have already seen, in this approach, the key assumptions are:

1) features, divided into primitive and derived features. The specification of changes over

derived features is expressed using definite clauses over values of primitive features.

2) primitive features not mentioned in the effects of actions are assumed to remain unchanged.

So only the features that are changed need to be specified in the effects of an action. This is

because the underlying assumption is that most things are left unchanged by a single action.

Therefore, by default, unchanged features are not included in the definition of effects of actions.

So a primitive feature V has value v after an action “act”, if V = v is part of the effect list of the

action act, or if V is not mentioned in the effect list of act and already had value v immediately

before the action act is performed. Derived features can be derived from the values of the primitive

features at each time.
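This persistence rule can be sketched directly: a successor state keeps every primitive feature not mentioned in the action's add or delete list. The action description below is a hypothetical, positive-preconditions-only simplification for illustration.

```python
def strips_apply(state, action):
    """STRIPS successor state: delete-list features are removed, add-list
    features are added, and every other feature persists by default."""
    if not action["pre"] <= state:
        raise ValueError("precondition does not hold in this state")
    return (state - action["delete"]) | action["add"]

# Hypothetical pick-up-coffee action (the negative precondition is omitted
# because this sketch only checks positive preconditions).
puc = {"pre": {"cs"}, "add": {"rhc"}, "delete": set()}
```

For example, `strips_apply({"cs", "swc"}, puc)` returns `{"cs", "swc", "rhc"}`: swc is untouched precisely because it is not mentioned in the effect list.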

We have given, in this slide, two examples of action descriptions in STRIPS.

It is easy to see that STRIPS representations can be mapped into feature-centric representations.

For every feature that appears in the effect list of an action, we can define, in fact, a causal rule in a

corresponding feature-centric representation. Similarly, for every feature not mentioned in the effect

list of an action, we would need to write a frame rule.

5

We have informally introduced the various components of a planning specification. This specification

is used to reason about the state space for possible plans. The search for plan solutions within this

space depends on the type of the particular planning algorithm used and on specific properties of the

planning approach.

In particular, we consider for the rest of this lecture, planning approaches that assume:

• The agent’s actions are deterministic. In this way, the agent knows the effects of actions and

can infer precisely what will be true and/or false after the execution of its actions. Hence, it
knows the consequences of its actions.

• There are no exogenous events that can occur beyond the control of the agent. In this way, if

the agent knows what the current state is and knows the consequences of its actions, it will

know what is true and false in the next state.

• The world is fully observable. This means that the agent can observe the current state of the

world and knows at each time point what is true and false in the current state of the world.

• Regarding time points, we also assume that time progresses discretely. Given a current time

(and therefore a current state), the next time point is the next state in which the agent will find

itself after the execution of its action.

• Finally, goals are predicates of states that must be achieved or maintained.

6

A deterministic plan is a sequence of actions that an agent can perform to achieve a goal from a given

starting state. A deterministic planner is therefore a problem solver that produces a plan. The solver takes

as input an initial world description, a specification of the actions available to the agent, and a goal

description.

One of the simplest ways of computing a plan is to treat the planning problem as path finding in the

state-space graph, where nodes are states, and arcs correspond to actions from one state to another. For

every state in this graph, the outgoing arcs correspond to all the legal actions that can be carried out in that state, that
is, those whose preconditions hold in that state and whose resulting state satisfies every given maintenance

goal. A plan is a path from the initial state to a state that satisfies the achievement goal.

A forward planner searches the state-space graph from the initial state looking for a state that satisfies the

achievement goal.

Search algorithms, such as A∗ with multiple-path pruning or iterative deepening, can be used during the

computation of a plan. The complexity of the search space is defined by the forward branching factor of

the graph. The branching factor is the number of possible actions at any state, which may be quite

large.

When the domain becomes large, the branching factor increases and so the search space explodes. This

complexity can be reduced by finding good heuristics that overcome the combinatorial explosion.

To perform the computation using a forward planner, a node of the state space can be either a
complete state description (as partly shown in the next slide), or the partial plan (or path) from the initial state

to that node. In the latter case what holds at the current state can be derived using the path and the action

descriptions. Each of these two representations has pros and cons. The former may be expensive
in the memory space required, whereas the latter may be computationally expensive in determining, for

instance, loops during the search.
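A minimal forward planner along these lines can be sketched with breadth-first search over full state descriptions, with a visited set for multiple-path pruning. The three-action fragment below is an illustrative toy model, not the full delivery-robot domain.

```python
from collections import deque

def forward_plan(initial, goal, actions):
    """Breadth-first forward search: nodes are complete states (frozensets of
    true features); outgoing arcs are the actions whose preconditions hold."""
    start = frozenset(initial)
    frontier = deque([(start, [])])
    visited = {start}                          # multiple-path pruning
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                      # achievement goal satisfied
            return plan
        for name, act in actions.items():
            if act["pre"] <= state:
                nxt = frozenset((state - act["delete"]) | act["add"])
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None                                # no plan exists

# Illustrative three-action fragment loosely based on the delivery robot.
ACTS = {
    "puc": {"pre": {"cs"}, "add": {"rhc"}, "delete": set()},
    "mc":  {"pre": {"cs"}, "add": {"off"}, "delete": {"cs"}},
    "dc":  {"pre": {"off", "rhc"}, "add": {"done"}, "delete": {"rhc"}},
}
```

With this model, `forward_plan({"cs"}, {"done"}, ACTS)` returns the plan `["puc", "mc", "dc"]`: pick up the coffee, move to the office, deliver it.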

7

This slide shows part of a state-space graph for a forward planner. The graph starts from the given

initial state and shows for each state the actions that can be performed given their respective

preconditions (see slide 3).

8

In forward planning we have described state-space search as a forward search method. But it

is also possible to perform search backward starting from the set of states that satisfy the goal. Whereas

the initial state is usually fully specified and so the search starts off from a single state, the goal does not

usually fully specify a state and so there would be many goal states that satisfy the goal. So searching

from multiple states is not very efficient.

Regression planning is a form of backward search that, however, uses a different search space. The

nodes in the search space are not states, but sub-goals to be achieved, where each sub-goal is still a set of

assignments to (some of) the features. Arcs are still actions. In particular, an arc from a node g to g′,

labeled with an action act, means act is the last action that is carried out before goal g is achieved, and

the node g’ is the goal that must be true immediately before act so that g is true immediately after act.

The start node is the final goal to be achieved, which is also a conjunction of assignments of values to

state-based features. Basically, in regression planning, the graph starts from the goal property that needs

to be achieved and the graph expands considering as next states the sub-goals together with the action

that leads to the goal state. An example is given in this slide.

Suppose the goal is to achieve “not swc”. The start node is [not swc] (indicated in the figure with an
overlined swc). The planner then chooses an action that achieves “not swc”. In this case, there is only

one possible action: dc. The preconditions of dc are off ∧ rhc. Thus, there is one arc in the state space

⟨[not swc], [off , rhc]⟩ labeled with dc. Then the planner considers the node [off , rhc]. There are two

actions that can achieve off, namely mc from cs and mcc from lab. There is one action that can achieve

rhc, namely puc. However, puc has as a precondition cs ∧ not rhc, and if it were applied to get the sub-goal
[off , rhc], it would have required a next node [cs ∧ not rhc ∧ off], which is inconsistent because the

robot cannot be in two different locations at the same time. So the only possible actions are “mc” and

“mcc”, as shown in the slide. And so on, until a node is reached that satisfies a given initial state.
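One regression step can be sketched as computing the weakest sub-goal that must hold before the chosen last action. Negative literals such as ¬swc are encoded here as plain atoms (`"not_swc"`) purely for illustration; a real regression planner would additionally prune inconsistent sub-goals, such as one requiring two locations at once.

```python
def regress(goal, action):
    """Sub-goal that must hold immediately before `action` so that `goal`
    holds immediately after it; None when `action` cannot be the last step."""
    if goal & action["delete"]:
        return None                  # the action would undo part of the goal
    if not goal & action["add"]:
        return None                  # the action does not contribute to the goal
    return (goal - action["add"]) | action["pre"]

# Hypothetical deliver-coffee action whose effect is "Simon no longer wants coffee".
dc = {"pre": {"off", "rhc"}, "add": {"not_swc"}, "delete": set()}
```

Here `regress({"not_swc"}, dc)` returns `{"off", "rhc"}`, which is exactly the arc ⟨[not swc], [off, rhc]⟩ described above.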

9

One of the key aspects of planning is finding a language that is expressive enough to describe a wide

variety of problems, but restrictive enough to allow efficient algorithms to operate over it. In the

previous slides we have used examples written in the STRIPS language. These were mainly expressed in

propositional logic. But more general representations of actions would instead use (typed) predicate

logic.

This slide gives an example of actions specified using an unground representation where the action itself

has variables p, from, to, and pre-conditions and effects are also expressed in terms of these variables.

This representation is called action schema, and all possible instantiations of this action schema

constitute the actual actions. Evaluation of preconditions and effects of actions are then given by

instantiations of the preconditions and effects of the action schemas.
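Grounding an action schema can be sketched as enumerating all type-respecting instantiations of its variables. The schema and object names below are illustrative, not from the slide.

```python
from itertools import product

# Hypothetical move schema move(From, To) over the four robot locations.
places = ["cs", "off", "mr", "lab"]

ground_moves = [
    f"move({frm},{to})"
    for frm, to in product(places, places)
    if frm != to                    # a move needs two distinct locations
]
```

Even this tiny schema yields 12 ground actions (4 × 3 ordered location pairs), illustrating how quickly the set of actual actions grows with the number of objects.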

Different planning formalisms have been proposed in the AI community, and to facilitate benchmarking

between different planning algorithms, these formalisms have now been systematized within a standard

syntax called the Planning Domain Definition Language, or PDDL. PDDL includes sublanguages for
STRIPS and ADL. Since the proposal of the PDDL standard language, an international competition for planning
algorithms has been created and is run every year together with the top conference on automated planning
and scheduling (ICAPS). See for example the ICAPS 2018 competition at
http://www.icaps-conference.org/index.php/Main/Competition

10

In 1986, Kowalski and Sergot proposed Event Calculus, a logical framework for representing and

reasoning about events and effects of events over time. It was initially used as a mechanism for

updating databases. But since then Event Calculus has been used and formalised in many different

ways. We consider here a simple Event Calculus (EC) formalisation.

The EC language is based on an ontology consisting of (i) a set of time-points, which is isomorphic to
the non-negative integers, (ii) a set of time-varying properties, called fluents, and (iii) a set of events. The

EC language includes the sorted predicates given in this slide, where e is an event variable, f is a fluent

variable and t is a time variable.

The predicate happens(e, t) indicates that event e actually occurs at time-point t; the predicate

initiates(e,f,t) (resp. terminates(e,f,t)) means that if event e were to occur at t it would cause fluent f to

be true (resp. false) immediately afterwards. The predicate holdsAt(f,t) indicates that fluent f is true at t.

Fluents could be seen as state-based features that become true or false over time depending on the

occurrence of events and effects that events have on them (i.e. whether they initiate or terminate them).

Events instead are assumed to be atomic, in the sense that their occurrence is instantaneous. Events do

not have a duration.

11

Every Event Calculus description includes two main theories.

(1) a core collection of domain-independent axioms (or rules) that describe general principles for

deciding when fluents hold or do not hold at particular time-points.

(2) a collection of domain-dependent axioms, describing the particular effects of events, using the

predicates initiates(.) and terminates(.). The domain dependent theory includes also axioms stating

the occurrence of specific events at specific time points, using the predicate happens(.), often

referred to as a narrative, and axioms about what is true and false at the initial state, using the

predicate initially(.).

12

We give in this slide the domain-independent axiomatisation of Event Calculus. This is a simplified

version. More complex axiomatisations have been proposed, but the general principles formalised

here are at the core of many axiomatisations of Event Calculus.

The axioms apply to any planning problem domain as they describe general principles of effects of

events and persistency of state-based properties.

The predicate clipped(.) defined in the third axiom is an auxiliary predicate that expresses the

occurrence of an event between two time points that terminates a fluent.

The main characteristic of this axiomatization is that it does not explicitly define when fluents are

false. The “holdsAt” predicate refers only to fluents being true at a particular time point. This is

because the above axiomatization makes a closed world assumption. Any fluent that cannot be
proved to be true at a given time point is assumed to be false. This closed world assumption plays an

important role in the definition of this simplified EC axiomatisation. In general, a more elaborate

axiomatisation would be needed that expresses when a fluent is defined to be true and also when it

is defined to be false. The falsity of a fluent can be defined with an axiom that is symmetric to

axiom (ii), and that uses instead a second auxiliary predicate called “declipped”.
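The domain-independent axioms, together with the closed world assumption, can be sketched as a small interpreter. This is an illustrative encoding only: the narrative is a list of (event, time) facts, effect rules are taken as time-independent for brevity, and including an event occurring exactly at t1 in clipped is just one of several possible conventions.

```python
def clipped(t1, f, t2, narrative, terminates):
    """Axiom (iii): some event occurring in [t1, t2) terminates fluent f."""
    return any(t1 <= t < t2 and terminates(e, f) for e, t in narrative)

def holds_at(f, t, initially, narrative, initiates, terminates):
    """Axioms (i) and (ii) under the closed world assumption: a fluent holds
    at t if it held initially, or some earlier event initiated it, and it has
    not been clipped since; anything not provable to hold is taken as false."""
    if f in initially and not clipped(0, f, t, narrative, terminates):
        return True
    return any(
        t1 < t and initiates(e, f) and not clipped(t1, f, t, narrative, terminates)
        for e, t1 in narrative
    )
```

Note that the strict inequality t1 < t reflects the reading that effects hold immediately *after* the initiating event, not at the event's own time point.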

Below is an example of the additional axioms that would be needed to define the falsity of fluents:

holdsAt(neg(f),t2) ← happens(e,t1), terminates(e,f,t1), t1 < t2, not declipped(t1,f,t2)
declipped(t1,f,t) ← happens(e,t2), initiates(e,f,t2), t1 < t2, t2 < t

13

The second component of an EC formalisation of a planning problem description is the domain-dependent axiomatization. This normally includes:

1) a set of rules that specifies the effect of individual actions on state-based fluents. Effects of actions may depend on the current state as well as on the occurrence of other actions. One possible assumption that we can make is that only one action can occur at a given time. In this case, effects of actions may simply depend on what is true or false at the current time point when the action is applied. The effect of actions is captured by the definition of the initiates and terminates predicates. The body of these rules can therefore include literals with the predicate “holdsAt”.

2) a definition of the initial state. This is a set of facts fully defining the fluents that are initially true. Fluents that are false at the beginning do not need to be specified, since, by the closed world assumption, what is not true is implicitly assumed to be false.

3) integrity constraints on the occurrence of actions may also be included in the domain-dependent axiomatisation.

14

Given an Event Calculus formalisation of a planning problem, and a goal, it is possible to define the notion of a plan in terms of ground literals of events that lead to the goal, and temporal constraints that these events have to satisfy. The EC formalisation may also include integrity constraints on the occurrence of events. In this case, the set of events that constitute a plan, together with the EC theories, will have to be consistent with the given integrity constraints. It is easy to see that the conditions that a plan has to satisfy match the conditions of an abductive solution, where the knowledge base is given by the two theories of the domain-independent and domain-dependent EC axiomatisation.

An EC planning problem can therefore be reduced to an abductive task where the set of abducibles is given by the set A = {happens(.), <(.)}. The “<” predicates are over time variables.

15

We consider here a simple shopping trip problem where an agent has to plan a trip to go to different places in order to buy different items. This example is extracted from the Russell & Norvig textbook, where it is instead expressed using the STRIPS language. We show here how we can instead formulate it as an Event Calculus Abductive Planning problem and solve it using a general-purpose abductive proof procedure. Three main components need to be defined:

1) Domain-Independent Event Calculus theory. The axioms given in slide 13 are represented here. We will consider this simplified version of the DI axiomatisation of Event Calculus in our course. Note that the axiomatization given in this slide includes a predicate “time”, to define the sort of type time, and the predicate “<”. This is a “relational operator” for expressing constraints over finite domains. The variables are in this case of type integer, and constraints over integer domains can be expressed using the following relational operators: =; =/=; <; =<; >=; >.

16

The second component includes axioms about effects of actions over fluents. We assume that the

problem includes two fluents “at(place)” and “have(item)”, and two actions “goto(place)”,

“buy(item, place)”.

Domain Dependent Event Calculus theory given in this slide defines the effects of these two actions

on the two fluents. Specifically, going to a place initiates the agent being at that place, and terminates

the agents begin at another place (first two axioms). Also, buying an item at a place initiates the agent

having that item as long as the place sells the item and the agent is at that place.

You can easily see how the above axiomatisation corresponds to a STRIPS representation:

goto(x, y)

Precond: at(x)

Effect: ¬at(x) ∧ at(y)

buy(x, store)

Precond: at(store) ∧ sells(store, x)

Effect: have(x)

where the preconditions are captured in the conditions in the “initiates” axioms and the effects are

expressed by the head atoms of the “initiates” and “terminates” rules.

The axiomatisation given in this slide also includes a finite domain membership constraint of the form

X in Min..Max, that specifies the range of the type “time” used in the specification.
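The effect axioms for this domain can also be sketched as Python predicates. The store names (hws, sm) follow the example; the tuple encoding of events and fluents and the SELLS table are our own illustrative choices.

```python
# Hypothetical encoding of the domain-dependent effect axioms for the
# shopping example. SELLS is static background knowledge; `state` is the
# set of fluents true at the time the event occurs.
SELLS = {("hws", "drill"), ("sm", "banana")}   # illustrative store stock

def initiates(event, fluent, state):
    kind, *args = event
    if kind == "goto":
        # Going to a place initiates being at that place.
        return fluent == ("at", args[0])
    if kind == "buy":
        # Buying initiates having the item, provided the place sells it
        # and the agent is at that place when the event occurs.
        item, place = args
        return (fluent == ("have", item)
                and (place, item) in SELLS
                and ("at", place) in state)
    return False

def terminates(event, fluent, state):
    kind, *args = event
    # Going somewhere terminates being at any other place.
    return kind == "goto" and fluent[0] == "at" and fluent != ("at", args[0])
```

The conditions in the `buy` branch play the role of the STRIPS preconditions at(store) ∧ sells(store, x), while the returned fluents correspond to the heads of the “initiates” and “terminates” rules.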

17

The third component includes domain-specific integrity constraints. These are effective when they

include abducibles, i.e. they express constraints on the possible events that can be assumed to happen

as part of a plan. In our example, we have only one general integrity constraint, expressing the fact

that only one event can happen at each time.

Abducibles are instances of the predicate “happens”, and this predicate is included in the given

integrity constraint.

The full Event Calculus Abductive Planning task is then defined by the tuple of components
defined so far, and a given goal. An example of a goal is given in this slide.

An abductive solution would in this case be a sequence of events that takes the agent from the

initial state to a state at time point 5 where the agent has both the banana and the drill, as required by

the given goal.

18

The example of EC Abductive Planning that we have considered in the previous slides uses ground goals

and plans are computed so that specific goal states are reached. In real-world applications it is not easy to
predict the specific time points at which given goal states are satisfied. A more natural plan query would

be to ask if there is a plan that would take at some time the agent to a state satisfying certain properties.

This type of goal is normally a conjunction of literals with unground time variables. To compute plans, the

abductive procedure would have to reason about unground observations.

In the previous lecture we have considered ONLY the abductive proof procedure for ground goals. But

more general abductive procedures exist that take goals which are unground. These more general

algorithms are outside the scope of this course and therefore not examinable, but a brief summary is given

here to allow you to use the abductive system that is available in CATE to solve planning problems. This is

because the abductive procedure in CATE is a more general and powerful abductive engine.

When unground goals are considered in a given EC Abductive Planning problem, the reasoning about
time variables has to keep track of the ordering relations between (skolemised) time variables. The more
general abductive proof procedure also generates dynamic constraints to keep track of the consistency among unground

(negated) literals. The output of the abductive procedure in this case is a tuple:

(As, Cs, Ns), where As is the list of abduced events; Cs is the list of arithmetic constraints over the time

variables that appear in the abduced events and Ns is the list of dynamically generated constraints, called

dynamic denials. These are of the form: fail(ListOfUniversalVars, ListOfGoals). For example the denial

constraint fail([X], [p(X, Y), q(X), r(Y)]), should be read as

∃Y ∀X. (⊥← p(X, Y ), q(X), r(Y)). All variables in the list of variables are assumed to be universally

quantified and any other variable that appears in the constraint literals is existentially quantified.

Consider, for instance, the same planning problem given in the previous slide, and let’s

consider the goal “G = holds(have(banana), T) ∧ holds(have(drill), T).”, which means asking for a plan

that would take the agent to a state where it has bought a banana and a drill. A plan solution would be

“happens(goto(hws),_A),happens(buy(drill,hws),_B),happens(goto(sm),_C),happens(buy(banana,sm),_D)”

with arithmetic constraints including Cs = [_D#