# CS代考计算机代写 algorithm flex deep learning Bayesian network data structure Bayesian decision tree AI Hidden Markov Mode chain 1

1
INTRODUCTION
CHAPTER

CHAPTER 2 INTELLIGENT AGENTS
function TABLE-DRIVEN-AGENT(percept) returns an action persistent: percepts, a sequence, initially empty
table, a table of actions, indexed by percept sequences, initially fully specified
append percept to the end of percepts action ←LOOKUP(percepts,table) return action
Figure 2.7 The TABLE-DRIVEN-AGENT program is invoked for each new percept and re- turns an action each time. It retains the complete percept sequence in memory.
function REFLEX-VACUUM-AGENT([location,status]) returns an action
if status = Dirty then return Suck else if location = A then return Right else if location = B then return Left
Figure2.8 Theagentprogramforasimplereflexagentinthetwo-locationvacuumenviron- ment. This program implements the agent function tabulated in Figure ??.
function SIMPLE-REFLEX-AGENT(percept) returns an action persistent: rules, a set of condition–action rules
state ← INTERPRET-INPUT(percept) rule ← RULE-MATCH(state, rules) action ← rule.ACTION
return action
Figure 2.10 A simple reflex agent. It acts according to a rule whose condition matches the current state, as defined by the percept.

function MODEL-BASED-REFLEX-AGENT(percept) returns an action persistent: state, the agent’s current conception of the world state
transition model , a description of how the next state depends on the current state and action
sensor model , a description of how the current world state is reflected in the agent’s percepts
rules, a set of condition–action rules action, the most recent action, initially none
state←UPDATE-STATE(state,action,percept,transition model,sensor model) rule ← RULE-MATCH(state, rules)
action ← rule.ACTION
return action
Figure 2.12 A model-based reflex agent. It keeps track of the current state of the world, using an internal model. It then chooses an action in the same way as the reflex agent.
3

CHAPTER 3
SOLVING PROBLEMS BY SEARCHING
function BEST-FIRST-SEARCH(problem, f ) returns a solution node or failure
node ← NODE(STATE=problem.INITIAL)
frontier ← a priority queue ordered by f , with node as an element
reached ← a lookup table, with one entry with key problem.INITIAL and value node while not IS-EMPTY(frontier) do
node ← POP(frontier)
if problem.IS-GOAL(node.STATE) then return node for each child in EXPAND(problem, node) do
s ←child.STATE
if s is not in reached or child.PATH-COST < reached[s].PATH-COST then reached[s]←child add child to frontier return failure function EXPAND(problem,node) yields nodes s ← node.STATE for each action in problem.ACTIONS(s) do s′ ← problem.RESULT(s, action) cost ←node.PATH-COST + problem.ACTION-COST(s,action,s′) yield NODE(STATE=s′, PARENT=node, ACTION=action, PATH-COST=cost) Figure 3.7 The best-first search algorithm, and the function for expanding a node. The data structures used here are described in Section ??. See Appendix B for yield. 5 function BREADTH-FIRST-SEARCH(problem) returns a solution node or failure node ← NODE(problem.INITIAL) if problem.IS-GOAL(node.STATE) then return node frontier ← a FIFO queue, with node as an element reached ← {problem.INITIAL} while not IS-EMPTY(frontier) do node ← POP(frontier) for each child in EXPAND(problem, node) do s ←child.STATE if problem.IS-GOAL(s) then return child if s is not in reached then add s to reached add child to frontier return failure function UNIFORM-COST-SEARCH(problem) returns a solution node, or failure return BEST-FIRST-SEARCH(problem, PATH-COST) Figure 3.9 Breadth-first search and uniform-cost search algorithms. function ITERATIVE-DEEPENING-SEARCH(problem) returns a solution node or failure fordepth =0to∞do result ←DEPTH-LIMITED-SEARCH(problem,depth) if result ̸= cutoff then return result function DEPTH-LIMITED-SEARCH(problem, l) returns a node or failure or cutoff frontier ← a LIFO queue (stack) with NODE(problem.INITIAL) as an element result ← failure while not IS-EMPTY(frontier) do node ← POP(frontier) if problem.IS-GOAL(node.STATE) then return node if DEPTH(node) > l then
result ← cutoff
else if not IS-CYCLE(node) do
for each child in EXPAND(problem, node) do add child to frontier
return result
Figure 3.12 Iterative deepening and depth-limited tree-like search. Iterative deepening re- peatedly applies depth-limited search with increasing limits. It returns one of three different types of values: either a solution node; or failure, when it has exhausted all nodes and proved there is no solution at any depth; or cutoff , to mean there might be a solution at a deeper depth than l. This is a tree-like search algorithm that does not keep track of reached states, and thus uses much less memory than best-first search, but runs the risk of visiting the same state multiple times on different paths. Also, if the IS-CYCLE check does not check all cycles, then the algorithm may get caught in a loop.

6 Chapter 3 Solving Problems by Searching
function BIBF-SEARCH(problemF , fF , problemB, fB) returns a solution node, or failure nodeF ← NODE(problemF .INITIAL) // Node for a start state
nodeB ←NODE(problemB.INITIAL) //Node for a goal state frontierF ←apriorityqueueorderedbyfF,withnodeF asanelement
frontier B ← a priority queue ordered by fB , with node B as an element reachedF ← a lookup table, with one key nodeF .STATE and value nodeF reachedB ←a lookup table, with one key nodeB.STATE and value nodeB solution ← failure
while not TERMINATED(solution, frontierF , frontierB) do if fF (TOP(frontierF )) < fB(TOP(frontierB)) then solution←PROCEED(F,problemF frontierF,reachedF,reachedB,solution) else solution ← PROCEED(B, problemB, frontierB, reachedB, reachedF , solution) return solution function PROCEED(dir, problem, frontier, reached, reached2, solution) returns a solution // Expand node on frontier; check against the other frontier in reached 2 . // The variable “dir” is the direction: either F for forward or B for backward. node ← POP(frontier) for each child in EXPAND(problem, node) do s ←child.STATE if s not in reached or PATH-COST(child) < PATH-COST(reached[s]) then reached[s]←child add child to frontier if s is in reached2 then solution2 ← JOIN-NODES(dir, child, reached2[s])) if PATH-COST(solution2) < PATH-COST(solution) then solution ← solution2 return solution Figure 3.14 Bidirectional best-first search keeps two frontiers and two tables of reached states. When a path in one frontier reaches a state that was also reached in the other half of the search, the two paths are joined (by the function JOIN-NODES) to form a solution. The first solution we get is not guaranteed to be the best; the function TERMINATED determines when to stop looking for new solutions. 7 function RECURSIVE-BEST-FIRST-SEARCH(problem) returns a solution or failure solution, fvalue ← RBFS(problem, NODE(problem.INITIAL), ∞) return solution function RBFS(problem,node,f limit) returns a solution or failure, and a new f-cost limit if problem.IS-GOAL(node.STATE) then return node successors ← LIST(EXPAND(node)) if successors is empty then return failure, ∞ for each s in successors do // update f with value from previous search s.f ← max(s.PATH-COST + h(s), node.f )) while true do best ← the node in successors with lowest f -value if best.f > f limit then return failure, best.f
alternative ← the second-lowest f -value among successors result,best.f ←RBFS(problem,best,min(f limit,alternative)) if result ̸= failure then return result, best.f
Figure 3.22 The algorithm for recursive best-first search.

CHAPTER 4 SEARCH IN COMPLEX
ENVIRONMENTS
function HILL-CLIMBING(problem) returns a state that is a local maximum current ← problem.INITIAL
whiletrue do
neighbor ← a highest-valued successor state of current
if VALUE(neighbor) ≤ VALUE(current) then return current current ← neighbor
Figure 4.2 The hill-climbing search algorithm, which is the most basic local search tech- nique. At each step the current node is replaced by the best neighbor.
function SIMULATED-ANNEALING(problem,schedule) returns a solution state current ← problem.INITIAL
fort =1to∞do
T ←schedule(t)
if T = 0 then return current
next ← a randomly selected successor of current ∆E ← VALUE(current) – VALUE(next)
if ∆E > 0 then current ← next
else current ← next only with probability e−∆E/T
Figure 4.4 The simulated annealing algorithm, a version of stochastic hill climbing where some downhill moves are allowed. The schedule input determines the value of the “temper- ature” T as a function of time.

9
function GENETIC-ALGORITHM(population,fitness) returns an individual repeat
weights ← WEIGHTED-BY(population, fitness) population2 ← empty list
for i = 1 to SIZE(population) do
parent1, parent2 ←WEIGHTED-RANDOM-CHOICES(population,weights,2) child ← REPRODUCE(parent1 , parent2 )
if (small random probability) then child ← MUTATE(child)
population ← population2
until some individual is fit enough, or enough time has elapsed return the best individual in population, according to fitness
function REPRODUCE(parent1 , parent2 ) returns an individual
n ← LENGTH(parent1)
c ← random number from 1 to n
return APPEND(SUBSTRING(parent1,1,c),SUBSTRING(parent2,c +1,n))
Figure 4.7 A genetic algorithm. Within the function, population is an ordered list of indi- viduals, weights is a list of corresponding fitness values for each individual, and fitness is a function to compute these values.
function AND-OR-SEARCH(problem) returns a conditional plan, or failure return OR-SEARCH(problem,problem.INITIAL,[])
function OR-SEARCH(problem, state, path) returns a conditional plan, or failure if problem.IS-GOAL(state) then return the empty plan
if IS-CYCLE(path) then return failure
for each action in problem.ACTIONS(state) do
plan←AND-SEARCH(problem,RESULTS(state,action),[state] + path])
if plan ̸= failure then return [action] + plan] return failure
function AND-SEARCH(problem, states, path) returns a conditional plan, or failure for each si in states do
plani ← OR-SEARCH(problem, si, path)
if plani = failure then return failure
return [if s1 then plan1 else if s2 then plan2 else . . . if sn−1 then plann−1 else plann]
Figure 4.10 An algorithm for searching AND–OR graphs generated by nondeterministic en- vironments. A solution is a conditional plan that considers every nondeterministic outcome and makes a plan for each one.

10 Chapter 4 Search in Complex Environments
function ONLINE-DFS-AGENT(problem, s′) returns an action s, a, the previous state and action, initially null
persistent: result, a table mapping (s, a) to s′, initially empty
untried, a table mapping s to a list of untried actions
unbacktracked , a table mapping s to a list of states never backtracked to
if problem.IS-GOAL(s′) then return stop
if s′ is a new state (not in untried) then untried[s′]←problem.ACTIONS(s′) if s is not null then
result[s,a]←s′
add s to the front of unbacktracked[s′] if untried[s′] is empty then
if unbacktracked[s′] is empty then return stop
else a ←an action b such that result[s′,b] = POP(unbacktracked[s′]) else a ← POP(untried[s′])
s ←s′
return a
Figure 4.20 An online search agent that uses depth-first exploration. The agent can safely explore only in state spaces in which every action can be “undone” by some other action.
function LRTA*-AGENT(problem, s′, h) returns an action s, a, the previous state and action, initially null
persistent: result, a table mapping (s, a) to s′, initially empty
H , a table mapping s to a cost estimate, initially empty
if IS-GOAL(s′) then return stop
if s′ is a new state (not in H) then H[s′]←h(s′) if s is not null then
result[s,a]←s′
H [s] ← min LRTA*-COST(s, b, result[s, b], H )
b∈ACTIONS(s)
a ← argmin LRTA*-COST(problem, s′,b,result[s′,b],H)
b∈ACTIONS(s) s ←s′
return a
function LRTA*-COST(problem,s,a,s′,H) returns a cost estimate if s′ is undefined then return h(s)
else return problem.ACTION-COST(s, a, s′) + H[s′]
Figure 4.23 LRTA∗-AGENT selects an action according to the values of neighboring states, which are updated as the agent moves about the state space.

CHAPTER 5
function MINIMAX-SEARCH(game, state) returns an action player ← game.TO-MOVE(state)
value, move ← MAX-VALUE(game, state)
return move
function MAX-VALUE(game, state) returns a (utility, move) pair
if game.IS-TERMINAL(state) then return game.UTILITY(state, player), null v ← −∞
for each a in game.ACTIONS(state) do
v2 , a2 ← MIN-VALUE(game, game.RESULT(state, a)) ifv2 >v then
v , move ← v2 , a return v, move
function MIN-VALUE(game, state) returns a (utility, move) pair
if game.IS-TERMINAL(state) then return game.UTILITY(state, player), null v ← +∞
for each a in game.ACTIONS(state) do
v2 , a2 ← MAX-VALUE(game, game.RESULT(state, a)) ifv2 v then
v, move ←v2, a
α←MAX(α, v)
ifv ≥ βthenreturnv,move
return v, move
function MIN-VALUE(game, state,α,β) returns a (utility, move) pair
if game.IS-TERMINAL(state) then return game.UTILITY(state, player), null v ← +∞
for each a in game.ACTIONS(state) do
f ← FORWARD(f, et−d)
remove et−d−1 from the beginning of et−d:t
Ot−d ← diagonal matrix containing P(et−d | Xt−d )
B ← O−1 T−1BTOt t−d
else B ← BTOt
t←t+1
if t > d + 1 then return NORMALIZE(f × B1) else return null
Figure 14.6 An algorithm for smoothing with a fixed time lag of d steps, implemented as an online algorithm that outputs the new smoothed estimate given the observation for a new time step. Notice that the final output NORMALIZE(f × B1) is just α f × b, by Equation (??).
function PARTICLE-FILTERING(e, N , dbn) returns a set of samples for the next time step inputs: e, the new incoming evidence
N , the number of samples to be maintained
dbn, a DBN defined by P(X0), P(X1 | X0), and P(E1 | X1) persistent: S, a vector of samples of size N, initially generated from P(X0) local variables: W , a vector of weights of size N
37
for i = 1 to N do
S[i]←sample from P(X1 |X0 = S[i]) //step 1 W[i]←P(e|X1 = S[i]) //step 2
S ← WEIGHTED-SAMPLE-WITH-REPLACEMENT(N , S , W ) return S
// step 3
Figure 14.17 The particle filtering algorithm implemented as a recursive update oper- ation with state (the set of samples). Each of the sampling operations involves sam- pling the relevant slice variables in topological order, much as in PRIOR-SAMPLE. The WEIGHTED-SAMPLE-WITH-REPLACEMENT operation can be implemented to run in O(N) expected time. The step numbers refer to the description in the text.

15
PROBABILISTIC PROGRAMMING
CHAPTER
type Researcher, Paper, Citation random String Name(Researcher) random String Title(Paper)
random Paper PubCited(Citation) random String Text(Citation)
random Boolean Professor(Researcher) origin Researcher Author(Paper)
#Researcher ∼ OM(3,1)
Name(r) ∼ NamePrior()
Professor(r) ∼ Boolean(0.2)
#Paper (Author = r) ∼ if Professor (r) then OM (1.5, 0.5) else OM (1, 0.5) Title(p) ∼ PaperTitlePrior()
CitedPaper(c) ∼ UniformChoice({Paper p})
Text(c) ∼ HMMGrammar(Name(Author(CitedPaper(c))),Title(CitedPaper(c)))
Figure15.5 AnOUPMforcitationinformationextraction.Forsimplicitythemodelassumes one author per paper and omits details of the grammar and error models.

39
#SeismicEvents ∼ Poisson(T ∗ λe)
Time(e) ∼ UniformReal(0,T)
EarthQuake(e) ∼ Boolean(0.999)
Location(e) ∼ if Earthquake(e) then SpatialPrior() else UniformEarth() Depth (e) ∼ if Earthquake (e) then UniformReal (0, 700) else Exactly (0) Magnitude(e) ∼ Exponential(log(10))
Detected(e,p,s) ∼ Logistic(weights(s,p),Magnitude(e), Depth(e), Dist(e,s)) #Detections(site = s) ∼ Poisson(T ∗ λf (s))
#Detections(event=e, phase=p, station=s) = ifDetected(e,p,s)then1else0 OnsetTime(a,s) if (event(a) = null) then ∼ UniformReal(0,T)
else=Time(event(a)) + GeoTT(Dist(event(a),s),Depth(event(a)),phase(a)) + Laplace(μt(s), σt(s))
Amplitude(a,s) if (event(a) = null) then ∼ NoiseAmpModel(s)
else = AmpModel (Magnitude (event (a)), Dist (event (a), s), Depth (event (a)), phase (a))
Azimuth(a, s) if (event(a) = null) then ∼ UniformReal(0, 360)
else = GeoAzimuth(Location(event(a)),Depth(event(a)),phase(a),Site(s))
+ Laplace (0, σa (s)) Slowness(a,s)if(event(a)=null)then ∼ UniformReal(0,20)
else = GeoSlowness(Location(event(a)),Depth(event(a)),phase(a),Site(s)) + Laplace(0,σs(s))
ObservedPhase(a,s) ∼ CategoricalPhaseModel(phase(a)) Figure 15.6 A simplified version of the NET-VISA model (see text).
#Aircraft(EntryTime =t) ∼ Poisson(λa)
Exits(a,t) ∼ if InFlight(a,t) then Boolean(αe)
InFlight(a,t) = (t=EntryTime(a)) ∨ (InFlight(a,t−1) ∧ ¬Exits(a,t−1)) X(a,t) ∼ ift = EntryTime(a)thenInitX()
else if InFlight (a, t) then N (F X (a, t − 1), Σx )
#Blip(Source=a, Time=t) ∼ if InFlight(a,t) then Bernoulli(DetectionProb(X(a,t))) #Blip(Time=t) ∼ Poisson(λf )
Z(b) ∼ if Source(b)=null then UniformZ(R) else N(HX(Source(b),Time(b)),Σz)
Figure 15.9 An OUPM for radar tracking of multiple targets with false alarms, detection failure, and entry and exit of aircraft. The rate at which new aircraft enter the scene is λa, while the probability per time step that an aircraft exits the scene is αe. False alarm blips (i.e., ones not produced by an aircraft) appear uniformly in space at a rate of λf per time step. The probability that an aircraft is detected (i.e., produces a blip) depends on its current position.

40 Chapter 15 Probabilistic Programming
function GENERATE-IMAGE() returns an image with some letters letters ← GENERATE-LETTERS(10)
return RENDER-NOISY-IMAGE(letters, 32, 128)
function GENERATE-LETTERS(λ) returns a vector of letters n ∼ Poisson(λ)
letters ← [ ]
for i = 1 to n do
letters[i] ∼ UniformChoice({a,b,c,···}) return letters
function RENDER-NOISY-IMAGE(letters,width,height) returns a noisy image of the letters clean image←RENDER(letters,width,height,text top=10,text left=10)
noisy image←[]
noise variance ∼ UniformReal(0.1, 1)
for row = 1 to width do for col = 1 to height do
noisy image[row,col] ∼ N(clean image[row,col],noise variance) return noisy image
Figure15.11 Generativeprogramforanopen-universeprobabilitymodelforopticalcharac- ter recognition. The generative program produces degraded images containing sequences of letters by generating each sequence, rendering it into a 2D image, and incorporating additive noise at each pixel.
function GENERATE-MARKOV-LETTERS(λ) returns a vector of letters n ∼ Poisson(λ)
letters ← [ ]
letter probs ← MARKOV-INITIAL()
for i = 1 to n do
letters[i] ∼ Categorical(letter probs)
letter probs ← MARKOV-TRANSITION(letters[i])
return letters
Figure 15.15 Generative program for an improved optical character recognition model that generates letters according to a letter bigram model whose pairwise letter frequencies are estimated from a list of English words.

16
MAKING SIMPLE DECISIONS
function INFORMATION-GATHERING-AGENT(percept) returns an action persistent: D, a decision network
integrate percept into D
j ← the value that maximizes VPI (Ej ) / C (Ej ) if VPI(Ej) > C(Ej)
then return Request(Ej) else return the best action from D
Figure 16.9 Design of a simple, myopic information-gathering agent. The agent works by repeatedly selecting the observation with the highest information value, until the cost of the next observation is greater than its expected benefit.
CHAPTER

17
MAKING COMPLEX DECISIONS
function VALUE-ITERATION(mdp, ǫ) returns a utility function
inputs: mdp , an MDP with states S , actions A(s), transition model P (s′ | s, a),
rewards R(s, a, s′), discount γ
ǫ, the maximum error allowed in the utility of any state
local variables: U , U ′, vectors of utilities for states in S , initially zero
δ, the maximum relative change in the utility of any state
repeat
U ← U ′; δ ← 0
for each state s in S do
U ′[s] ← maxa ∈ A(s) Q-VALUE(mdp, s, a, U )
if|U′[s] − U[s]| > δthenδ←|U′[s] − U[s]| untilδ ≤ ǫ(1−γ)/γ
return U
Figure 17.6 The value iteration algorithm for calculating utilities of states. The termination
condition is from Equation (??).
function POLICY-ITERATION(mdp) returns a policy
inputs: mdp , an MDP with states S , actions A(s), transition model P (s′ | s, a) local variables: U , a vector of utilities for states in S , initially zero
π, a policy vector indexed by state, initially random
repeat
U ← POLICY-EVALUATION(π, U , mdp) unchanged ? ← true
for each state s in S do
a∗ ← argmax Q-VALUE(mdp, s, a, U ) a∈A(s)
ifQ-VALUE(mdp,s,a∗,U) > Q-VALUE(mdp,s,π[s],U)then π[s]←a∗; unchanged?←false
until unchanged? return π
Figure 17.9 The policy iteration algorithm for calculating an optimal policy.
CHAPTER

function POMDP-VALUE-ITERATION(pomdp, ǫ) returns a utility function
inputs: pomdp , a POMDP with states S , actions A(s), transition model P (s′ | s, a),
sensor model P (e | s), rewards R(s), discount γ
ǫ, the maximum error allowed in the utility of any state
local variables: U , U ′, sets of plans p with associated utility vectors αp
U ′ ← a set containing just the empty plan [ ], with α[ ](s) = R(s) repeat
U ←U′
U ′ ← the set of all plans consisting of an action and, for each possible next percept,
a plan in U with utility vectors computed according to Equation (??) U ′ ← REMOVE-DOMINATED-PLANS(U ′)
untilMAX-DIFFERENCE(U,U′) ≤ ǫ(1−γ)/γ return U
Figure 17.16 A high-level sketch of the value iteration algorithm for POMDPs. The REMOVE-DOMINATED-PLANS step and MAX-DIFFERENCE test are typically implemented as linear programs.
43

18
MULTIAGENT DECISION MAKING
Actors(A,B)
Init(At(A,LeftBaseline) ∧ At(B,RightNet) ∧
Approaching(Ball,RightBaseline) ∧ Partner(A,B) ∧ Partner(B,A) Goal(Returned(Ball) ∧ (At(x,RightNet) ∨ At(x,LeftNet)) Action(Hit(actor,Ball),
PRECOND:Approaching(Ball,loc) ∧ At(actor,loc)
EFFECT:Returned(Ball)) Action(Go(actor,to),
PRECOND:At(actor,loc) ∧ to ̸= loc, EFFECT:At(actor,to) ∧ ¬ At(actor,loc))
Figure 18.1 The doubles tennis problem. Two actors, A and B, are playing together and can be in one of four locations: LeftBaseline, RightBaseline, LeftNet, and RightNet. The ball can be returned only if a player is in the right place. The NoOp action is a dummy, which has no effect. Note that each action must include the actor as an argument.
CHAPTER

19
LEARNING FROM EXAMPLES
function LEARN-DECISION-TREE(examples, attributes, parent examples) returns a tree
if examples is empty then return PLURALITY-VALUE(parent examples) else if all examples have the same classification then return the classification else if attributes is empty then return PLURALITY-VALUE(examples)
else
A ← argmaxa ∈ attributes IMPORTANCE(a, examples) tree ← a new decision tree with root test A
for each value v of A do
exs ←{e : e∈examples and e.A = v}
subtree ← LEARN-DECISION-TREE(exs, attributes − A, examples) add a branch to tree with label (A = v ) and subtree subtree
return tree
Figure19.5 Thedecisiontreelearningalgorithm.ThefunctionIMPORTANCEisdescribedin Section ??. The function PLURALITY-VALUE selects the most common output value among a set of examples, breaking ties randomly.
CHAPTER

46 Chapter 19 Learning from Examples
function MODEL-SELECTION(Learner,examples,k) returns a (hypothesis, error rate) pair
err ← an array, indexed by size, storing validation-set error rates training set , test set ← a partition of examples into two sets for size = 1 to ∞ do
err[size]←CROSS-VALIDATION(Learner,size,training set,k) if err is starting to increase significantly then
best size←the value of size with minimum err[size] h←Learner(best size,training set)
return h, ERROR-RATE(h, test set)
function CROSS-VALIDATION(Learner,size,examples,k) returns error rate
N ← the number of examples errs ← 0
for i = 1 to k do
validation set←examples[(i − 1) × N/k:i × N/k] training set ← examples − validation set h←Learner(size,training set)
errs ←errs + ERROR-RATE(h,validation set)
return errs / k // average error rate on validation sets, across k-fold cross-validation
Figure 19.8 An algorithm to select the model that has the lowest validation error. It builds models of increasing complexity, and choosing the one with best empirical error rate, err, on the validation data set. Learner(size,examples) returns a hypothesis whose complexity is set by the parameter size, and which is trained on examples. In CROSS-VALIDATION, each iteration of the for loop selects a different slice of the examples as the validation set, and keeps the other examples as the training set. It then returns the average validation set error over all the folds. Once we have determined which value of the size parameter is best, MODEL-SELECTION returns the model (i.e., learner/hypothesis) of that size, trained on all the training examples, along with its error rate on the held-out test examples.
function DECISION-LIST-LEARNING(examples) returns a decision list, or failure
if examples is empty then return the trivial decision list No
t ← a test that matches a nonempty subset examples t of examples
such that the members of examplest are all positive or all negative
if there is no such t then return failure
if the examples in examples t are positive then o ← Yes else o ← No
return a decision list with initial test t and outcome o and remaining tests given by
DECISION-LIST-LEARNING(examples − examplest) Figure 19.11 An algorithm for learning decision lists.

function ADABOOST(examples, L, K ) returns a hypothesis
inputs: examples, set of N labeled examples (x1, y1), . . . , (xN , yN )
L, a learning algorithm
K , the number of hypotheses in the ensemble
local variables: w, a vector of N example weights, initially all 1/N
h, a vector of K hypotheses
z, a vector of K hypothesis weights
ǫ ← a small positive number, used to avoid division by zero for k = 1 to K do
h[k ] ← L(examples , w)
error ← 0
for j = 1 to N do //Compute the total error for h[k]
if h[k](xj) ̸= yj then error ←error + w[j]
if error > 1/2 then break from loop
error ← min(error , 1 − ǫ)
for j = 1 to N do // Give more weight to the examples h[k ] got wrong
ifh[k](xj)=yj thenw[j]←w[j]· error/(1−error)
w ← NORMALIZE(w)
z[k ] ← 1 log ((1 − error )/error ) // Give more weight to accurate h[k ] 2􏲹
return Function(x) : zi hi(x)
Figure 19.25 The ADABOOST variant of the boosting method for ensemble learning. The algorithm generates hypotheses by successively reweighting the training examples. The func- tion WEIGHTED-MAJORITY generates a hypothesis that returns the output value with the highest vote from the hypotheses in h, with votes weighte􏲹d by z. For regression problems, or for binary classification with two classes -1 and 1, this is k h[k]z[k].
47

20
LEARNING PROBABILISTIC MODELS
CHAPTER

21
DEEP LEARNING
CHAPTER

22
REINFORCEMENT LEARNING
inputs: percept, a percept indicating the current state s′ and reward signal r persistent: π, a fixed policy
mdp, an MDP with model P, rewards R, actions A, discount γ
U , a table of utilities for states, initially empty
N s′ |s,a , a table of outcome count vectors indexed by state and action, initially zero s, a, the previous state and action, initially null
ifs′ isnewthenU[s′]←0 if s is not null then
increment N s′ |s,a [s , a ][s ’]
R[s, a, s′] ← r
P(· | s, a) ← NORMALIZE(N s′|s,a[s, a]) U ←POLICYEVALUATION(π,U,mdp) s,a ←s′,π[s′]
return a
Figure 22.2 A passive reinforcement learning agent based on adaptive dynamic program- ming. The agent chooses a value for γ and then incrementally computes the P and R values of the MDP. The POLICY-EVALUATION function solves the fixed-policy Bellman equations, as described on page ??.
CHAPTER

51
function PASSIVE-TD-LEARNER(percept) returns an action
inputs: percept, a percept indicating the current state s′ and reward signal r persistent: π, a fixed policy
s, the previous state, initially null
U , a table of utilities for states, initially empty Ns, a table of frequencies for states, initially zero
if s′ is new then U[s′]←0 if s is not null then
increment N s [s ]
U[s]←U[s]+α(Ns[s])×(r +γU[s′]-U[s]) s ←s′
return π[s′]
Figure 22.4 A passive reinforcement learning agent that learns utility estimates using tem-
poral differences. The step-size function α(n) is chosen to ensure convergence.
function Q-LEARNING-AGENT(percept) returns an action
inputs: percept, a percept indicating the current state s′ and reward signal r persistent: Q , a table of action values indexed by state and action, initially zero
Nsa , a table of frequencies for state–action pairs, initially zero s, a, the previous state and action, initially null
if s is not null then
increment Nsa [s , a ]
Q[s,a]←Q[s,a] + α(Nsa[s,a])(r + γ maxa′ Q[s′,a′] − Q[s,a])
s,a ←s′,argmaxa′ f(Q[s′,a′],Nsa[s′,a′]) return a
Figure 22.8 An exploratory Q-learning agent. It is an active learner that learns the value Q(s, a) of each action in each situation. It uses the same exploration function f as the ex- ploratory ADP agent, but avoids having to learn the transition model.

23
NATURAL LANGUAGE PROCESSING
function CYK-PARSE(words,grammar) returns a table of parse trees inputs: words, a list of words
grammar, a structure with LEXICALRULES and GRAMMARRULES
T ←a table //T[X, i, k] is most probable X tree spanning wordsi:k
P ←a table, initially all 0 //P[X, i, k] is probability of tree T[X, i, k] // Insert lexical categories for each word.
for i = 1 to LEN(words) do
for each (X, p) in grammar.LEXICALRULES(wordsi) do P [X , i , i ] ← p
T[X, i, i]←TREE(X, wordsi)
//Construct Xi:k from Yi:j + Zj+1:k, shortest spans first. for each (i, j, k) in SUBSPANS(LEN(words)) do
for each (X, Y, Z, p) in grammar.GRAMMARRULES do PYZ ←P[Y, i, j] × P[Z, j +1, k] × p
ifPYZ > P[X, i, k]do
P [X , i , k ] ← PYZ
T[X,i,k]←TREE(X,T[Y,i,j],T[Z,j +1,k]) return T
function SUBSPANS(N) yields (i, j, k) tuples forlength=2toN do
fori =1toN + 1 − length do k←i + length − 1
for j = i to k − 1 do
yield (i, j, k)
Figure 23.5 The CYK algorithm for parsing. Given a sequence of words, it finds the most probable parse tree for the sequence and its subsequences. The table P[X,i,k] gives the probability of the most probable tree of category X spanning wordsi:k. The output table T[X, i, k] contains the most probable tree of category X spanning positions i to k inclu- sive. The function SUBSPANS returns all tuples (i,j,k) covering a span of wordsi:k, with i ≤ j < k, listing the tuples by increasing length of the i : k span, so that when we go to combine two shorter spans into a longer one, the shorter spans are already in the table. LEXICALRULES(word) returns a collection of (X, p) pairs, one for each rule of the form X →word[htbp],andGRAMMARRULESgives(X,Y,Z,p)tuples,oneforeachgrammar ruleoftheformX →Y Z [p]. CHAPTER .] [SBAR-ADV as if [S [NP she] [VP did n’t [VP [VP hear [NP *-1]] or [VP [ADVP even] see [NP *-1]] [NP-1 him]]]]]]]] Figure 23.8 Annotated tree for the sentence “Her eyes were glazed as if she didn’t hear or even see him.” from the Penn Treebank. Note a grammatical phenomenon we have not covered yet: the movement of a phrase from one part of the tree to another. This tree analyzes the phrase “hear or even see him” as consisting of two constituent VP s, [VP hear [NP *-1]] and [VP [ADVP even] see [NP *-1]], both of which have a missing object, denoted *-1, which refers to the NP labeled elsewhere in the tree as [NP-1 him]. Similarly, the [NP *-2] refers to the [NP-2 Her eyes]. 53 [ [S [NP-2 Her eyes] [VP were [VP glazed [NP *-2] 24 DEEP LEARNING FOR NATURAL LANGUAGE PROCESSING It is a truth universally acknowledged that the earth is not the center of the uni- verse. There are those who assert there is. I do not accept them, but others I consider to be of the same opinion. The truth is, however, that if there are other than the center, and if there are any other living things in the universe and if they are not human, then we do not yet have our answers. We have to go on. This page gives a simplified, simplified answer to the problem. We don’t have all the answers. The truth is, however, that the truth is out there. When Gregor Samsa woke up one morning, he did not notice anything strange. “When my wife is looking at me, I feel like she is looking at a piece of art,” he said. “I think she is admiring something I have created.” The idea is that by looking at your own life, you learn something important and become a better person. It is a theory that emerged from psychologist Daniel Goleman’s work, in which he asked “How do you know you’re not a loser?” Alice was beginning to get very tired of sitting with her sister on the bank. She sat up, yawned, and said, with a loud little scream, “I hope you don’t mind if I keep on doing what I should like to do, and if someone asks me which of us will do more, don’t tell them that I won’t do much, my dear sister.” All happy families are alike; each happy family is like a garden of paradise. The only difference between happy families and unhappy families, is that the unhappy family doesn’t have any flowers or trees. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Please fill out the following details. Thank you... Thank you for your interest in this interview. Please wait... Figure 24.13 Example completion texts generated by the GPT-2 language model, given the prompts in bold. Most of the texts are quite fluent English, at least locally. The final example demonstrates that sometimes the model just breaks down. CHAPTER 25 COMPUTER VISION CHAPTER 26 ROBOTICS function MONTE-CARLO-LOCALIZATIONa, z, N, P(X′|X, v, ω), P(z|z∗), map returns a set of samples, S, for the next time step inputs: a, robot velocities v and ω z, a vector of M range scan data points P(X′|X, v, ω), motion model P (z|z∗), a range sensor noise model map, a 2D map of the environment persistent: S, a vector of N samples local variables: W , a vector of N weights S′, a temporary vector of N samples if S is empty then for i = 1 to N do // initialization phase S[i] ← sample from P (X0) for i = 1 to N do //update cycle S′[i]←sample from P(X′|X = S[i],v,ω) W[i]←1 for j = 1 to M do z∗ ← RAYCAST(j, X = S′[i], map) W[i]←W[i] · P(zj| z∗) S ← WEIGHTED-SAMPLE-WITH-REPLACEMENT(N , S′, W ) return S Figure 26.6 A Monte Carlo localization algorithm using a range-scan sensor model with independent noise. CHAPTER 27 PHILOSOPHY, ETHICS, AND SAFETY OF AI CHAPTER 28 THE FUTURE OF AI CHAPTER 29 MATHEMATICAL BACKGROUND CHAPTER 30 NOTES ON LANGUAGES AND ALGORITHMS CHAPTER

Posted in Uncategorized

# CS代考计算机代写 algorithm flex deep learning Bayesian network data structure Bayesian decision tree AI Hidden Markov Mode chain 1

1
INTRODUCTION
CHAPTER

CHAPTER 2 INTELLIGENT AGENTS
function TABLE-DRIVEN-AGENT(percept) returns an action persistent: percepts, a sequence, initially empty
table, a table of actions, indexed by percept sequences, initially fully specified
append percept to the end of percepts action ←LOOKUP(percepts,table) return action
Figure 2.7 The TABLE-DRIVEN-AGENT program is invoked for each new percept and re- turns an action each time. It retains the complete percept sequence in memory.
function REFLEX-VACUUM-AGENT([location,status]) returns an action
if status = Dirty then return Suck else if location = A then return Right else if location = B then return Left
Figure2.8 Theagentprogramforasimplereflexagentinthetwo-locationvacuumenviron- ment. This program implements the agent function tabulated in Figure ??.
function SIMPLE-REFLEX-AGENT(percept) returns an action persistent: rules, a set of condition–action rules
state ← INTERPRET-INPUT(percept) rule ← RULE-MATCH(state, rules) action ← rule.ACTION
return action
Figure 2.10 A simple reflex agent. It acts according to a rule whose condition matches the current state, as defined by the percept.

function MODEL-BASED-REFLEX-AGENT(percept) returns an action persistent: state, the agent’s current conception of the world state
transition model , a description of how the next state depends on the current state and action
sensor model , a description of how the current world state is reflected in the agent’s percepts
rules, a set of condition–action rules action, the most recent action, initially none
state←UPDATE-STATE(state,action,percept,transition model,sensor model) rule ← RULE-MATCH(state, rules)
action ← rule.ACTION
return action
Figure 2.12 A model-based reflex agent. It keeps track of the current state of the world, using an internal model. It then chooses an action in the same way as the reflex agent.
3

CHAPTER 3
SOLVING PROBLEMS BY SEARCHING
function BEST-FIRST-SEARCH(problem, f ) returns a solution node or failure
node ← NODE(STATE=problem.INITIAL)
frontier ← a priority queue ordered by f , with node as an element
reached ← a lookup table, with one entry with key problem.INITIAL and value node while not IS-EMPTY(frontier) do
node ← POP(frontier)
if problem.IS-GOAL(node.STATE) then return node for each child in EXPAND(problem, node) do
s ←child.STATE
if s is not in reached or child.PATH-COST < reached[s].PATH-COST then reached[s]←child add child to frontier return failure function EXPAND(problem,node) yields nodes s ← node.STATE for each action in problem.ACTIONS(s) do s′ ← problem.RESULT(s, action) cost ←node.PATH-COST + problem.ACTION-COST(s,action,s′) yield NODE(STATE=s′, PARENT=node, ACTION=action, PATH-COST=cost) Figure 3.7 The best-first search algorithm, and the function for expanding a node. The data structures used here are described in Section ??. See Appendix B for yield. 5 function BREADTH-FIRST-SEARCH(problem) returns a solution node or failure node ← NODE(problem.INITIAL) if problem.IS-GOAL(node.STATE) then return node frontier ← a FIFO queue, with node as an element reached ← {problem.INITIAL} while not IS-EMPTY(frontier) do node ← POP(frontier) for each child in EXPAND(problem, node) do s ←child.STATE if problem.IS-GOAL(s) then return child if s is not in reached then add s to reached add child to frontier return failure function UNIFORM-COST-SEARCH(problem) returns a solution node, or failure return BEST-FIRST-SEARCH(problem, PATH-COST) Figure 3.9 Breadth-first search and uniform-cost search algorithms. function ITERATIVE-DEEPENING-SEARCH(problem) returns a solution node or failure fordepth =0to∞do result ←DEPTH-LIMITED-SEARCH(problem,depth) if result ̸= cutoff then return result function DEPTH-LIMITED-SEARCH(problem, l) returns a node or failure or cutoff frontier ← a LIFO queue (stack) with NODE(problem.INITIAL) as an element result ← failure while not IS-EMPTY(frontier) do node ← POP(frontier) if problem.IS-GOAL(node.STATE) then return node if DEPTH(node) > l then
result ← cutoff
else if not IS-CYCLE(node) do
for each child in EXPAND(problem, node) do add child to frontier
return result
Figure 3.12 Iterative deepening and depth-limited tree-like search. Iterative deepening re- peatedly applies depth-limited search with increasing limits. It returns one of three different types of values: either a solution node; or failure, when it has exhausted all nodes and proved there is no solution at any depth; or cutoff , to mean there might be a solution at a deeper depth than l. This is a tree-like search algorithm that does not keep track of reached states, and thus uses much less memory than best-first search, but runs the risk of visiting the same state multiple times on different paths. Also, if the IS-CYCLE check does not check all cycles, then the algorithm may get caught in a loop.

6 Chapter 3 Solving Problems by Searching
function BIBF-SEARCH(problemF , fF , problemB, fB) returns a solution node, or failure nodeF ← NODE(problemF .INITIAL) // Node for a start state
nodeB ←NODE(problemB.INITIAL) //Node for a goal state frontierF ←apriorityqueueorderedbyfF,withnodeF asanelement
frontier B ← a priority queue ordered by fB , with node B as an element reachedF ← a lookup table, with one key nodeF .STATE and value nodeF reachedB ←a lookup table, with one key nodeB.STATE and value nodeB solution ← failure
while not TERMINATED(solution, frontierF , frontierB) do if fF (TOP(frontierF )) < fB(TOP(frontierB)) then solution←PROCEED(F,problemF frontierF,reachedF,reachedB,solution) else solution ← PROCEED(B, problemB, frontierB, reachedB, reachedF , solution) return solution function PROCEED(dir, problem, frontier, reached, reached2, solution) returns a solution // Expand node on frontier; check against the other frontier in reached 2 . // The variable “dir” is the direction: either F for forward or B for backward. node ← POP(frontier) for each child in EXPAND(problem, node) do s ←child.STATE if s not in reached or PATH-COST(child) < PATH-COST(reached[s]) then reached[s]←child add child to frontier if s is in reached2 then solution2 ← JOIN-NODES(dir, child, reached2[s])) if PATH-COST(solution2) < PATH-COST(solution) then solution ← solution2 return solution Figure 3.14 Bidirectional best-first search keeps two frontiers and two tables of reached states. When a path in one frontier reaches a state that was also reached in the other half of the search, the two paths are joined (by the function JOIN-NODES) to form a solution. The first solution we get is not guaranteed to be the best; the function TERMINATED determines when to stop looking for new solutions. 7 function RECURSIVE-BEST-FIRST-SEARCH(problem) returns a solution or failure solution, fvalue ← RBFS(problem, NODE(problem.INITIAL), ∞) return solution function RBFS(problem,node,f limit) returns a solution or failure, and a new f-cost limit if problem.IS-GOAL(node.STATE) then return node successors ← LIST(EXPAND(node)) if successors is empty then return failure, ∞ for each s in successors do // update f with value from previous search s.f ← max(s.PATH-COST + h(s), node.f )) while true do best ← the node in successors with lowest f -value if best.f > f limit then return failure, best.f
alternative ← the second-lowest f -value among successors result,best.f ←RBFS(problem,best,min(f limit,alternative)) if result ̸= failure then return result, best.f
Figure 3.22 The algorithm for recursive best-first search.

CHAPTER 4 SEARCH IN COMPLEX
ENVIRONMENTS
function HILL-CLIMBING(problem) returns a state that is a local maximum current ← problem.INITIAL
whiletrue do
neighbor ← a highest-valued successor state of current
if VALUE(neighbor) ≤ VALUE(current) then return current current ← neighbor
Figure 4.2 The hill-climbing search algorithm, which is the most basic local search tech- nique. At each step the current node is replaced by the best neighbor.
function SIMULATED-ANNEALING(problem,schedule) returns a solution state current ← problem.INITIAL
fort =1to∞do
T ←schedule(t)
if T = 0 then return current
next ← a randomly selected successor of current ∆E ← VALUE(current) – VALUE(next)
if ∆E > 0 then current ← next
else current ← next only with probability e−∆E/T
Figure 4.4 The simulated annealing algorithm, a version of stochastic hill climbing where some downhill moves are allowed. The schedule input determines the value of the “temper- ature” T as a function of time.

9
function GENETIC-ALGORITHM(population,fitness) returns an individual repeat
weights ← WEIGHTED-BY(population, fitness) population2 ← empty list
for i = 1 to SIZE(population) do
parent1, parent2 ←WEIGHTED-RANDOM-CHOICES(population,weights,2) child ← REPRODUCE(parent1 , parent2 )
if (small random probability) then child ← MUTATE(child)
population ← population2
until some individual is fit enough, or enough time has elapsed return the best individual in population, according to fitness
function REPRODUCE(parent1 , parent2 ) returns an individual
n ← LENGTH(parent1)
c ← random number from 1 to n
return APPEND(SUBSTRING(parent1,1,c),SUBSTRING(parent2,c +1,n))
Figure 4.7 A genetic algorithm. Within the function, population is an ordered list of indi- viduals, weights is a list of corresponding fitness values for each individual, and fitness is a function to compute these values.
function AND-OR-SEARCH(problem) returns a conditional plan, or failure return OR-SEARCH(problem,problem.INITIAL,[])
function OR-SEARCH(problem, state, path) returns a conditional plan, or failure if problem.IS-GOAL(state) then return the empty plan
if IS-CYCLE(path) then return failure
for each action in problem.ACTIONS(state) do
plan←AND-SEARCH(problem,RESULTS(state,action),[state] + path])
if plan ̸= failure then return [action] + plan] return failure
function AND-SEARCH(problem, states, path) returns a conditional plan, or failure for each si in states do
plani ← OR-SEARCH(problem, si, path)
if plani = failure then return failure
return [if s1 then plan1 else if s2 then plan2 else . . . if sn−1 then plann−1 else plann]
Figure 4.10 An algorithm for searching AND–OR graphs generated by nondeterministic en- vironments. A solution is a conditional plan that considers every nondeterministic outcome and makes a plan for each one.

10 Chapter 4 Search in Complex Environments
function ONLINE-DFS-AGENT(problem, s′) returns an action s, a, the previous state and action, initially null
persistent: result, a table mapping (s, a) to s′, initially empty
untried, a table mapping s to a list of untried actions
unbacktracked , a table mapping s to a list of states never backtracked to
if problem.IS-GOAL(s′) then return stop
if s′ is a new state (not in untried) then untried[s′]←problem.ACTIONS(s′) if s is not null then
result[s,a]←s′
add s to the front of unbacktracked[s′] if untried[s′] is empty then
if unbacktracked[s′] is empty then return stop
else a ←an action b such that result[s′,b] = POP(unbacktracked[s′]) else a ← POP(untried[s′])
s ←s′
return a
Figure 4.20 An online search agent that uses depth-first exploration. The agent can safely explore only in state spaces in which every action can be “undone” by some other action.
function LRTA*-AGENT(problem, s′, h) returns an action s, a, the previous state and action, initially null
persistent: result, a table mapping (s, a) to s′, initially empty
H , a table mapping s to a cost estimate, initially empty
if IS-GOAL(s′) then return stop
if s′ is a new state (not in H) then H[s′]←h(s′) if s is not null then
result[s,a]←s′
H [s] ← min LRTA*-COST(s, b, result[s, b], H )
b∈ACTIONS(s)
a ← argmin LRTA*-COST(problem, s′,b,result[s′,b],H)
b∈ACTIONS(s) s ←s′
return a
function LRTA*-COST(problem,s,a,s′,H) returns a cost estimate if s′ is undefined then return h(s)
else return problem.ACTION-COST(s, a, s′) + H[s′]
Figure 4.23 LRTA∗-AGENT selects an action according to the values of neighboring states, which are updated as the agent moves about the state space.

CHAPTER 5
function MINIMAX-SEARCH(game, state) returns an action player ← game.TO-MOVE(state)
value, move ← MAX-VALUE(game, state)
return move
function MAX-VALUE(game, state) returns a (utility, move) pair
if game.IS-TERMINAL(state) then return game.UTILITY(state, player), null v ← −∞
for each a in game.ACTIONS(state) do
v2 , a2 ← MIN-VALUE(game, game.RESULT(state, a)) ifv2 >v then
v , move ← v2 , a return v, move
function MIN-VALUE(game, state) returns a (utility, move) pair
if game.IS-TERMINAL(state) then return game.UTILITY(state, player), null v ← +∞
for each a in game.ACTIONS(state) do
v2 , a2 ← MAX-VALUE(game, game.RESULT(state, a)) ifv2 v then
v, move ←v2, a
α←MAX(α, v)
ifv ≥ βthenreturnv,move
return v, move
function MIN-VALUE(game, state,α,β) returns a (utility, move) pair
if game.IS-TERMINAL(state) then return game.UTILITY(state, player), null v ← +∞
for each a in game.ACTIONS(state) do
f ← FORWARD(f, et−d)
remove et−d−1 from the beginning of et−d:t
Ot−d ← diagonal matrix containing P(et−d | Xt−d )
B ← O−1 T−1BTOt t−d
else B ← BTOt
t←t+1
if t > d + 1 then return NORMALIZE(f × B1) else return null
Figure 14.6 An algorithm for smoothing with a fixed time lag of d steps, implemented as an online algorithm that outputs the new smoothed estimate given the observation for a new time step. Notice that the final output NORMALIZE(f × B1) is just α f × b, by Equation (??).
function PARTICLE-FILTERING(e, N , dbn) returns a set of samples for the next time step inputs: e, the new incoming evidence
N , the number of samples to be maintained
dbn, a DBN defined by P(X0), P(X1 | X0), and P(E1 | X1) persistent: S, a vector of samples of size N, initially generated from P(X0) local variables: W , a vector of weights of size N
37
for i = 1 to N do
S[i]←sample from P(X1 |X0 = S[i]) //step 1 W[i]←P(e|X1 = S[i]) //step 2
S ← WEIGHTED-SAMPLE-WITH-REPLACEMENT(N , S , W ) return S
// step 3
Figure 14.17 The particle filtering algorithm implemented as a recursive update oper- ation with state (the set of samples). Each of the sampling operations involves sam- pling the relevant slice variables in topological order, much as in PRIOR-SAMPLE. The WEIGHTED-SAMPLE-WITH-REPLACEMENT operation can be implemented to run in O(N) expected time. The step numbers refer to the description in the text.

15
PROBABILISTIC PROGRAMMING
CHAPTER
type Researcher, Paper, Citation random String Name(Researcher) random String Title(Paper)
random Paper PubCited(Citation) random String Text(Citation)
random Boolean Professor(Researcher) origin Researcher Author(Paper)
#Researcher ∼ OM(3,1)
Name(r) ∼ NamePrior()
Professor(r) ∼ Boolean(0.2)
#Paper (Author = r) ∼ if Professor (r) then OM (1.5, 0.5) else OM (1, 0.5) Title(p) ∼ PaperTitlePrior()
CitedPaper(c) ∼ UniformChoice({Paper p})
Text(c) ∼ HMMGrammar(Name(Author(CitedPaper(c))),Title(CitedPaper(c)))
Figure15.5 AnOUPMforcitationinformationextraction.Forsimplicitythemodelassumes one author per paper and omits details of the grammar and error models.

39
#SeismicEvents ∼ Poisson(T ∗ λe)
Time(e) ∼ UniformReal(0,T)
EarthQuake(e) ∼ Boolean(0.999)
Location(e) ∼ if Earthquake(e) then SpatialPrior() else UniformEarth() Depth (e) ∼ if Earthquake (e) then UniformReal (0, 700) else Exactly (0) Magnitude(e) ∼ Exponential(log(10))
Detected(e,p,s) ∼ Logistic(weights(s,p),Magnitude(e), Depth(e), Dist(e,s)) #Detections(site = s) ∼ Poisson(T ∗ λf (s))
#Detections(event=e, phase=p, station=s) = ifDetected(e,p,s)then1else0 OnsetTime(a,s) if (event(a) = null) then ∼ UniformReal(0,T)
else=Time(event(a)) + GeoTT(Dist(event(a),s),Depth(event(a)),phase(a)) + Laplace(μt(s), σt(s))
Amplitude(a,s) if (event(a) = null) then ∼ NoiseAmpModel(s)
else = AmpModel (Magnitude (event (a)), Dist (event (a), s), Depth (event (a)), phase (a))
Azimuth(a, s) if (event(a) = null) then ∼ UniformReal(0, 360)
else = GeoAzimuth(Location(event(a)),Depth(event(a)),phase(a),Site(s))
+ Laplace (0, σa (s)) Slowness(a,s)if(event(a)=null)then ∼ UniformReal(0,20)
else = GeoSlowness(Location(event(a)),Depth(event(a)),phase(a),Site(s)) + Laplace(0,σs(s))
ObservedPhase(a,s) ∼ CategoricalPhaseModel(phase(a)) Figure 15.6 A simplified version of the NET-VISA model (see text).
#Aircraft(EntryTime =t) ∼ Poisson(λa)
Exits(a,t) ∼ if InFlight(a,t) then Boolean(αe)
InFlight(a,t) = (t=EntryTime(a)) ∨ (InFlight(a,t−1) ∧ ¬Exits(a,t−1)) X(a,t) ∼ ift = EntryTime(a)thenInitX()
else if InFlight (a, t) then N (F X (a, t − 1), Σx )
#Blip(Source=a, Time=t) ∼ if InFlight(a,t) then Bernoulli(DetectionProb(X(a,t))) #Blip(Time=t) ∼ Poisson(λf )
Z(b) ∼ if Source(b)=null then UniformZ(R) else N(HX(Source(b),Time(b)),Σz)
Figure 15.9 An OUPM for radar tracking of multiple targets with false alarms, detection failure, and entry and exit of aircraft. The rate at which new aircraft enter the scene is λa, while the probability per time step that an aircraft exits the scene is αe. False alarm blips (i.e., ones not produced by an aircraft) appear uniformly in space at a rate of λf per time step. The probability that an aircraft is detected (i.e., produces a blip) depends on its current position.

40 Chapter 15 Probabilistic Programming
function GENERATE-IMAGE() returns an image with some letters letters ← GENERATE-LETTERS(10)
return RENDER-NOISY-IMAGE(letters, 32, 128)
function GENERATE-LETTERS(λ) returns a vector of letters n ∼ Poisson(λ)
letters ← [ ]
for i = 1 to n do
letters[i] ∼ UniformChoice({a,b,c,···}) return letters
function RENDER-NOISY-IMAGE(letters,width,height) returns a noisy image of the letters clean image←RENDER(letters,width,height,text top=10,text left=10)
noisy image←[]
noise variance ∼ UniformReal(0.1, 1)
for row = 1 to width do for col = 1 to height do
noisy image[row,col] ∼ N(clean image[row,col],noise variance) return noisy image
Figure15.11 Generativeprogramforanopen-universeprobabilitymodelforopticalcharac- ter recognition. The generative program produces degraded images containing sequences of letters by generating each sequence, rendering it into a 2D image, and incorporating additive noise at each pixel.
function GENERATE-MARKOV-LETTERS(λ) returns a vector of letters n ∼ Poisson(λ)
letters ← [ ]
letter probs ← MARKOV-INITIAL()
for i = 1 to n do
letters[i] ∼ Categorical(letter probs)
letter probs ← MARKOV-TRANSITION(letters[i])
return letters
Figure 15.15 Generative program for an improved optical character recognition model that generates letters according to a letter bigram model whose pairwise letter frequencies are estimated from a list of English words.

16
MAKING SIMPLE DECISIONS
function INFORMATION-GATHERING-AGENT(percept) returns an action persistent: D, a decision network
integrate percept into D
j ← the value that maximizes VPI (Ej ) / C (Ej ) if VPI(Ej) > C(Ej)
then return Request(Ej) else return the best action from D
Figure 16.9 Design of a simple, myopic information-gathering agent. The agent works by repeatedly selecting the observation with the highest information value, until the cost of the next observation is greater than its expected benefit.
CHAPTER

17
MAKING COMPLEX DECISIONS
function VALUE-ITERATION(mdp, ǫ) returns a utility function
inputs: mdp , an MDP with states S , actions A(s), transition model P (s′ | s, a),
rewards R(s, a, s′), discount γ
ǫ, the maximum error allowed in the utility of any state
local variables: U , U ′, vectors of utilities for states in S , initially zero
δ, the maximum relative change in the utility of any state
repeat
U ← U ′; δ ← 0
for each state s in S do
U ′[s] ← maxa ∈ A(s) Q-VALUE(mdp, s, a, U )
if|U′[s] − U[s]| > δthenδ←|U′[s] − U[s]| untilδ ≤ ǫ(1−γ)/γ
return U
Figure 17.6 The value iteration algorithm for calculating utilities of states. The termination
condition is from Equation (??).
function POLICY-ITERATION(mdp) returns a policy
inputs: mdp , an MDP with states S , actions A(s), transition model P (s′ | s, a) local variables: U , a vector of utilities for states in S , initially zero
π, a policy vector indexed by state, initially random
repeat
U ← POLICY-EVALUATION(π, U , mdp) unchanged ? ← true
for each state s in S do
a∗ ← argmax Q-VALUE(mdp, s, a, U ) a∈A(s)
ifQ-VALUE(mdp,s,a∗,U) > Q-VALUE(mdp,s,π[s],U)then π[s]←a∗; unchanged?←false
until unchanged? return π
Figure 17.9 The policy iteration algorithm for calculating an optimal policy.
CHAPTER

function POMDP-VALUE-ITERATION(pomdp, ǫ) returns a utility function
inputs: pomdp , a POMDP with states S , actions A(s), transition model P (s′ | s, a),
sensor model P (e | s), rewards R(s), discount γ
ǫ, the maximum error allowed in the utility of any state
local variables: U , U ′, sets of plans p with associated utility vectors αp
U ′ ← a set containing just the empty plan [ ], with α[ ](s) = R(s) repeat
U ←U′
U ′ ← the set of all plans consisting of an action and, for each possible next percept,
a plan in U with utility vectors computed according to Equation (??) U ′ ← REMOVE-DOMINATED-PLANS(U ′)
untilMAX-DIFFERENCE(U,U′) ≤ ǫ(1−γ)/γ return U
Figure 17.16 A high-level sketch of the value iteration algorithm for POMDPs. The REMOVE-DOMINATED-PLANS step and MAX-DIFFERENCE test are typically implemented as linear programs.
43

18
MULTIAGENT DECISION MAKING
Actors(A,B)
Init(At(A,LeftBaseline) ∧ At(B,RightNet) ∧
Approaching(Ball,RightBaseline) ∧ Partner(A,B) ∧ Partner(B,A) Goal(Returned(Ball) ∧ (At(x,RightNet) ∨ At(x,LeftNet)) Action(Hit(actor,Ball),
PRECOND:Approaching(Ball,loc) ∧ At(actor,loc)
EFFECT:Returned(Ball)) Action(Go(actor,to),
PRECOND:At(actor,loc) ∧ to ̸= loc, EFFECT:At(actor,to) ∧ ¬ At(actor,loc))
Figure 18.1 The doubles tennis problem. Two actors, A and B, are playing together and can be in one of four locations: LeftBaseline, RightBaseline, LeftNet, and RightNet. The ball can be returned only if a player is in the right place. The NoOp action is a dummy, which has no effect. Note that each action must include the actor as an argument.
CHAPTER

19
LEARNING FROM EXAMPLES
function LEARN-DECISION-TREE(examples, attributes, parent examples) returns a tree
if examples is empty then return PLURALITY-VALUE(parent examples) else if all examples have the same classification then return the classification else if attributes is empty then return PLURALITY-VALUE(examples)
else
A ← argmaxa ∈ attributes IMPORTANCE(a, examples) tree ← a new decision tree with root test A
for each value v of A do
exs ←{e : e∈examples and e.A = v}
subtree ← LEARN-DECISION-TREE(exs, attributes − A, examples) add a branch to tree with label (A = v ) and subtree subtree
return tree
Figure19.5 Thedecisiontreelearningalgorithm.ThefunctionIMPORTANCEisdescribedin Section ??. The function PLURALITY-VALUE selects the most common output value among a set of examples, breaking ties randomly.
CHAPTER

46 Chapter 19 Learning from Examples
function MODEL-SELECTION(Learner,examples,k) returns a (hypothesis, error rate) pair
err ← an array, indexed by size, storing validation-set error rates training set , test set ← a partition of examples into two sets for size = 1 to ∞ do
err[size]←CROSS-VALIDATION(Learner,size,training set,k) if err is starting to increase significantly then
best size←the value of size with minimum err[size] h←Learner(best size,training set)
return h, ERROR-RATE(h, test set)
function CROSS-VALIDATION(Learner,size,examples,k) returns error rate
N ← the number of examples errs ← 0
for i = 1 to k do
validation set←examples[(i − 1) × N/k:i × N/k] training set ← examples − validation set h←Learner(size,training set)
errs ←errs + ERROR-RATE(h,validation set)
return errs / k // average error rate on validation sets, across k-fold cross-validation
Figure 19.8 An algorithm to select the model that has the lowest validation error. It builds models of increasing complexity, and choosing the one with best empirical error rate, err, on the validation data set. Learner(size,examples) returns a hypothesis whose complexity is set by the parameter size, and which is trained on examples. In CROSS-VALIDATION, each iteration of the for loop selects a different slice of the examples as the validation set, and keeps the other examples as the training set. It then returns the average validation set error over all the folds. Once we have determined which value of the size parameter is best, MODEL-SELECTION returns the model (i.e., learner/hypothesis) of that size, trained on all the training examples, along with its error rate on the held-out test examples.
function DECISION-LIST-LEARNING(examples) returns a decision list, or failure
if examples is empty then return the trivial decision list No
t ← a test that matches a nonempty subset examples t of examples
such that the members of examplest are all positive or all negative
if there is no such t then return failure
if the examples in examples t are positive then o ← Yes else o ← No
return a decision list with initial test t and outcome o and remaining tests given by
DECISION-LIST-LEARNING(examples − examplest) Figure 19.11 An algorithm for learning decision lists.

function ADABOOST(examples, L, K ) returns a hypothesis
inputs: examples, set of N labeled examples (x1, y1), . . . , (xN , yN )
L, a learning algorithm
K , the number of hypotheses in the ensemble
local variables: w, a vector of N example weights, initially all 1/N
h, a vector of K hypotheses
z, a vector of K hypothesis weights
ǫ ← a small positive number, used to avoid division by zero for k = 1 to K do
h[k ] ← L(examples , w)
error ← 0
for j = 1 to N do //Compute the total error for h[k]
if h[k](xj) ̸= yj then error ←error + w[j]
if error > 1/2 then break from loop
error ← min(error , 1 − ǫ)
for j = 1 to N do // Give more weight to the examples h[k ] got wrong
ifh[k](xj)=yj thenw[j]←w[j]· error/(1−error)
w ← NORMALIZE(w)
z[k ] ← 1 log ((1 − error )/error ) // Give more weight to accurate h[k ] 2􏲹
return Function(x) : zi hi(x)
Figure 19.25 The ADABOOST variant of the boosting method for ensemble learning. The algorithm generates hypotheses by successively reweighting the training examples. The func- tion WEIGHTED-MAJORITY generates a hypothesis that returns the output value with the highest vote from the hypotheses in h, with votes weighte􏲹d by z. For regression problems, or for binary classification with two classes -1 and 1, this is k h[k]z[k].
47

20
LEARNING PROBABILISTIC MODELS
CHAPTER

21
DEEP LEARNING
CHAPTER

22
REINFORCEMENT LEARNING
inputs: percept, a percept indicating the current state s′ and reward signal r persistent: π, a fixed policy
mdp, an MDP with model P, rewards R, actions A, discount γ
U , a table of utilities for states, initially empty
N s′ |s,a , a table of outcome count vectors indexed by state and action, initially zero s, a, the previous state and action, initially null
ifs′ isnewthenU[s′]←0 if s is not null then
increment N s′ |s,a [s , a ][s ’]
R[s, a, s′] ← r
P(· | s, a) ← NORMALIZE(N s′|s,a[s, a]) U ←POLICYEVALUATION(π,U,mdp) s,a ←s′,π[s′]
return a
Figure 22.2 A passive reinforcement learning agent based on adaptive dynamic program- ming. The agent chooses a value for γ and then incrementally computes the P and R values of the MDP. The POLICY-EVALUATION function solves the fixed-policy Bellman equations, as described on page ??.
CHAPTER

51
function PASSIVE-TD-LEARNER(percept) returns an action
inputs: percept, a percept indicating the current state s′ and reward signal r persistent: π, a fixed policy
s, the previous state, initially null
U , a table of utilities for states, initially empty Ns, a table of frequencies for states, initially zero
if s′ is new then U[s′]←0 if s is not null then
increment N s [s ]
U[s]←U[s]+α(Ns[s])×(r +γU[s′]-U[s]) s ←s′
return π[s′]
Figure 22.4 A passive reinforcement learning agent that learns utility estimates using tem-
poral differences. The step-size function α(n) is chosen to ensure convergence.
function Q-LEARNING-AGENT(percept) returns an action
inputs: percept, a percept indicating the current state s′ and reward signal r persistent: Q , a table of action values indexed by state and action, initially zero
Nsa , a table of frequencies for state–action pairs, initially zero s, a, the previous state and action, initially null
if s is not null then
increment Nsa [s , a ]
Q[s,a]←Q[s,a] + α(Nsa[s,a])(r + γ maxa′ Q[s′,a′] − Q[s,a])
s,a ←s′,argmaxa′ f(Q[s′,a′],Nsa[s′,a′]) return a
Figure 22.8 An exploratory Q-learning agent. It is an active learner that learns the value Q(s, a) of each action in each situation. It uses the same exploration function f as the ex- ploratory ADP agent, but avoids having to learn the transition model.

23
NATURAL LANGUAGE PROCESSING
function CYK-PARSE(words,grammar) returns a table of parse trees inputs: words, a list of words
grammar, a structure with LEXICALRULES and GRAMMARRULES
T ←a table //T[X, i, k] is most probable X tree spanning wordsi:k
P ←a table, initially all 0 //P[X, i, k] is probability of tree T[X, i, k] // Insert lexical categories for each word.
for i = 1 to LEN(words) do
for each (X, p) in grammar.LEXICALRULES(wordsi) do P [X , i , i ] ← p
T[X, i, i]←TREE(X, wordsi)
//Construct Xi:k from Yi:j + Zj+1:k, shortest spans first. for each (i, j, k) in SUBSPANS(LEN(words)) do
for each (X, Y, Z, p) in grammar.GRAMMARRULES do PYZ ←P[Y, i, j] × P[Z, j +1, k] × p
ifPYZ > P[X, i, k]do
P [X , i , k ] ← PYZ
T[X,i,k]←TREE(X,T[Y,i,j],T[Z,j +1,k]) return T
function SUBSPANS(N) yields (i, j, k) tuples forlength=2toN do
fori =1toN + 1 − length do k←i + length − 1
for j = i to k − 1 do
yield (i, j, k)
Figure 23.5 The CYK algorithm for parsing. Given a sequence of words, it finds the most probable parse tree for the sequence and its subsequences. The table P[X,i,k] gives the probability of the most probable tree of category X spanning wordsi:k. The output table T[X, i, k] contains the most probable tree of category X spanning positions i to k inclu- sive. The function SUBSPANS returns all tuples (i,j,k) covering a span of wordsi:k, with i ≤ j < k, listing the tuples by increasing length of the i : k span, so that when we go to combine two shorter spans into a longer one, the shorter spans are already in the table. LEXICALRULES(word) returns a collection of (X, p) pairs, one for each rule of the form X →word[htbp],andGRAMMARRULESgives(X,Y,Z,p)tuples,oneforeachgrammar ruleoftheformX →Y Z [p]. CHAPTER .] [SBAR-ADV as if [S [NP she] [VP did n’t [VP [VP hear [NP *-1]] or [VP [ADVP even] see [NP *-1]] [NP-1 him]]]]]]]] Figure 23.8 Annotated tree for the sentence “Her eyes were glazed as if she didn’t hear or even see him.” from the Penn Treebank. Note a grammatical phenomenon we have not covered yet: the movement of a phrase from one part of the tree to another. This tree analyzes the phrase “hear or even see him” as consisting of two constituent VP s, [VP hear [NP *-1]] and [VP [ADVP even] see [NP *-1]], both of which have a missing object, denoted *-1, which refers to the NP labeled elsewhere in the tree as [NP-1 him]. Similarly, the [NP *-2] refers to the [NP-2 Her eyes]. 53 [ [S [NP-2 Her eyes] [VP were [VP glazed [NP *-2] 24 DEEP LEARNING FOR NATURAL LANGUAGE PROCESSING It is a truth universally acknowledged that the earth is not the center of the uni- verse. There are those who assert there is. I do not accept them, but others I consider to be of the same opinion. The truth is, however, that if there are other than the center, and if there are any other living things in the universe and if they are not human, then we do not yet have our answers. We have to go on. This page gives a simplified, simplified answer to the problem. We don’t have all the answers. The truth is, however, that the truth is out there. When Gregor Samsa woke up one morning, he did not notice anything strange. “When my wife is looking at me, I feel like she is looking at a piece of art,” he said. “I think she is admiring something I have created.” The idea is that by looking at your own life, you learn something important and become a better person. It is a theory that emerged from psychologist Daniel Goleman’s work, in which he asked “How do you know you’re not a loser?” Alice was beginning to get very tired of sitting with her sister on the bank. She sat up, yawned, and said, with a loud little scream, “I hope you don’t mind if I keep on doing what I should like to do, and if someone asks me which of us will do more, don’t tell them that I won’t do much, my dear sister.” All happy families are alike; each happy family is like a garden of paradise. The only difference between happy families and unhappy families, is that the unhappy family doesn’t have any flowers or trees. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Tell me a story. Please fill out the following details. Thank you... Thank you for your interest in this interview. Please wait... Figure 24.13 Example completion texts generated by the GPT-2 language model, given the prompts in bold. Most of the texts are quite fluent English, at least locally. The final example demonstrates that sometimes the model just breaks down. CHAPTER 25 COMPUTER VISION CHAPTER 26 ROBOTICS function MONTE-CARLO-LOCALIZATIONa, z, N, P(X′|X, v, ω), P(z|z∗), map returns a set of samples, S, for the next time step inputs: a, robot velocities v and ω z, a vector of M range scan data points P(X′|X, v, ω), motion model P (z|z∗), a range sensor noise model map, a 2D map of the environment persistent: S, a vector of N samples local variables: W , a vector of N weights S′, a temporary vector of N samples if S is empty then for i = 1 to N do // initialization phase S[i] ← sample from P (X0) for i = 1 to N do //update cycle S′[i]←sample from P(X′|X = S[i],v,ω) W[i]←1 for j = 1 to M do z∗ ← RAYCAST(j, X = S′[i], map) W[i]←W[i] · P(zj| z∗) S ← WEIGHTED-SAMPLE-WITH-REPLACEMENT(N , S′, W ) return S Figure 26.6 A Monte Carlo localization algorithm using a range-scan sensor model with independent noise. CHAPTER 27 PHILOSOPHY, ETHICS, AND SAFETY OF AI CHAPTER 28 THE FUTURE OF AI CHAPTER 29 MATHEMATICAL BACKGROUND CHAPTER 30 NOTES ON LANGUAGES AND ALGORITHMS CHAPTER

Posted in Uncategorized