CS代考计算机代写 algorithm CS 561a: Introduction to Artificial Intelligence

CS 561a: Introduction to Artificial Intelligence

CS 561, Sessions 6-7
1
This time: Outline
Game playing
The minimax algorithm
Resource limitations
alpha-beta pruning
Elements of chance

CS 561, Sessions 6-7
2
What kind of games?
Abstraction: To describe a game we must capture every relevant aspect of the game. Such as:
Chess
Tic-tac-toe
…
Accessible environments: Such games are characterized by perfect information

Search: game-playing then consists of a search through possible game positions

Unpredictable opponent: introduces uncertainty thus game-playing must deal with contingency problems

CS 561, Sessions 6-7
3
Searching for the next move
Complexity: many games have a huge search space
Chess: b = 35, m=100  nodes = 35 100
if each node takes about 1 ns to explore
then each move will take about 10 50 millennia
to calculate.

Resource (e.g., time, memory) limit: optimal solution not feasible/possible, thus must approximate

Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve quality result.

Evaluation functions: heuristics to evaluate utility of a state without exhaustive search.

CS 561, Sessions 6-7
4
Two-player games
A game formulated as a search problem:

Initial state: ?
Operators: ?
Terminal state: ?
Utility function: ?

CS 561, Sessions 6-7
5
Two-player games
A game formulated as a search problem:

Initial state: board position and turn
Operators: definition of legal moves
Terminal state: conditions for when game is over
Utility function: a numeric value that describes the outcome of the
game. E.g., -1, 0, 1 for loss, draw, win.
(AKA payoff function)

CS 561, Sessions 6-7
6
Game vs. search problem

CS 561, Sessions 6-7
7
Example: Tic-Tac-Toe

CS 561, Sessions 6-7
8
Type of games

CS 561, Sessions 6-7
9
Type of games

CS 561, Sessions 6-7
10

The minimax algorithm
Perfect play for deterministic environments with perfect information
Basic idea: choose move with highest minimax value
= best achievable payoff against best play
Algorithm:
Generate game tree completely
Determine utility of each terminal state
Propagate the utility values upward in the three by applying MIN and MAX operators on the nodes in the current level
At the root node use minimax decision to select the move with the max (of the min) utility value

Steps 2 and 3 in the algorithm assume that the opponent will play perfectly.

CS 561, Sessions 6-7
11
Generate Game Tree

CS 561, Sessions 6-7
12
Generate Game Tree

CS 561, Sessions 6-7
13
Generate Game Tree

o x

x
o

x o

CS 561, Sessions 6-7
14
Generate Game Tree

o x

x
o

x o

1 ply

1 move

CS 561, Sessions 6-7
15

A subtree

win
lose
draw

x
x

o
o
o
x
x
x

o
o
o
x
x
x
x

o
o
o
x
x
x
x
x

o
o
o
x
x
x
x

o
o
o
x
x
o
o
o
o
o
o

x
x

o
o
o
x
x
o
x
x
x
x
x

o
o
o
x
x
o
x
x

o
o
o
x
x
o
x

x
x

o
o
o
x
x
o

CS 561, Sessions 6-7
16

What is a good move?

win
lose
draw

x
x

o
o
o
x
x
x

o
o
o
x
x
x
x

o
o
o
x
x
x
x
x

o
o
o
x
x
x
x

o
o
o
x
x
o
o
o
o
o
o

x
x

o
o
o
x
x
o
x
x
x
x
x

o
o
o
x
x
o
x
x

o
o
o
x
x
o
x

x
x

o
o
o
x
x
o

CS 561, Sessions 6-7
17
Minimax
3
8
12
4
6
14
2
5
2

Minimize opponent’s chance
Maximize your chance

CS 561, Sessions 6-7
18
Minimax
3
2
3
2
8
12
4
6
14
2
5
2

MIN
Minimize opponent’s chance
Maximize your chance

CS 561, Sessions 6-7
19
Minimax
3
3
2
3
2
8
12
4
6
14
2
5
2

MAX
MIN
Minimize opponent’s chance
Maximize your chance

CS 561, Sessions 6-7
20
Minimax
3
3
2
3
2
8
12
4
6
14
2
5
2

MAX
MIN
Minimize opponent’s chance
Maximize your chance

CS 561, Sessions 6-7
21
minimax = maximum of the minimum

1st ply
2nd ply

CS 561, Sessions 6-7
22
Minimax: Recursive implementation
Complete: ?
Optimal: ?
Time complexity: ?
Space complexity: ?

CS 561, Sessions 6-7
23
Minimax: Recursive implementation
Complete: Yes, for finite state-space
Optimal: Yes
Time complexity: O(bm)
Space complexity: O(bm) (= DFS
Does not keep all nodes in memory.)

CS 561, Sessions 6-7
24
1. Move evaluation without complete search
Complete search is too complex and impractical

Evaluation function: evaluates value of state using heuristics and cuts off search

New MINIMAX:
CUTOFF-TEST: cutoff test to replace the termination condition (e.g., deadline, depth-limit, etc.)
EVAL: evaluation function to replace utility function (e.g., number of chess pieces taken)

CS 561, Sessions 6-7
25
Evaluation functions
Weighted linear evaluation function: to combine n heuristics
f = w1f1 + w2f2 + … + wnfn

E.g, w’s could be the values of pieces (1 for prawn, 3 for bishop etc.)
f’s could be the number of type of pieces on the board

CS 561, Sessions 6-7
26
Note: exact values do not matter

CS 561, Sessions 6-7
27
Minimax with cutoff: viable algorithm?

Assume we have 100 seconds, evaluate 104 nodes/s; can evaluate 106 nodes/move

CS 561, Sessions 6-7
28
2. - pruning: search cutoff
Pruning: eliminating a branch of the search tree from consideration without exhaustive examination of each node

- pruning: the basic idea is to prune portions of the search tree that cannot improve the utility value of the max or min node, by just considering the values of nodes seen so far.

Does it work? Yes, in roughly cuts the branching factor from b to b resulting in double as far look-ahead than pure minimax

CS 561, Sessions 6-7
29
Demo
http://homepage.ufp.pt/jtorres/ensino/ia/alfabeta.html

CS 561, Sessions 6-7
30
- pruning: example

 6
6
MAX
6
12
8
MIN

CS 561, Sessions 6-7
31
- pruning: example

 6
6
MAX
6
12
8

2
 2
MIN

CS 561, Sessions 6-7
32
- pruning: example

 6
6
MAX
6
12
8

2
 2

5
 5

MIN

33
- pruning: example

 6
6
MAX
6
12
8

2
 2

5
 5

MIN
Selected move
Interactive demo:
https://www.yosenspace.com/posts/computer-science-game-trees.html

CS 561, Sessions 6-7
34
- pruning: general principle

Player
Player
Opponent
Opponent
m
n

v

If  > v then MAX will chose m so prune tree under n
Similar for  for MIN

CS 561, Sessions 6-7
35
Properties of -

CS 561, Sessions 6-7
36
The - algorithm

CS 561, Sessions 6-7
37
More on the - algorithm
Same basic idea as minimax, but prune (cut away) branches of the tree that we know will not contain the solution.

We know a branch will not contain a solution once we know a better outcome has already been discovered in a previously explored branch.

CS 561, Sessions 6-7
38
Remember: Minimax: Recursive implementation
Complete: Yes, for finite state-space
Optimal: Yes
Time complexity: O(bm)
Space complexity: O(bm) (= DFS
Does not keep all nodes in memory.)

CS 561, Sessions 6-7
39
The - algorithm

CS 561, Sessions 6-7
40
More on the - algorithm
Same basic idea as minimax, but prune (cut away) branches of the tree that we know will not contain the solution.

Because minimax is depth-first, let’s consider nodes along a given path in the tree. Then, as we go along this path, we keep track of:
 : Best choice so far for MAX
 : Best choice so far for MIN

CS 561, Sessions 6-7
41
The - algorithm

Note:  and  are both
Local variables. At the
Start of the algorithm,
We initialize them to
 = - and  = +

CS 561, Sessions 6-7
42

More on the - algorithm
…
MAX
MIN

 = -
 = +
5 10 6 2 8 7
Min-Value loops
over these
In Min-Value:
 = -
 = 5
 = -
 = 5
 = -
 = 5
Max-Value loops
over these

Return v=5

CS 561, Sessions 6-7
43

More on the - algorithm
…
MAX
MIN
MAX

5 10 6 2 8 7
In Max-Value:
 = -
 = 5
 = -
 = 5
 = -
 = 5
 = 5
 = +
Max-Value loops
over these

Return v=5

CS 561, Sessions 6-7
44
In Min-Value:

More on the - algorithm
…
MAX
MIN
MAX

5 10 6 2 8 7
 = -
 = 5
 = -
 = 5
 = -
 = 5
 = 5
 = +
 = 5
 = +
End loop and return 2
Min-Value loops
over these

CS 561, Sessions 6-7
45
In Max-Value:

More on the - algorithm
…
MAX
MIN
MAX

5 10 6 2 x x
 = -
 = 5
 = -
 = 5
 = -
 = 5
 = 5
 = +
 = 5
 = +
 = 5
 = +
Max-Value loops
over these

Return v=2

CS 561, Sessions 6-7
46
Another way to understand the algorithm

For a given node N,

 is the value of N to MAX
 is the value of N to MIN

CS 561, Sessions 6-7
47
Example

CS 561, Sessions 6-7
48
- algorithm: slight variant (from earlier version of textbook)

Is this wrong compared to latest version of textbook?

Please always use latest version of the algorithm as in 3rd or 4th edition of textbook.

CS 561, Sessions 6-7
49
Solution
NODE TYPE ALPHA BETA SCORE
A MAX -Inf Inf
B MIN -Inf Inf
C MAX -Inf Inf
D MIN -Inf Inf
E MAX 10 10 10
D MIN -Inf 10
F MAX 11 11 11
D MIN -Inf 10 10
C MAX 10 Inf
G MIN 10 Inf
H MAX 9 9 9
G MIN 10 9 9
C MAX 10 Inf 10
B MIN -Inf 10
J MAX -Inf 10
K MIN -Inf 10
L MAX 14 14 14
K MIN -Inf 10
M MAX 15 15 15
K MIN -Inf 10 10
…
NODE TYPE ALPHA BETA SCORE
…
J MAX 10 10 10
B MIN -Inf 10 10
A MAX 10 Inf
Q MIN 10 Inf
R MAX 10 Inf
S MIN 10 Inf
T MAX 15 15 15
S MIN 10 15
U MAX 2 2 2
S MIN 10 2 2
R MAX 10 Inf
V MIN 10 Inf
W MAX 4 4 4
V MIN 10 4 4
R MAX 10 Inf 10
Q MIN 10 10 10
A MAX 10 Inf 10

CS 561, Sessions 6-7
50
State-of-the-art for deterministic games

CS 561, Sessions 6-7
51
Nondeterministic games

CS 561, Sessions 6-7
52
Algorithm for nondeterministic games

CS 561, Sessions 6-7
53
Remember: Minimax algorithm

CS 561, Sessions 6-7
54

Nondeterministic games: the element of chance

3
?
0.5
0.5
8
17
8
?
CHANCE
?
expectimax and expectimin, expected values over all possible outcomes

CS 561, Sessions 6-7
55

Nondeterministic games: the element of chance

3
5
0.5
0.5
8
17
8
5
CHANCE
4 = 0.5*3 + 0.5*5
Expectimax
Expectimin

CS 561, Sessions 6-7
56
Evaluation functions: Exact values DO matter
Order-preserving transformations do not necessarily behave the same!

CS 561, Sessions 6-7
57
State-of-the-art for nondeterministic games

CS 561, Sessions 6-7
58
Summary

CS 561, Sessions 6-7
59
Exercise: Game Playing
(a) Compute the backed-up values computed by the minimax algorithm. Show your answer by writing values at the appropriate nodes in the above tree.

(b) Compute the backed-up values computed by the alpha-beta algorithm. What nodes will not be examined by the alpha-beta pruning algorithm?

A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
Y
X
2
3
8
5
7
6
0
1
5
2
8
4
2
10
Max
Max
Min
Min
Consider the following game tree in which the evaluation function values are shown below each leaf node. Assume that the root node corresponds to the maximizing player. Assume the search always visits children left-to-right.

/docProps/thumbnail.jpeg

Published by admin

Leave a Reply Cancel reply