程序代写 AGTA Tutorial 7 – cscodehelp代写

AGTA Tutorial 7
1. Consider the 6-state MDP depicted in Figure 1.
In this MDP, states s5 and s6 are controlled by the player (the con- troller), whereas the other states are controlled by “nature” (random).

Suppose the player’s objective is to optimize the probability of reaching state s6 starting at each state si, i = 1,…,5.
Write the Bellman optimality equations for this MDP with this objec- tive.
Compute the optimal probabilities, p∗i , of reaching state s6 starting from state si, for i = 1, . . . , 5, and also compute an optimal strategy for the player.
2. Consider the atomic network congestion game, with three players, de- scribed by the directed graph in Figure 2.
In this game, every player i (for i = 1, 2, 3) needs to choose a directed path from the source s to the target t. Thus, every player i’s set of possible actions (i.e., its set of pure strategies) is the set of all possible directed paths from s to t.
Each edge e is labeled with a sequence of three numbers (c1,c2,c3). Given a profile π = (π1,π2,π3) of pure strategies (i.e., s-t-paths) for all three players, the cost to player i of each directed edge, e, that is contained in player i’s path πi, is ck, where k is the total number of players that have chosen edge e in their path. The total cost to player i, in the given profile π, is the sum of the costs of all the edges in its path πi from s to t. Each player of course wants to minimize its own total cost.
Compute a pure strategy in this atomic network congestion game.

s6 s4•s3 121•
Figure 1: A 6-state MDP.
7,4,8 1,3,4
s 0,3,6 v2 t
Figure 2: Atomic network congestion game