AGTA Tutorial 7
Please attempt the question before your tutorial.
1. Consider the 6-state MDP depicted in Figure 1.
In this MDP, states s5 and s6 are controlled by the player (the con- troller), whereas the other states are controlled by “nature” (random).
Copyright By cscodehelp代写 加微信 cscodehelp
Suppose the player’s objective is to optimize the probability of reaching state s6 starting at each state si, i = 1,…,5.
Write the Bellman optimality equations for this MDP with this objec- tive.
Compute the optimal probabilities, p∗i , of reaching state s6 starting from state si, for i = 1, . . . , 5, and also compute an optimal strategy for the player.
2. Consider the atomic network congestion game, with three players, de- scribed by the directed graph in Figure 2.
In this game, every player i (for i = 1, 2, 3) needs to choose a directed path from the source s to the target t. Thus, every player i’s set of possible actions (i.e., its set of pure strategies) is the set of all possible directed paths from s to t.
Each edge e is labeled with a sequence of three numbers (c1,c2,c3). Given a profile π = (π1,π2,π3) of pure strategies (i.e., s-t-paths) for all three players, the cost to player i of each directed edge, e, that is contained in player i’s path πi, is ck, where k is the total number of players that have chosen edge e in their path. The total cost to player i, in the given profile π, is the sum of the costs of all the edges in its path πi from s to t. Each player of course wants to minimize its own total cost.
Compute a pure strategy in this atomic network congestion game.
s6 s4•s3 121•
Figure 1: A 6-state MDP.
s 0,3,6 v2 t
Figure 2: Atomic network congestion game
程序代写 CS代考 加微信: cscodehelp QQ: 2235208643 Email: firstname.lastname@example.org