# What to do?

What to do?

(40 Acres and a Mule Filmworks/Universal Pictures)

© -Trenn, King’s College London


Sequential decision making?

(mystorybook.com/books/42485)

Ultimately, we are interested in sequential decision making: one decision leads to another.

Each decision depends on the ones before, and affects the ones after.


How to decide what to do

Start simple. Single decision.

Consider being offered a bet in which you pay £2 if an odd number is rolled on a die, and win £3 if an even number appears.

Is this a good bet?


To analyse this, we need the expected value of the bet.


We do this in terms of a random variable, which we will call $X$. $X$ can take two values:

$$X = \begin{cases} 3 & \text{if the die rolls even} \\ -2 & \text{if the die rolls odd} \end{cases}$$

And we can also calculate the probability of these two values:

$$P(X = 3) = 0.5 \qquad P(X = -2) = 0.5$$


The expected value is then the weighted sum of the values, where the weights are the probabilities.

Formally the expected value of $X$ is defined by:

$$E[X] = \sum_{k} k \cdot P(X = k)$$

where the summation is over all values of $k$ for which $P(X = k) \neq 0$.

Here the expected value is:

$$E[X] = 3 \cdot 0.5 + (-2) \cdot 0.5 = 0.5$$

Thus the expected value of $X$, $E[X]$, is £0.5, and we take this to be the value of the bet.
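The definition above can be sketched in a few lines of Python. This is only an illustration: the helper name `expected_value` and the dictionary layout (value mapped to probability) are assumptions made for this example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def expected_value(distribution):
    """Return the sum of value * probability over a {value: probability} mapping."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * prob for value, prob in distribution.items())


# Bet from the text: win £3 on an even roll, pay £2 on an odd roll.
die_bet = {3: Fraction(1, 2), -2: Fraction(1, 2)}
bet_value = expected_value(die_bet)
print(bet_value)  # 1/2
```

Only values with non-zero probability need to appear in the mapping, which matches the condition on $k$ in the definition.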


Do you take the bet?

Compare that £0.5 with not taking the bet. Not taking the bet has (expected) value £0, so in expectation taking the bet is the better choice.


£0.5 is not the value you will get.

You can think of it as the long run average if you were offered the bet many times.

Even after a large number of rounds, your average winnings will not exactly equal that value (there will be some noise).
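A small simulation can make the long-run-average reading concrete. The sketch below is an illustration only: the function names and the fixed seed are assumptions, and the printed averages depend on the random number generator. The averages should drift towards 0.5 as the number of rounds grows, without landing on it exactly.

```python
import random


def play_once(rng):
    """One round of the bet: win 3 on an even roll, pay 2 on an odd roll."""
    roll = rng.randint(1, 6)
    return 3 if roll % 2 == 0 else -2


def average_winnings(rounds, seed=0):
    """Average payoff per round over `rounds` simulated bets."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    return sum(play_once(rng) for _ in range(rounds)) / rounds


for rounds in (10, 1_000, 100_000):
    print(rounds, average_winnings(rounds))
```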


Sometimes the unlikely event can occur … it doesn’t mean the prediction was bad

(fivethirtyeight.com)


Example

Pacman is at a T-junction

Based on their knowledge, Pacman estimates that if they go Left:

• Probability of 0.3 of getting a payoff of 10

• Probability of 0.2 of getting a payoff of 1

• With the remaining probability a payoff of -5

What is the expected value of Left?


$$E[X] = 0.3 \cdot 10 + 0.2 \cdot 1 + (1 - 0.3 - 0.2) \cdot (-5) = 3 + 0.2 - 2.5 = 0.7$$


Another bet: you get £1 if a 2 or a 3 is rolled, £5 if a six is rolled, and pay £3 otherwise.

What’s the expected value?


$$E[X] = \frac{2}{6} \cdot 1 + \frac{1}{6} \cdot 5 + \frac{3}{6} \cdot (-3) = -\frac{1}{3}$$


What happens if you repeat this bet 10 times: you get £1 if a 2 or a 3 is rolled, £5 if a six is rolled, and pay £3 otherwise?

What’s the expected value now? (i.e., after all 10 games)


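Let $X_i$, $i \in \{1, 2, \dots, 10\}$, be the winnings in the $i$-th game, and let $X = \sum_{i=1}^{10} X_i$.

The expected value here is:

$$E[X_i] = \frac{2}{6} \cdot 1 + \frac{1}{6} \cdot 5 + \frac{3}{6} \cdot (-3) = -\frac{1}{3}$$

Thus, by linearity of expectation (i.e., $E[\alpha Y + Z] = \alpha E[Y] + E[Z]$ for all $Y$, $Z$ and $\alpha$),

$$E[X] = E\left[\sum_{i=1}^{10} X_i\right] = \sum_{i=1}^{10} E[X_i] = 10 \cdot E[X_i] = -10 \cdot \frac{1}{3} = -\frac{10}{3}$$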
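As a cross-check on the linearity argument, the exact distribution of the ten-game total can be built and its expectation compared with ten times the single-game value. This is a sketch under an extra assumption: the rolls are treated as independent so that the distribution of the sum can be constructed (linearity of expectation itself does not need independence). The helper names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Single game from the text: +1 on a 2 or 3, +5 on a 6, -3 otherwise.
single_game = {1: Fraction(2, 6), 5: Fraction(1, 6), -3: Fraction(3, 6)}


def expected_value(distribution):
    """Sum of value * probability over a {value: probability} mapping."""
    return sum(value * prob for value, prob in distribution.items())


def add_independent(dist_a, dist_b):
    """Distribution of the sum of two independent random variables."""
    total = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            total[a + b] += pa * pb
    return dict(total)


# Build the exact distribution of X = X_1 + ... + X_10.
ten_games = {0: Fraction(1)}
for _ in range(10):
    ten_games = add_independent(ten_games, single_game)

per_game = expected_value(single_game)
direct = expected_value(ten_games)
print(per_game, direct, 10 * per_game)
```

Both routes give the same number; linearity simply avoids having to build the distribution of the sum at all.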

How an agent might decide what to do

Consider an agent with a set of possible actions $A$. Each $a \in A$ has a set of possible outcomes $s_a$. Which action should the agent pick?


The action $a^*$ which a rational agent should choose is that which maximises the agent’s utility.

In other words the agent should pick:

$$a^* = \arg\max_{a \in A} u(s_a),$$

• where $s_a$ is the state obtained by choosing action $a$ and

• $u(s_a)$ is the utility of that state

The problem is that in any realistic situation, the resulting state is probabilistic.

Instead we have to calculate the expected utility of each action and make the choice on the basis of that.


In other words, for each action $a$ with a set of outcomes $s_a$, the agent should calculate:

$$E[u(a)] = \sum_{s' \in s_a} u(s') \cdot \Pr(s_a = s')$$

and pick the best. Here: decide between $E[u(a_1)]$ and $E[u(a_2)]$.

[Figure: state $s$ with actions $a_1$ and $a_2$ branching to possible outcome states $s_1, \dots, s_6$.]

That is, it picks the action that has the greatest expected utility.

• The right thing to do.

(40 Acres and a Mule Filmworks/Universal Pictures)

Here “rational” means “rational in the sense of maximising expected utility”.


Example

Pacman is at a T-junction

Based on their knowledge, Pacman estimates that if they go Left:

• Probability of 0.3 of getting a payoff of 10

• Probability of 0.2 of getting a payoff of 1

• Probability of 0.5 of getting a payoff of -5

If they go Right:

• Probability of 0.5 of getting a payoff of -5

• Probability of 0.4 of getting a payoff of 3

• Probability of 0.1 of getting a payoff of 15

Should they choose Left or Right (MEU)?
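The maximum-expected-utility (MEU) rule can be sketched directly from the numbers above. The function names and the data layout (a list of probability/utility pairs per action) are assumptions made for this illustration. With these numbers, Left has expected utility 0.7 and Right has 0.2, so the rule picks Left.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)


def meu_action(actions):
    """Return the name of the action with the greatest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


# Pacman's estimates at the T-junction, taken from the example.
pacman_actions = {
    "Left": [(0.3, 10), (0.2, 1), (0.5, -5)],
    "Right": [(0.5, -5), (0.4, 3), (0.1, 15)],
}

for name, outcomes in pacman_actions.items():
    # Rounded only for display; floats carry tiny representation errors.
    print(name, round(expected_utility(outcomes), 2))

best = meu_action(pacman_actions)
print("MEU choice:", best)
```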


Stochastic

Note that we are dealing with stochastic actions here.

[Figure: state $s$ with actions $a_1$ and $a_2$ branching to possible outcome states $s_1, \dots, s_6$.]

A given action has several possible outcomes.

We don’t know, in advance, which one will happen.


(fivethirtyeight.com)

A lot like life.


Limitations of our notion of “rational”

Consider the following game. Let’s say your monthly income is $m$.

• With probability $2/3$, I double your income every month.

• With the remaining probability, you have to give me your monthly income every month.

Expected value if playing: $E[\text{Income}] = 2m \cdot \frac{2}{3} + 0 \cdot \frac{1}{3} = \frac{4}{3} m$.

Expected value if not playing: $E[\text{Income}] = m$.

Would you play?
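To see what the expected value leaves out, the sketch below lists the outcomes of playing next to their expectation. The income figure is hypothetical and chosen only for illustration, as is the function name. The expectation is higher than $m$, yet one third of the probability mass sits on an income of zero.

```python
from fractions import Fraction


def income_game(m):
    """Return the {income: probability} outcomes of playing, for monthly income m."""
    return {2 * m: Fraction(2, 3), 0: Fraction(1, 3)}


m = 1200  # hypothetical monthly income, not from the source
outcomes = income_game(m)

expected_income = sum(value * prob for value, prob in outcomes.items())
chance_of_nothing = outcomes[0]

print("expected income if playing:", expected_income)
print("income if not playing:", m)
print("probability of ending with nothing:", chance_of_nothing)
```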
