Value function iteration

 

We continue to consider the same Bellman equation as before

 

 

where

 

 

for

 

and calculate the coefficients of the value function at each step, then take the limit.

 

We start from E[V0|theta]=0 and k’=0.

 

When j=0

 

k’=0

 

as in

 

When j=1

 

 

When j=2

 

 

When j=3

 

 

and so on for j = 4, 5, ...

 

The difficult part is the last part of c1 and c3.

 

First, the last part of c1 can be expressed in terms of c3 as

 

 

where the previous value of c3 times.

 

So all we need to determine is c3.

 

If we ignore the higher-order terms, on the grounds that all the parameters and their absolute values

 

lie between 0 and 1, we can regard

 

 

ignoring . In the same way, we have

 

 

because we keep terms up to second order. Similarly, we can regard

 

 

because we keep terms up to third order.

 

Therefore

 

 

Here, we can regard  in the limit so we rewrite it as

 

For

 

in the limit as we increase the number of steps

 

 

which are exactly the same as the results obtained in 0511 by the guess-and-verify method.
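The idea of iterating on value-function coefficients and comparing the limit with a guess-and-verify solution can be sketched numerically. The example below does not use the Bellman equation of this post. It assumes the textbook deterministic growth model with log utility, Cobb-Douglas production and full depreciation, V(k) = max ln(k^alpha - k') + beta V(k'), whose value function has the form a + b ln(k). The names alpha, beta, a, b and iterate_coefficients are hypothetical and chosen only for illustration. As in the text, the iteration starts from V0 = 0, which gives k' = 0 at the first step.

```python
import math


def iterate_coefficients(alpha, beta, n_steps):
    """Iterate V_j(k) = a_j + b_j * ln(k) for the ASSUMED model
    V(k) = max_{k'} ln(k**alpha - k') + beta * V(k'), starting from V_0 = 0."""
    a, b = 0.0, 0.0
    for _ in range(n_steps):
        # First-order condition gives k' = s * k**alpha with savings rate s.
        s = beta * b / (1.0 + beta * b)
        # beta * b * ln(s) tends to 0 as b -> 0, so drop it when s == 0
        # (this is the j = 0 step, where k' = 0).
        cont = beta * b * math.log(s) if s > 0.0 else 0.0
        a_next = beta * a + math.log(1.0 - s) + cont  # constant term
        b_next = alpha * (1.0 + beta * b)             # coefficient on ln(k)
        a, b = a_next, b_next
    return a, b


# Illustrative parameter values (assumptions, not taken from the post).
alpha, beta = 0.3, 0.95
a, b = iterate_coefficients(alpha, beta, 2000)

# Guess-and-verify solution of the same assumed model, for comparison.
b_star = alpha / (1.0 - alpha * beta)
a_star = (math.log(1.0 - alpha * beta)
          + alpha * beta / (1.0 - alpha * beta) * math.log(alpha * beta)) / (1.0 - beta)

print(a, a_star)
print(b, b_star)
```

In this assumed model the slope coefficient converges at rate alpha * beta and the constant at rate beta, so the number of steps needed grows as beta approaches 1.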