Howard's policy iteration (policy improvement algorithm)

 

We continue to consider the following maximization problem

    max  Σ_{t=0}^∞ β^t ln c_t

subject to

    k_{t+1} = A k_t^α − c_t,

where we assume that

    0 < β < 1

and

    0 < α < 1,  A > 0,

where

    c_t ≥ 0 and k_{t+1} ≥ 0

for

    t = 0, 1, 2, …, with k_0 > 0 given.
We convert it into the following Bellman equation:

    V(k) = max_{k′} { ln(A k^α − k′) + β V(k′) }.
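The fixed point of the Bellman equation can be found numerically by Howard's algorithm: evaluate the current policy exactly, then improve it against the evaluated value function, and repeat. Below is a minimal sketch on a discretized capital grid, assuming the log-utility, Cobb–Douglas specification; the parameter values (α = 0.3, β = 0.95, A = 1) and the grid are illustrative choices, not from the lecture.

```python
import numpy as np

# Illustrative parameter values for the assumed specification.
alpha, beta, A = 0.3, 0.95, 1.0
grid = np.linspace(0.05, 0.5, 200)          # grid for the capital stock k
n = len(grid)

# Consumption implied by each (k, k') pair; -inf utility if infeasible.
C = A * grid[:, None] ** alpha - grid[None, :]
U = np.where(C > 0, np.log(np.maximum(C, 1e-12)), -np.inf)

policy = np.zeros(n, dtype=int)             # initial guess: always save grid[0]
for it in range(100):
    # 1) Policy evaluation: solve V = u + beta * P V exactly,
    #    where P is the transition matrix induced by the current policy.
    P = np.zeros((n, n))
    P[np.arange(n), policy] = 1.0
    u = U[np.arange(n), policy]
    V = np.linalg.solve(np.eye(n) - beta * P, u)
    # 2) Policy improvement: best k' against the evaluated V.
    new_policy = np.argmax(U + beta * V[None, :], axis=1)
    if np.array_equal(new_policy, policy):
        break
    policy = new_policy

# For this model the known closed-form optimal policy is k' = alpha*beta*A*k^alpha.
k_prime = grid[policy]
```

Because the evaluation step solves the linear system (I − βP)V = u exactly, each iteration costs more than one value-iteration sweep, but the number of iterations needed is typically very small.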
We firstly guess the policy function as

    k_{t+1} = g(k_t) = θ A k_t^α

for some θ ∈ (0, 1).
We then calculate the value of k at time t in terms of the value of k at time 0. Substituting the guessed policy repeatedly,

    k_1 = θA k_0^α
    k_2 = θA k_1^α = (θA)^{1+α} k_0^{α²}
      ⋮
    k_t = (θA)^{a_t} k_0^{b_t}   for some sequences a_t and b_t,

where

    a_t = 1 + α a_{t−1}

and

    b_t = α b_{t−1},

starting from a_1 = 1 and b_1 = α when t = 1. Solving these recursions gives a_t = (1 − α^t)/(1 − α) and b_t = α^t.
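The closed form for k_t can be checked against direct simulation of the guessed policy. A small sketch, assuming the specification k_{t+1} = θA k_t^α with a_t = (1 − α^t)/(1 − α) and b_t = α^t; the values of θ, α, A and k_0 are arbitrary illustrative picks.

```python
# Guessed policy k_{t+1} = theta*A*k_t**alpha; illustrative parameter values.
alpha, A, theta, k0 = 0.3, 1.0, 0.5, 0.2

k = k0
for t in range(1, 11):
    k = theta * A * k ** alpha                # iterate the policy directly
    a_t = (1 - alpha ** t) / (1 - alpha)      # solves a_t = 1 + alpha*a_{t-1}, a_1 = 1
    b_t = alpha ** t                          # solves b_t = alpha*b_{t-1},     b_1 = alpha
    closed_form = (theta * A) ** a_t * k0 ** b_t
    assert abs(k - closed_form) < 1e-9        # simulation matches the closed form
```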

 

The rest is almost the same as what we did yesterday.

 

That is, the value of following this guessed policy forever is

    V(k_0) = Σ_{t=0}^∞ β^t ln c_t,   with c_t = (1 − θ) A k_t^α.

Thus

    V(k_0) = E + F ln k_0,

where

    ln c_t = ln((1 − θ)A) + α a_t ln(θA) + α^{t+1} ln k_0,

where

    F = Σ_{t=0}^∞ β^t α^{t+1} = α / (1 − αβ)

and

    E = Σ_{t=0}^∞ β^t [ ln((1 − θ)A) + α a_t ln(θA) ],

starting from a_1 = 1 when t = 1, for some a_t satisfying a_t = 1 + α a_{t−1}
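The decomposition V(k_0) = E + F ln k_0 can be verified numerically: sum discounted log-consumption along the simulated path of the guessed policy and compare it with the two series. A sketch under the same assumed specification; the parameter values are again illustrative.

```python
import math

# Illustrative parameter values for the assumed specification.
alpha, beta, A, theta = 0.3, 0.95, 1.0, 0.5

def policy_value(k0, T=2000):
    """Sum beta^t * ln(c_t) along the path induced by the guessed policy."""
    v, k = 0.0, k0
    for t in range(T):
        c = (1 - theta) * A * k ** alpha      # consume what is not saved
        v += beta ** t * math.log(c)
        k = theta * A * k ** alpha            # next period's capital
    return v

F = alpha / (1 - alpha * beta)                # coefficient on ln(k_0)
# E as the truncated series sum_t beta^t [ln((1-theta)A) + alpha*a_t*ln(theta*A)]
E, a_t = 0.0, 0.0                             # a_0 = 0
for t in range(2000):
    E += beta ** t * (math.log((1 - theta) * A) + alpha * a_t * math.log(theta * A))
    a_t = 1 + alpha * a_t                     # a_{t+1} = 1 + alpha*a_t

for k0 in (0.1, 0.2, 0.4):
    assert abs(policy_value(k0) - (E + F * math.log(k0))) < 1e-6
```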

Here, the constant E can be calculated in the following way, through its partial sums

    E_T = Σ_{t=0}^T β^t [ ln((1 − θ)A) + α a_t ln(θA) ]:

    E_0 = ln((1 − θ)A)
    E_1 = E_0 + β [ ln((1 − θ)A) + α ln(θA) ]
      ⋮
    E_T = E_{T−1} + β^T [ ln((1 − θ)A) + α a_T ln(θA) ].

As T → ∞, using Σ_{t=0}^∞ β^t = 1/(1 − β) and Σ_{t=0}^∞ β^t a_t = β/((1 − β)(1 − αβ)), it will become

    E = ln((1 − θ)A)/(1 − β) + αβ ln(θA)/((1 − β)(1 − αβ)),

that's the constant we intended to obtain.
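With E and F in hand, the improvement step follows: maximizing ln(Ak^α − k′) + β(E + F ln k′) over k′, the first-order condition gives k′ = (βF/(1 + βF)) A k^α, i.e. an improved saving rate θ′ = βF/(1 + βF). Since F = α/(1 − αβ) does not depend on the guessed θ, a single improvement step already yields θ′ = αβ, the known optimal saving rate for this specification. A quick check under the same illustrative parameters:

```python
# Illustrative parameters; F = alpha/(1 - alpha*beta) from the evaluation step.
alpha, beta = 0.3, 0.95

F = alpha / (1 - alpha * beta)
theta_improved = beta * F / (1 + beta * F)    # saving rate from the FOC

# One improvement step lands on the optimal saving rate alpha*beta.
assert abs(theta_improved - alpha * beta) < 1e-12
```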