Dynamic programming and Bellman Equation

 

We consider the following maximization problem

         

where

 on probability of p11 from state 1 to state 1 and p12 from state 2 to state 1

 on probability of p21 from state 1 to state 2 and p22 from state 2 to state 2

 

where we suppose that they are given. Here, p12=1-p11 and p22=1-p21.

 

The corresponding Bellman equations are

 

and

We guess

 

 and

 

That is

and

 

FOC’s w.r.t. are

 

        (*)

and

        (**)

 

On the other hand, FOC’s w.r.t. are

For the first one

       (***)

For the second one, in the same way,

       (****)

 

Plugging (*) and (**) into Bellman equation with and at optimum, we get

 

 

They are

 

Comparing coefficients in each kind, we get

 

They are

So

 

They are going to be

 

 

They are exactly the same as what we got yesterday. From F and H, we can calculate

 

 

 

 

for and  respectively. They are constant and not changed. And then we get

 

 

 

Actually, they are

and

 

where they are not changed at all even though we have different probabilities.

 

The rest is

after removing  from each term on the right hand side.

Then

 

Plugging the latter into the former, we get

 

We solve out and we can say that the guess is right.