March 5th, 2008, 11:23 pm
This problem belongs to the class of controled markov chain, but as pointed out there is a smart way to solve it.First, the intuition: Since p > 1/2, we are trending up. So we want to minimize the variance of the output by minimizing the size of the bet, thus betting the minimal bet 1.Second, some formalism.Let: {1, ... N} the state space and {A} be the event: the player wins. Introduce further X_0 the wealth of the agent and define:z_i := E( A | X ) -- since the betting are independent, X does not depend on time, neither does z_i.likewise, the set of admissible controls alpha(i) do not depend on time.Let alpha* be an optimal control (exists since we are optimizing on a finite set), and suppose that there exists i such that alpha*(i) > 1.We have:z_i = p z_{i+alpha*(i)} + (1-p) z_{i-alpha*(i)}.Take j, such that 1 <= j < alpha(i)By optimality,p z_{i+alpha*(i)} + (1-p) z_{i-alpha*(i)} >= pz_{i+j) + (1-p) z_{i-j} (1)Since p > 1/2 and z_i is obviously increasing,z_{i+j} >= p z_{i+alpha*(i)} + (1-p) z_i > 1/2 ( z_{i+alpha*(i)} + z_i) (2)Likewise,z_{i-j} > 1/2 ( z_{i - alpha*(i)} + z_i ) (2 bis)Using (2) and (2bis) in the right hand side of (1), one gets:z_i > p ( 1/2 ( z_{i+alpha*(i)} + z_i)) + (1-p) (1/2 ( z_{i-alpha*(i)} + z_i))i.e. z_i > 1/2 zi + 1/2 ( p z_{i+alpha*(i)} + (1-p) z_{i-alpha*(i)})that is to say:z_i > z_iThis is a contradiction, which in turn yield the result.