Optimal Gambling

TraderJoe · March 4th, 2008, 2:26 pm

A gambler has i pounds and wants to increase this to N. At each stage she can bet any fraction of her capital, say j ≤ i. Either she wins, with probability p, and now has i+jpounds, or she loses, with probability q = 1 − p, and has i − j pounds. Let the state space be {0, 1, . . . ,N}. The game stops upon reaching state 0 or N. The only non-zeroreward is 1, upon reaching state N. Suppose p ≥ 1/2. Prove that the timid strategy, of always betting only 1 pound, maximizes the probability of the gambler attaining N pounds.

MikeCrowe · March 4th, 2008, 3:55 pm

Not a proof, but some thoughts1) It is never optimum to bet larger than N-i, since if she wins the bet she has the same payoff as betting exactly N-i, while if she loses she is further from N.2) If she uses a constant bet size then the balance follows a simple binary tree random walk, stepping up or down by the bet size. She will stop if she runs out of money, or hits N.If p is strictly greater than 1/2 then the expected balance at a future time will drift upwards, hence in order to maximise the probability of reaching N, she should maximise the number of goes she makes, she does this by making the smallest bet possible.But, I don't know why this still holds for exactly 1/2? This would appear to be indifferent to the step size, and she should be able to do just as well by betting min(N-i,i).

TraderJoe · March 4th, 2008, 10:31 pm

That's cool MikeCrowe. We have to write down the optimality equation:F(i) = max j,j≤i {pF(i + j) + qF(i − j)} .To show that the timid strategy is optimal we need to find its value function, say G(i), and show that it is a solution to the optimality equation. We have G(i) = pG(i + 1) + qG(i − 1), with G(0) = 0, G(N) = 1. etc.

MikeCrowe · March 5th, 2008, 10:06 am

Cool.So... that shows that the timid strategy is optimal for all p >= 1/2.The question then is, is it uniquely optimal for p=1/2. I would still expect that min(N-i,i,a) for any integer a > 0 would also be optimal.Logically, it would also make sense that if p<1/2 you should bet min(N-i,i). I am not totally familiar with your proof, but would it be possible to repeat for the strategy min(N-i,i) for F(i) = max j,j≤i {pF(i - j) + qF(i + j)} to show that min(N-i,i) is the optimal for all p <= 1/2.Hence both would "overlap" at p = 1/2This reminds me of a study that was done into the "weakest link" gameshow. I don't have a reference for the paper, but they investigated the optimal strategies for playing the game and found that there were two optimal strategies: Bank after every question if your team is thick, or never bank if your team is smart. Again this is presumably about the idea of whether you want to maximise or minimise the length of your random walk.

iwanttobelieve · March 5th, 2008, 11:23 pm

This problem belongs to the class of controled markov chain, but as pointed out there is a smart way to solve it.First, the intuition: Since p > 1/2, we are trending up. So we want to minimize the variance of the output by minimizing the size of the bet, thus betting the minimal bet 1.Second, some formalism.Let: {1, ... N} the state space and {A} be the event: the player wins. Introduce further X_0 the wealth of the agent and define:z_i := E( A | X ) -- since the betting are independent, X does not depend on time, neither does z_i.likewise, the set of admissible controls alpha(i) do not depend on time.Let alpha* be an optimal control (exists since we are optimizing on a finite set), and suppose that there exists i such that alpha*(i) > 1.We have:z_i = p z_{i+alpha*(i)} + (1-p) z_{i-alpha*(i)}.Take j, such that 1 <= j < alpha(i)By optimality,p z_{i+alpha*(i)} + (1-p) z_{i-alpha*(i)} >= pz_{i+j) + (1-p) z_{i-j} (1)Since p > 1/2 and z_i is obviously increasing,z_{i+j} >= p z_{i+alpha*(i)} + (1-p) z_i > 1/2 ( z_{i+alpha*(i)} + z_i) (2)Likewise,z_{i-j} > 1/2 ( z_{i - alpha*(i)} + z_i ) (2 bis)Using (2) and (2bis) in the right hand side of (1), one gets:z_i > p ( 1/2 ( z_{i+alpha*(i)} + z_i)) + (1-p) (1/2 ( z_{i-alpha*(i)} + z_i))i.e. z_i > 1/2 zi + 1/2 ( p z_{i+alpha*(i)} + (1-p) z_{i-alpha*(i)})that is to say:z_i > z_iThis is a contradiction, which in turn yield the result.