- EdisonCruise
**Posts:**122**Joined:**

I though Bellman Equation was solved by backwards because the terminal condition is easier to define than the initial condition. Until recently, I read the book Markov decision processes with application to finance by Nicole Bäuerle and Ulrich Rieder. It says ‘Due to the stationarity of the data however, it is not necessary to formulate the algorithm as a backward algorithm.’ Stationary means the data (especially, the transition probabilities) does not depend on time. If stationery, Bellman equation can be solved by forwards. I can’t understand its logics, can anyone give me an intuitive explanation? Thank you.

I will take a guess that it is a "trivial" rewrite. That is, if you a have a stationary PDE problem with a terminal condition at T, you can always trivially rewrite it with [$]\tau = T - t[$] and the terminal condition becomes an initial condition. Ditto for the Bellman problem.

- katastrofa
**Posts:**10256**Joined:****Location:**Alpha Centauri

I probably didn't understand the question, because my answer would be: if the dynamics of the system does not depend on time, i.e. is time invariant, describing the solution as backward/forward induction doesn't make sense. (BTW, backward induction would tell me not to post this )

Last edited by katastrofa on July 21st, 2019, 7:57 pm, edited 1 time in total.

There are various forms of the Bellman equation. I think of the whole Bellman thing as just being a principle that tells you how to solve some optimization.

In the stochastic world we are used to Bellman problems leading to diffusion equations. Diffusion equations can only be solved in one direction. Depending on the signs of the time and second spatial derivatives in the equation there is only one right way to solve, either forwards or backwards. Only one direction makes any sense. You can't arbitrarily change direction. So that amounts to a final or an initial condition. It's not a matter of something being "easier." (But as Alan says you can change the direction of time, but that doesn't change the real nature of the initial/final condition.) The problem will determine what makes sense as well.

Just because parameters are independent of time doesn't mean the solution is independent of time. So you can't necessarily change direction.

Perhaps you have a problem with time-independent parameters which also (for some reason) has a time-independent solution. In options the perpetual put springs to mind. But then the diffusion pde collapses to an ode, and there's no time direction at all.

But you also can have dual problems. E.g. in derivatives you have the BS eqn and then the Dupire eqn, the latter has different independent variables and is solved "the other way."

This is all probably irrelevant. Example, please.

In the stochastic world we are used to Bellman problems leading to diffusion equations. Diffusion equations can only be solved in one direction. Depending on the signs of the time and second spatial derivatives in the equation there is only one right way to solve, either forwards or backwards. Only one direction makes any sense. You can't arbitrarily change direction. So that amounts to a final or an initial condition. It's not a matter of something being "easier." (But as Alan says you can change the direction of time, but that doesn't change the real nature of the initial/final condition.) The problem will determine what makes sense as well.

Just because parameters are independent of time doesn't mean the solution is independent of time. So you can't necessarily change direction.

Perhaps you have a problem with time-independent parameters which also (for some reason) has a time-independent solution. In options the perpetual put springs to mind. But then the diffusion pde collapses to an ode, and there's no time direction at all.

But you also can have dual problems. E.g. in derivatives you have the BS eqn and then the Dupire eqn, the latter has different independent variables and is solved "the other way."

This is all probably irrelevant. Example, please.

- EdisonCruise
**Posts:**122**Joined:**

I cannot make a specific example, because I read that in a book. The followingsa are the images I took from Nicole Bäuerle and Ulrich Rieder' book. Maybe I can rephase the question, if the initial condition is well defined and the data is NON-stationary, can the Bellman equation be solved by forwards? Why?

page1

page2

page3

page1 https://ibb.co/1fhhK2m

page2 https://ibb.co/3WW69NW

page3 https://ibb.co/Zfxt1Ty

page1

page2

page3

page1 https://ibb.co/1fhhK2m

page2 https://ibb.co/3WW69NW

page3 https://ibb.co/Zfxt1Ty

- katastrofa
**Posts:**10256**Joined:****Location:**Alpha Centauri

Theoretically they can be, but in practice you often experience the dimensionality problem (I'm not sure if it applies to finance applications you have in mind).

- EdisonCruise
**Posts:**122**Joined:**

Thank you katstrofa. I aslo think the Bellman equation can be solve by forwards with well defined and NON-stationary data. But that's in contrast to Nicole Bäuerle and Ulrich Rieder' book. I cannot understand why the stationarity of data is related to forward/backward solution.

- katastrofa
**Posts:**10256**Joined:****Location:**Alpha Centauri

Imho, it's simply because in general when the dynamics of a system is time invariant, you can't define backward and forward. But let's see what the experts will tell you.

(I'm a bit confused by the problem because the stationarity is usually achieved in the infinite time horizon, and you seem to be discussing finite times...)

(I'm a bit confused by the problem because the stationarity is usually achieved in the infinite time horizon, and you seem to be discussing finite times...)

I cannot make a specific example, because I read that in a book.

page1 https://ibb.co/1fhhK2m

page2 https://ibb.co/3WW69NW

page3 https://ibb.co/Zfxt1Ty

I looked briefly. I think I was right: with stationary transition densities, it's a trivial relabeling: [$]n \rightarrow n' = N - n[$].

- EdisonCruise
**Posts:**122**Joined:**

Thank you all, but I what I cannot understand is the real reason that Bellman equation is ususally solved by backwards. Can any one give an exmaple in which both intitial and terminal conditions are well defined, but the Bellman equation can only be solved by backwards?

It's usually solved backwards because of the "principle of optimality", which you can google if you don't know it.

The standard finance example is determining an optimal exercise strategy for an American-style put. It's day 1 and the put expires on day N. Whatever the "strategy" on day 1, it should be optimal in light of what you are going to do on day 2, day 3, etc. How to account for all that, which at first glance seems hopelessly complicated?

Well, you know what you are going to do on expiration = day N. You'll exercise if the option is in-the-money. So, it's easiest to start by calculating what you will do on day N-1. What will you do?. Conditional on the state (say just the day N-1 stock price) you first calculate the (discounted) expected value of the option on expiration. Then, if exercising will get you more than that expectation, you exercise; otherwise you wait till expiration and take your chances on the random draw of the terminal stock price. Doing that calculation will tell you the optimal exercise policy for day N-1 (for every possible state).

Then, you step back another day and repeat. The net result at the end (having finally calculated down to day 1) is that you have indeed satisfied the "principle of optimality".

Only the terminal condition is known before solving the problem. But, you can certainly relabel the day counting as I said in previous post, so that the terminal condition becomes an initial condition. The important thing is to appreciate why the problem proceeds in the normal way, backwards from expiration. The relabeling of the day count is just for possible convenience -- say of a computer program.

The standard finance example is determining an optimal exercise strategy for an American-style put. It's day 1 and the put expires on day N. Whatever the "strategy" on day 1, it should be optimal in light of what you are going to do on day 2, day 3, etc. How to account for all that, which at first glance seems hopelessly complicated?

Well, you know what you are going to do on expiration = day N. You'll exercise if the option is in-the-money. So, it's easiest to start by calculating what you will do on day N-1. What will you do?. Conditional on the state (say just the day N-1 stock price) you first calculate the (discounted) expected value of the option on expiration. Then, if exercising will get you more than that expectation, you exercise; otherwise you wait till expiration and take your chances on the random draw of the terminal stock price. Doing that calculation will tell you the optimal exercise policy for day N-1 (for every possible state).

Then, you step back another day and repeat. The net result at the end (having finally calculated down to day 1) is that you have indeed satisfied the "principle of optimality".

Only the terminal condition is known before solving the problem. But, you can certainly relabel the day counting as I said in previous post, so that the terminal condition becomes an initial condition. The important thing is to appreciate why the problem proceeds in the normal way, backwards from expiration. The relabeling of the day count is just for possible convenience -- say of a computer program.

- EdisonCruise
**Posts:**122**Joined:**

Thank you Alan. I can understand the option pricing problem. The BS pde must be solved by backwards, only because the terminal condition, i.e., the option pay off, is well defined. The initial option price is unknown(or cannot be difined), so need to be solved. This is not because of the data is non-stationary, isn't it? Actually, I think the data is stationary in option pricing problem.

Maybe I can propose a specific example: the market making problem. In this problem, I think the initial wealth or utility of market maker can be defined and the data should be stationary. However, as I read in most papers for this problem, Bellman equation or HJB PDE is only solved by backwards. An example is MARCO AVELLANEDA and SASHA STOIKOV's paper High-frequency trading in a limit order book.

So, in market making problem, can the Bellman quation be solved by forwards?

Maybe I can propose a specific example: the market making problem. In this problem, I think the initial wealth or utility of market maker can be defined and the data should be stationary. However, as I read in most papers for this problem, Bellman equation or HJB PDE is only solved by backwards. An example is MARCO AVELLANEDA and SASHA STOIKOV's paper High-frequency trading in a limit order book.

So, in market making problem, can the Bellman quation be solved by forwards?

- Cuchulainn
**Posts:**65000**Joined:****Location:**Drosophila melanogaster-
**Contact:**

The BS pde is not a good example as it is just a matter of how you define time. Nothing shocking.

Is the essential issue not open-loop versus closed-loop control problems? It is a nonlinear problem?

Is the essential issue not open-loop versus closed-loop control problems? It is a nonlinear problem?

"Compatibility means deliberately repeating other people's mistakes."

David Wheeler

http://www.datasimfinancial.com

http://www.datasim.nl

David Wheeler

http://www.datasimfinancial.com

http://www.datasim.nl

Any Bellman equation can be solved forward in time at least approximately, using a function approximation for the value function. But if you can use the backward induction, you should, because it will be more efficient.

- katastrofa
**Posts:**10256**Joined:****Location:**Alpha Centauri

You can solve it backwards only if you know the model (transition probabilities & reward function). As ISayMoo wrote, you can always solve it forward. If you do not know the model, you simply sample the space of policies until you converge to the optimal one. It's lessThank you Alan. I can understand the option pricing problem. The BS pde must be solved by backwards, only because the terminal condition, i.e., the option pay off, is well defined. The initial option price is unknown(or cannot be difined), so need to be solved. This is not because of the data is non-stationary, isn't it? Actually, I think the data is stationary in option pricing problem.

Maybe I can propose a specific example: the market making problem. In this problem, I think the initial wealth or utility of market maker can be defined and the data should be stationary. However, as I read in most papers for this problem, Bellman equation or HJB PDE is only solved by backwards. An example is MARCO AVELLANEDA and SASHA STOIKOV's paper High-frequency trading in a limit order book.

So, in market making problem, can the Bellman quation be solved by forwards?