I have struggled for many years with the question of when Ito subtleties complicate things and when they don't -- and I still struggle. I find the textbooks very difficult, either because they use very esoteric math (what we used to call "modern" math when I was a student) that has a steep learning curve and that I don't have the time for, or because the simpler books tend to gloss over the technical details, since they are presenting results they know to be correct. I would really like to see a thread that explains all the critical issues that commonly arise. Just as a suggestion.

What I have found is that the issue lies with \sqrt{\delta t} making the infinitesimal limit problematic. A Taylor series, however, which always has terms (\delta t)^n, does not face this problem. Likewise, as long as the distribution is centrally symmetric, moments will also always be integer powers of \delta t. In general, only quantities that do not involve fractional powers of \delta t can be easily converted to differential equations in the infinitesimal limit without worrying about Ito. This seems like a simple idea to me, but I have never seen it made explicit in any textbook. This may not be a sufficient way of expressing the issue, but it seems to me it goes a long way towards it.

However, it still does not explain the problem with the time-dependent Fokker-Planck equation, which Paul, for instance, derives without Ito subtleties, using only conventional calculus and without requiring Gaussian increments, yet the Ito specialists cite the Levy characterization to claim it is valid only for Gaussian/Wiener increments. Can someone explain this apparent contradiction in simple terms?

Last edited by Fermion on January 10th, 2012, 11:00 pm, edited 1 time in total.

OK, here is my explanation -- it is historical.

You can get quite far without Ito's stochastic calculus. The Kolmogorov backwards equation (KBE) was developed by Kolmogorov in the early 1930's. I haven't read the original work (in German), but he was just starting from what we now call the Chapman-Kolmogorov equation. The latter is an obvious 'conservation of probability' relation. Starting from that, you get the KBE by only requiring that (in the absence of jumps):

E[X(t+DeltaT) - X(t)|X(t)=x] -> b(x) DeltaT and
E[(X(t+DeltaT) - X(t))^2|X(t)=x] -> a^2(x) DeltaT as DeltaT -> 0.

No dt^(1/2) appears.

The Fokker-Planck equation is the formal adjoint of the KBE and can be obtained from it by integration by parts, at least in relatively simple cases. Again, no dt^(1/2).

The interpretation of the solutions of these equations as averages over paths was done by Feynman and Kac in the 1940's - early 1950's (also Hibbs). Again, no dt^(1/2).

In the 1970's, when I studied physics, this was pretty much how things were taught. Although Ito created his SDE's in the early 1940's and proved Ito's lemma in 1951, it took a long time for his work to permeate into classrooms and standard texts. The application of Ito's work by Merton and Black and Scholes in the early 1970's was very influential in this sea change in how things are viewed and taught.

At this stage it is standard and non-controversial.

(*) dX = b(X) dt + a(X) dW

is short-hand for a rigorous theory of integration, and dW is a Wiener-path increment. If b vanishes, you are describing a (local) martingale. It has taken hold because it is intuitive, it makes (some) earlier theory much easier, it is easy to include jumps, and it has made a lot of advances and extensions easier. It is an alternative 'way of thinking' about diffusions and continuous-time Markov processes in general.

The simple-minded algebra associated with (*) is dt^2 = dt x dW = 0 and dW^2 = dt, and you can use that to 'derive' Ito's lemma for f(t,X(t)) by Taylor expansion. So, now we have a dt^(1/2).

The Feynman-Kac business is much, much easier with Ito's theory than without. Kolmogorov's method for the backwards equation is still fine, but so is Ito's, and they lead to the same thing. There are indeed lots of subtleties in the theory, but IMO you can leave most of the fine points to the probabilists (I do).

If you ever studied physics, here is an analogy. Ito is to stochastic processes what Feynman diagrams are to QED. Before Feynman, QED was very difficult and maybe a half dozen people could understand it. After Feynman diagrams, every idiot could do calculations. Same idea with stochastic calculus.

If you never studied physics, here is another analogy. The dt^(1/2) business is analogous to Sqrt[-1]. When originally introduced, Sqrt[-1] was very controversial, and equivalently 'subtle'. As more and more people saw the advantages of this new idea, it became standard. Nowadays, complex analysis is taught routinely, and it is the easiest route to solve many problems.
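For readers who want the 'simple-minded algebra' spelled out, here is a short worked sketch in LaTeX. It is only the heuristic Taylor-expansion argument, not a rigorous proof. The symbols f, u, g and p are assumptions introduced for illustration: f(t,x) is any smooth function, u(t,x) = E[g(X(T)) | X(t) = x] for a terminal payoff g, and p(t,y) is the density of X(t).

```latex
% Heuristic Ito's lemma for f(t, X(t)), starting from (*): dX = b(X) dt + a(X) dW
% and the rules dt^2 = dt dW = 0, dW^2 = dt.
% The dots stand for the f_{tt} dt^2 and f_{tx} dt dX terms, which vanish under those rules.
\begin{align*}
df &= f_t\,dt + f_x\,dX + \tfrac{1}{2} f_{xx}\,(dX)^2 + \dots \\
(dX)^2 &= b^2\,dt^2 + 2ab\,dt\,dW + a^2\,dW^2 = a^2\,dt \\
df &= \left( f_t + b\,f_x + \tfrac{1}{2} a^2 f_{xx} \right) dt + a\,f_x\,dW
\end{align*}

% Requiring the dt-term to vanish for u(t,x) = E[g(X(T)) | X(t) = x]
% gives the Kolmogorov backwards equation:
\begin{align*}
u_t + b(x)\,u_x + \tfrac{1}{2} a^2(x)\,u_{xx} = 0, \qquad u(T,x) = g(x)
\end{align*}

% Its formal adjoint is the Fokker-Planck equation for the density p(t,y):
\begin{align*}
p_t = -\partial_y \big( b(y)\,p \big) + \tfrac{1}{2}\,\partial_{yy} \big( a^2(y)\,p \big)
\end{align*}
```

The only place dW^2 = dt enters is the second line, which is what turns the second-order Taylor term into a first-order term in dt.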

Last edited by Alan on January 12th, 2012, 11:00 pm, edited 1 time in total.

Thanks Alan. The historical perspective certainly provides useful insight. Likewise your analogies.

So am I correct in reading that you agree that the Fokker-Planck eqn does not require Gaussian increments? Or is there still something I'm missing?

You're welcome.

By 'require Gaussian increments', I'm not sure if you mean 'require Gaussian increments to derive it from some approximating process', or something else. I suspect something else.

I would explain it this way:

The Ito sde (*) I posted has Wiener (aka Brownian or Gaussian) increments. It describes a diffusion and has the usual Fokker-Planck eqn (FPE) for the transition density p(t,x,y). Conversely, each FPE, ignoring boundary effects, is associated to an Ito sde with the same coefficients. And that Ito sde, again, has infinitesimal Wiener increments dW.

Having said that, both the FPE and the sde (*) can also be thought of as a continuous limit of various (Gaussian or non-Gaussian) processes. For example, you could have a (sequence of) binomial tree processes, each of which is certainly non-Gaussian. But the limiting process is a diffusion described by (*) again. Or, you could have a sequence of discrete-time random walks, where each step is drawn from some weird density p(.) with continuous support. If these sequences tend to a limit, as DeltaT->0, where the first two moments match the moment relation I posted, then the result is the same Fokker-Planck equation.

The only other possibility, for a sensible continuous-time limit, is either a pure jump process or a combination of diffusion and jumps. AFAIK, that exhausts the possibilities for a continuous-time limit. If you exclude jumps, then the only possible limit is the diffusion associated to (*) -- Wiener increments again.

So ... the limiting theory (the diffusion) definitely has Wiener increments dW. Moreover, that is the only possible limiting theory in the absence of jumps. Some discrete-time and/or discrete-space approximating process need not have Wiener increments, except in the limit.
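As a rough numerical illustration of the point about approximating processes, here is a minimal Python sketch. Everything specific in it is an assumption chosen for illustration and not taken from the posts: the coefficients b(x) = -x and a(x) = 1, the starting point x0 = 1, the horizon T = 1, and the three step distributions. For this choice the limiting diffusion has a Gaussian law at time T with mean exp(-1) and variance (1 - exp(-2))/2, so the sketch compares each discrete walk against that.

```python
import math
import random
import statistics


def rademacher_step(rng):
    """Binomial-tree step: +1 or -1 with equal probability (mean 0, variance 1)."""
    return 1.0 if rng.random() < 0.5 else -1.0


def uniform_step(rng):
    """Uniform step on [-sqrt(3), sqrt(3)] (mean 0, variance 1)."""
    half_width = math.sqrt(3.0)
    return rng.uniform(-half_width, half_width)


def gaussian_step(rng):
    """Standard normal step, included for comparison."""
    return rng.gauss(0.0, 1.0)


def terminal_values(step, n_paths=4000, n_steps=100, horizon=1.0, x0=1.0, seed=0):
    """Simulate x -> x + b(x) dt + a(x) sqrt(dt) xi and return X(horizon) per path.

    Illustrative coefficients (an assumption): b(x) = -x and a(x) = 1.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    values = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x += -x * dt + sqrt_dt * step(rng)
        values.append(x)
    return values


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def summarize(values, level):
    """Return sample mean, sample variance and the fraction of values <= level."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)
    frac_below = sum(v <= level for v in values) / len(values)
    return mean, var, frac_below


# Law of the limiting diffusion at T = 1 for x0 = 1, b(x) = -x, a(x) = 1.
EXACT_MEAN = math.exp(-1.0)
EXACT_VAR = (1.0 - math.exp(-2.0)) / 2.0
LEVEL = EXACT_MEAN + math.sqrt(EXACT_VAR)  # one standard deviation above the mean

if __name__ == "__main__":
    print(f"limit      mean={EXACT_MEAN:.3f} var={EXACT_VAR:.3f} cdf={normal_cdf(1.0):.3f}")
    for name, step in [("binomial", rademacher_step),
                       ("uniform", uniform_step),
                       ("gaussian", gaussian_step)]:
        mean, var, frac = summarize(terminal_values(step), LEVEL)
        print(f"{name:10s} mean={mean:.3f} var={var:.3f} cdf={frac:.3f}")
```

The mean and variance would agree for any zero-mean, unit-variance step with these linear coefficients, so the more telling column is the last one: the fraction of paths below a fixed level, which should come out close to the Gaussian value for all three step distributions, up to Monte Carlo and time-discretization error.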

Last edited by Alan on January 14th, 2012, 11:00 pm, edited 1 time in total.

"Can someone explain this apparent contradiction in simple terms?"

The best way to get the answer is
1. to specify the contradiction
2. to specify the terms in which one expects to get an explanation.

Thanks again Alan. I am focusing on your statement

Quote: "you could have a sequence of discrete-time random walks, where each step is drawn from some weird density p(.) with continuous support. If these sequences tend to a limit, as DeltaT->0, where the first two moments match the moment relation I posted, then the result is the same Fokker-Planck equation"

Would it be fair to describe the situation in the following way:

If we have the SDE

dx = a(x,t)dt + b(x,t) dZ(t)

where E[dZ(t)] = 0, E[dZ(t)^2] = dt, then the Fokker-Planck eqn can be written in terms of a(x,t) and b(x,t) in the usual way, and the Levy characterization says that dZ(t) = dW (a Gaussian) if dZ(t) is independent of t (so that dZ(t) is i.i.d. for all t).

In other words, the FPE may still apply even if dZ(t) is not Gaussian (as long as it converges to a Gaussian in the steady-state), but the penalty of writing the SDE above, where a(x,t) and b(x,t) also define the FPE, is that dZ(t) is time-dependent (presumably in a very specific way for a specific solution of the FPE)?

Last edited by Fermion on January 14th, 2012, 11:00 pm, edited 1 time in total.

The equation dx = a(x,t)dt + b(x,t) dZ(t) does not make sense if all you specify is E[dZ(t)] = 0, E[dZ(t)^2] = dt. You need to say more about Z. This equation is interpreted only in the integral sense.

Quote: "dZ(t) = dW (a Gaussian) if dZ(t) is independent of t (so that dZ(t) is i.i.d. for all t)."

dW is not a random variable, therefore the statement that dW(t) is independent of t is formally incorrect, as is the statement that dZ(t) is i.i.d. for all t.

In general it is proved that the solution of an SDE with a dW stochastic integral is a diffusion process in the broad sense. There exist some subtleties if we deal with the density of the SDE solution. These subtleties relate to degeneracy of the diffusion and smoothness of the coefficients.

Ok. Let me re-write it as this:

If we have a discrete process

\delta x = a(x,t) \delta t + b(x,t) \delta Z(t)

where E[\delta Z(t)] = 0, E[\delta Z(t)^2] = \delta t, where \delta t can be arbitrarily small and the density of x is a continuous function of t, then the Fokker-Planck eqn can be written in terms of a(x,t) and b(x,t) in the usual way, and the Levy characterization says that \delta Z(t) -> dW (a Gaussian) in the limit \delta t -> dt if \delta Z(t) is independent of t (so that \delta Z(t) is i.i.d. for all t).

In other words, the FPE may still apply even if \delta Z(t) is not Gaussian (as long as it converges to a Gaussian in the steady-state when \delta t -> dt), but the penalty of writing the discrete process above, where a(x,t) and b(x,t) also define the FPE, is that \delta Z(t) is time-dependent (presumably in a very specific way for a specific solution of the FPE)?

First you need to specify Z, though when we use the delta form it can be any process with the properties E[\delta Z(t)] = 0, E[\delta Z(t)^2] = \delta t. If we consider the limit as delta tends to 0, we can do this when the delta notation is applied to the integral sum on [0, T]. There are two different limit transitions here: one when delta tends to 0 and one when Z approaches W, and the FPE is valid only for W.

Formally, it is still not clear what you wish to state. Mathematics differs from, say, finance in that we should formulate formally everything we wish to say. Then it can be either true or false. Whether a statement is true remains open until it is proved, though it may look true.

Quote (originally posted by: Fermion): "In other words, the FPE may still apply even if \delta Z(t) is not Gaussian (as long as it converges to a Gaussian in the steady-state when \delta t -> dt), but the penalty of writing the discrete process above, where a(x,t) and b(x,t) also define the FPE, is that \delta Z(t) is time-dependent (presumably in a very specific way for a specific solution of the FPE)?"

My comment is that you can't have it both ways.

In your discrete-time process, \delta Z(t) can be whatever you want. But this is a discrete-time process, so the FPE does not hold.

In the continuous-time limit, the FPE holds if you make the first two moments behave correctly. But the only well-established limit (in the absence of jumps) is the Ito SDE as laboriously discussed above.

Quote (originally posted by: Alan): "My comment is that you can't have it both ways. In your discrete-time process, \delta Z(t) can be whatever you want. But this is a discrete-time process, so the FPE does not hold. In the continuous-time limit, the FPE holds if you make the first two moments behave correctly."

Isn't that what I did when I wrote E[\delta Z(t)] = 0, E[(\delta Z(t))^2] = \delta t (assuming \delta Z independent of x) and took the infinitesimal case \delta t -> dt? Actually I don't even need that. I just need

lim(\delta t->0) E[\delta Z(t)]/(\delta t) = 0
lim(\delta t->0) E[(\delta Z(t))^2]/(\delta t) = 1

don't I?

Quote (originally posted by: Alan): "But the only well-established limit (in the absence of jumps) is the Ito SDE as laboriously discussed above."

Isn't the problem with other (non-Gaussian) cases due to an assumption that, for all finite moments n, lim(\delta t->0) E[\delta Z(t)^n]/(\delta t) is independent of t? I'm sorry if I'm trying your patience with this, but I want to get things right.

Last edited by Fermion on January 15th, 2012, 11:00 pm, edited 1 time in total.

Quote (originally posted by: Fermion): "Isn't the problem with other (non-Gaussian) cases due to an assumption that, for all finite moments n, lim(\delta t->0) E[\delta Z(t)^n]/(\delta t) is independent of t? I'm sorry if I'm trying your patience with this, but I want to get things right."

I don't think so. You could start with a discrete-time increment process:

\delta x = \delta Z(t,x)

Then, to recover a diffusion, as DeltaT -> 0, you need

E[\delta Z(t,x)] -> b(x,t) DeltaT and
E[(\delta Z(t,x) - b(x,t) DeltaT)^2] -> a^2(x,t) DeltaT

So, the moments of \delta Z(t,x) *do*, in general, depend on t, and if they do, you get t-dependence in a( ) and b( ).

Last edited by Alan on January 15th, 2012, 11:00 pm, edited 1 time in total.

It looks like you misunderstood me. I understand that, in general, the moments of \delta Z may be time-dependent. In fact that is what I am suggesting may be necessary for the situation I have been trying to address.

I was responding to your statement about "the only well-established limit (in the absence of jumps) is the Ito SDE" and I assumed you were referring to the Levy characterization, suggesting that it is not always possible to take the limit we have been discussing. So what I thought you meant was that it is not well-established that we can obtain the limit

lim(\delta t->0) E[\delta Z(t)^2]/(\delta t) = 1

except in the case that \delta Z becomes Gaussian as \delta t -> 0.

For a Gaussian, all moments are constant and so independent of t. My conjecture is that the Levy characterization that dW (to use the Ito shorthand) must be Gaussian does not apply merely because E[dW^2]/dt = 1, but that it also assumes something along the lines of dW being i.i.d. for all t. Then I was wondering if perhaps allowing the distribution of \delta Z to be t-dependent even in the limit \delta t -> 0 may be key to taking the appropriate limit for the moments of \delta Z. Or are you arguing that this t-dependence in the distribution of \delta Z as \delta t -> 0 does not allow us to escape the Gaussian requirement?

Last edited by Fermion on January 15th, 2012, 11:00 pm, edited 1 time in total.

OK, I am tired of this, so here is my last version of the argument.

I am saying the IID Gaussian interpretation of the DeltaT->0 limit is a _consequence_ of things you agree with. Here is my argument for that.

We start with

(*) \delta x = \delta Z'(t,x), where \delta Z'(t,x) is a draw from some density.

The draws from \delta Z'(t,x) are NOT IID draws; they involve a distribution that depends on t and x.

Then, to recover a diffusion, as DeltaT -> 0, you need (need means *must have*)

E[\delta Z'(t,x)] -> b(x,t) DeltaT and
E[(\delta Z'(t,x) - b(x,t) DeltaT)^2] -> a^2(x,t) DeltaT

These last two eqns are short-hand for saying that there exist functions b(x,t) and a(x,t) > 0 such that

(L1) lim(DeltaT->0) E[\delta Z'(t,x)]/DeltaT = b(x,t)
(L2) lim(DeltaT->0) E[(\delta Z'(t,x) - b(x,t) DeltaT)^2]/DeltaT = a^2(x,t)

You have agreed with this, so far, in previous posts.

Now, we just introduce some rescaling. Define

\delta Z(t,x) = (\delta Z'(t,x) - b(x,t) DeltaT)/a(x,t)

This is just a notation change. In the new notation, (*) reads

\delta x = b(x,t) DeltaT + a(x,t) \delta Z(t,x).

In the new notation, (L2) says

(L2*) lim(DeltaT->0) E[(\delta Z(t,x))^2]/DeltaT = 1

So, (L2*) follows from things you have agreed with. In particular, the right-hand side must be 1, not some function of t, not anything else: 1

There aren't any other 'hidden assumptions'. There was never any assumption made that the draws from \delta Z'(t,x) were IID. I think it is fine to interpret the result (L2*) as saying the 'infinitesimal draws' from the limiting rescaled distribution \delta Z are 'IID infinitesimal Gaussian draws'. But this result was never assumed. It is a _consequence_ of relations that you have already agreed to.

Since I am done, feel free to have the last word. I'm sure we'll re-hash the argument in a couple years.
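The rescaling step above can be checked numerically with a minimal Python sketch. It is only an illustration under stated assumptions: the functions drift and vol (standing in for b(x,t) and a(x,t)), the skewed exponential noise, and the sqrt(DeltaT) scaling of the raw draw are all hypothetical choices made here, not anything specified in the thread. The sketch estimates the ratios in (L1), (L2) and (L2*), plus the third moment of the rescaled increment divided by DeltaT.

```python
import math
import random


def drift(t, x):
    """Hypothetical b(x,t), chosen only for illustration."""
    return 0.5 * math.sin(t) - x


def vol(t, x):
    """Hypothetical a(x,t) > 0, chosen only for illustration."""
    return 1.0 + 0.5 * t + 0.1 * x * x


def draw_raw_increment(t, x, dt, rng):
    """One draw of delta Z'(t,x): skewed, non-Gaussian, depends on t and x."""
    xi = rng.expovariate(1.0) - 1.0  # mean 0, variance 1, third moment 2
    return drift(t, x) * dt + vol(t, x) * math.sqrt(dt) * xi


def moment_ratios(t, x, dt, n_draws=200_000, seed=0):
    """Monte Carlo estimates of the ratios in (L1), (L2), (L2*) at a finite dt.

    'third' is E[delta Z^3]/dt for the rescaled increment delta Z.
    """
    rng = random.Random(seed)
    b, a = drift(t, x), vol(t, x)
    sum_raw = sum_centered_sq = sum_z_sq = sum_z_cubed = 0.0
    for _ in range(n_draws):
        raw = draw_raw_increment(t, x, dt, rng)
        z = (raw - b * dt) / a  # rescaled increment delta Z(t,x)
        sum_raw += raw
        sum_centered_sq += (raw - b * dt) ** 2
        sum_z_sq += z * z
        sum_z_cubed += z ** 3
    norm = n_draws * dt
    return {
        "L1": sum_raw / norm,
        "L2": sum_centered_sq / norm,
        "L2*": sum_z_sq / norm,
        "third": sum_z_cubed / norm,
    }


if __name__ == "__main__":
    t, x = 0.3, 0.7
    print(f"b = {drift(t, x):.4f}, a^2 = {vol(t, x) ** 2:.4f}")
    for dt in (1e-2, 1e-4):
        ratios = moment_ratios(t, x, dt)
        print(dt, {key: round(value, 4) for key, value in ratios.items()})
```

In this construction the (L2*) ratio stays near 1 whatever t and x are, while the third-moment ratio shrinks like sqrt(DeltaT). The (L1) estimate gets noisier as DeltaT shrinks, because the random part of each draw is of order sqrt(DeltaT) while the drift part is of order DeltaT.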

Last edited by Alan on January 15th, 2012, 11:00 pm, edited 1 time in total.

Quote (originally posted by: Alan): "So, (L2*) follows from things you have agreed with. In particular, the right-hand side must be 1, not some function of t, not anything else: 1"

It looks like we agree on everything up to here. In fact it is very straightforward and has never been in doubt by either of us, except I fail to see what you gain by introducing Z'. That the first two moments of \delta Z divided by DeltaT become 0 and 1 as DeltaT->0 has always been a fundamental requirement of mine right from the start and had no need of derivation from your Z'. The question I have regards possible t-dependence in the higher moments n > 2. (Sorry I didn't make this clearer; I thought it was obvious that is what I intended.) Given that the first two moments are 0 and 1, any t-dependence in the distribution has to come from the higher moments, does it not?

I really do not understand why you introduced Z' at all, unless it stems from your moment conditions on \delta x, whereas mine were on \delta Z (equivalent to yours, assuming dZ is independent of x).

Quote: "There aren't any other 'hidden assumptions'. There was never any assumption made that the draws from \delta Z'(t,x) were IID."

My question about identical distributions (no t-dependence in the infinitesimal limit) was about \delta Z, not your \delta Z'.

Quote: "I think it is fine to interpret the result (L2*) as saying the 'infinitesimal draws' from the limiting rescaled distribution \delta Z are 'IID infinitesimal Gaussian draws'."

So how does (L2*) get interpreted that way as Gaussian? I see nothing particularly Gaussian about (L2*) unless you impose the Levy characterization. In fact, if \delta Z has any t-dependence in the infinitesimal limit, then I don't see how the higher moments could be independent of t (as they would be in a Gaussian).

Quote: "But this result was never assumed. It is a _consequence_ of relations that you have already agreed to. Since I am done, feel free to have the last word. I'm sure we'll re-hash the argument in a couple years."

I am not interested in having the last word, nor in arguing with you. My only desire here is to understand what the properties of a non-Gaussian \delta Z in the limit DeltaT->0 are that enable Fokker-Planck. I'd rather not wait another two years.

You wrote earlier "you could have a sequence of discrete-time random walks, where each step is drawn from some weird density p(.) with continuous support. If these sequences tend to a limit, as DeltaT->0, where the first two moments match the moment relation I posted, then the result is the same Fokker-Planck equation". I took this as an acknowledgement that a Gaussian \delta Z in the limit DeltaT->0 was not required for the FPE. Was I wrong?

If not, and it is not the t-dependence of the distribution (which disappears in the steady-state limit, where it does indeed become Gaussian) that enables the FPE in the non-Gaussian case, then what is it?
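The higher-moment question can at least be made concrete for one simple family of increments. The following LaTeX sketch is an illustration only and is not taken from any post above: the scaling \delta Z = \sqrt{\Delta}\,\xi_t, the random variable \xi_t, and its moments m_n(t) are assumptions introduced here, with \Delta standing for DeltaT.

```latex
% Assumed family: a t-dependent, possibly non-Gaussian shape xi_t with
% zero mean, unit variance, and finite higher moments m_n(t).
\begin{align*}
\delta Z(t) &= \sqrt{\Delta}\,\xi_t, \qquad
E[\xi_t] = 0, \quad E[\xi_t^2] = 1, \quad m_n(t) := E[\xi_t^n] < \infty \\
\frac{E[\delta Z(t)^n]}{\Delta} &= m_n(t)\,\Delta^{\,n/2 - 1} \\
n = 1 &: \quad \frac{E[\delta Z(t)]}{\Delta} = 0 \\
n = 2 &: \quad \frac{E[\delta Z(t)^2]}{\Delta} = 1 \\
n \ge 3 &: \quad \frac{E[\delta Z(t)^n]}{\Delta} = m_n(t)\,\Delta^{\,n/2 - 1} \to 0
  \quad \text{as } \Delta \to 0
\end{align*}
```

Within this family the higher moments m_n(t) may depend on t at any finite \Delta, but their contribution per unit time vanishes in the limit, so only the first two conditions survive. A scaling in which the higher moments stay of order \Delta would correspond to the jump case mentioned earlier in the thread. Whether this family covers the situation intended in the question is left open.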
