
snufkin
Topic Author
Posts: 64
Joined: January 25th, 2017, 9:05 am
Location: Cambridge

### AAD in practice

Hi all,

A newbie question: is AAD (adjoint algorithmic differentiation) actually used in practice? What is it good for, and what are its limitations? Some quants say that it is only good for first derivatives. I found a brilliant description at https://www.wilmott.com/automatic-for-the-greeks/ but I was under the impression that CCR is not actually used?

Best regards,
Last edited by snufkin on April 21st, 2017, 9:57 pm, edited 1 time in total.

snufkin
Topic Author
Posts: 64
Joined: January 25th, 2017, 9:05 am
Location: Cambridge

### Re: AAD in practice

It seems the question was raised before, in 2011: viewtopic.php?f=34&t=85423&p=647774&hilit=AAD#p647774 There was no conclusion though, and the webinar referred to there is gone...
Last edited by snufkin on April 21st, 2017, 9:58 pm, edited 1 time in total.

Cuchulainn
Posts: 62881
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: AAD in practice

There was a discussion here

At the time I tried lightly reading some of the articles, but to be honest I did not understand much.
Step over the gap, not into it. Watch the space between platform and train.
http://www.datasimfinancial.com
http://www.datasim.nl

mtsm
Posts: 352
Joined: July 28th, 2010, 1:40 pm

### Re: AAD in practice

Yes, it is used extremely heavily in ML to perform optimization by gradient descent. There are a lot of ML packages that implement this de facto. Just look at any of the packages released by the big tech firms. It's all about it.

It's also used for various risk calculations in some global IBs.

Cuchulainn
Posts: 62881
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: AAD in practice

> Yes, it is used extremely heavily in ML to perform optimization by gradient descent. There are a lot of ML packages that implement this de facto. Just look at any of the packages released by the big tech firms. It's all about it.
>
> It's also used for various risk calculations in some global IBs.

Is it used to compute gradients and Hessians, that kind of area?

Is it possible to understand AAD from a simple 101 example, or does one need certain background knowledge?

outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

### Re: AAD in practice

> > Yes, it is used extremely heavily in ML to perform optimization by gradient descent. There are a lot of ML packages that implement this de facto. Just look at any of the packages released by the big tech firms. It's all about it.
> >
> > It's also used for various risk calculations in some global IBs.
>
> Is it used to compute gradients and Hessians, that kind of area?
>
> Is it possible to understand AAD from a simple 101 example, or does one need certain background knowledge?

Yes, exactly, for the gradient. In ML frameworks like TensorFlow you specify a graph of computations (like Excel does) with some end result that is typically a cost function (least-squares error, likelihood, entropy). The framework then automatically computes the gradient through the whole dependency tree, which lets you search for a minimal cost.

E.g.
https://stats.stackexchange.com/questio ... tensorflow
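
To make the "graph of computations" idea concrete, here is a minimal sketch of reverse-mode (adjoint) differentiation in plain Python. It is not how TensorFlow is implemented; the names `Var`, `backward` and `cost`, the toy data and the step size are all assumptions made up for illustration. Each operation records its inputs and local partial derivatives, and one backward sweep from the cost yields the gradient with respect to every input.

```python
# Minimal reverse-mode (adjoint) differentiation sketch.
# All names here (Var, backward, cost) are hypothetical, for illustration only.

class Var:
    """A node in the computation graph: a value plus links to its inputs."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (input node, local partial derivative)
        self.grad = 0.0         # adjoint: d(output) / d(this node)

    @staticmethod
    def lift(x):
        return x if isinstance(x, Var) else Var(x)

    def __add__(self, other):
        other = Var.lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __sub__(self, other):
        other = Var.lift(other)
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        other = Var.lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Propagate adjoints from the output back to every input."""
    order, seen = [], set()

    def visit(node):
        # Post-order walk: inputs are appended before the nodes that use them.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed post-order: every node is handled before its inputs.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


def cost(w, b, data):
    """Least-squares error of the line y = w*x + b on the given points."""
    total = Var(0.0)
    for x, y in data:
        err = w * x + b - y
        total = total + err * err
    return total


data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # toy points on y = 2x + 1
w, b = 0.0, 0.0
for _ in range(2000):
    wv, bv = Var(w), Var(b)
    backward(cost(wv, bv, data))  # one sweep gives both partial derivatives
    w -= 0.02 * wv.grad           # plain gradient descent, assumed step size
    b -= 0.02 * bv.grad
```

On this toy data `w` and `b` should end up close to 2 and 1. The point of the adjoint approach is that the cost of the backward sweep is a small multiple of one function evaluation, however many inputs there are.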

snufkin
Topic Author
Posts: 64
Joined: January 25th, 2017, 9:05 am
Location: Cambridge

### Re: AAD in practice

> Is it possible to understand AAD from a simple 101 example, or does one need certain background knowledge?

Cuch, for me the eye-opener was the article at https://www.wilmott.com/automatic-for-the-greeks/, which explains the basic idea and shows an application that is quite impressive (given how simple the basic idea is!)
Last edited by snufkin on April 21st, 2017, 9:58 pm, edited 1 time in total.

snufkin
Topic Author
Posts: 64
Joined: January 25th, 2017, 9:05 am
Location: Cambridge

### Re: AAD in practice

> Is it possible to understand AAD from a simple 101 example, or does one need certain background knowledge?

The core idea is as follows:

Dual numbers are an extension of the real numbers, similar to complex numbers, except that instead of an imaginary unit $i$ with the property $i^2 = -1$, we have an infinitesimal unit $\varepsilon$ with the property $\varepsilon^2 = 0$. Write the input as $x + x'\varepsilon$: the coefficient of $\varepsilon$ is the derivative with respect to $x$, and it is initially 1 since $dx/dx = 1$.
Since most of the transformations you use in numerical methods are linear, you... get the actual derivative propagated alongside the value. Voilà.

Moreover, if you redefine the operations to support differentiation, you can work with more complicated models, too: as long as you know the derivative of the result of an operation in terms of the values and derivatives of its operands, you're fine. E.g. $(x + x'\varepsilon) \times (y + y'\varepsilon) = xy + (xy' + x'y)\varepsilon$.
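
As a rough sketch of the above in Python (the class `Dual` and the helpers `exp`, `derivative` and `f` are hypothetical names, not from any library): each number carries its value and its $\varepsilon$ coefficient, and the overloaded operators apply the sum and product rules. Strictly speaking, dual numbers give forward (tangent) mode differentiation, one input at a time; the adjoint mode in AAD uses the same per-operation derivative rules but propagates them backwards from the output.

```python
import math


class Dual:
    """A number real + eps * e, where e is the unit with e**2 == 0."""

    def __init__(self, real, eps=0.0):
        self.real = real  # the value
        self.eps = eps    # the derivative carried alongside the value

    @staticmethod
    def lift(x):
        # Plain numbers are constants: their derivative part is zero.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        # (x + x'e)(y + y'e) = xy + (xy' + x'y)e, because e**2 == 0.
        other = Dual.lift(other)
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__


def exp(x):
    """exp extended to dual numbers via the chain rule: d exp(u) = exp(u) du."""
    x = Dual.lift(x)
    e = math.exp(x.real)
    return Dual(e, e * x.eps)


def derivative(f, x):
    """Seed the eps coefficient with 1 (dx/dx = 1) and read off f'(x)."""
    return f(Dual(x, 1.0)).eps


def f(x):
    # Example function; its exact derivative is 6x + 2 + exp(x).
    return 3 * x * x + 2 * x + exp(x)
```

Calling `derivative(f, 1.0)` should agree with the closed form $6x + 2 + e^x$ at $x = 1$ up to floating-point rounding, with no finite-difference bump involved.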