Is it possible to understand AAD through a simple 101 example, or does one need certain background knowledge?

The core idea is as follows:

Dual numbers are an extension of the real numbers, similar to complex numbers, except that instead of an imaginary unit [$]i[$] with the property [$]i^2 = -1[$], we have an infinitesimal unit [$]\varepsilon[$] with the property [$]\varepsilon^2 = 0[$]. A dual number has the form [$]x + x'\varepsilon[$]. The coefficient of [$]\varepsilon[$] is the derivative with respect to [$]x[$]; for the input itself this is initially 1, since [$]dx/dx = 1[$].
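
As a small worked example (the function [$]f(x) = x^2[$] is chosen here purely for illustration), evaluate it at [$]x + 1\varepsilon[$] and drop the [$]\varepsilon^2[$] term:

\[ (x + \varepsilon)^2 = x^2 + 2x\varepsilon + \varepsilon^2 = x^2 + 2x\varepsilon \]

The real part is [$]f(x)[$] and the coefficient of [$]\varepsilon[$] is [$]2x = f'(x)[$].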

Since most of the transformations you use in numerical methods are linear, you... get the actual derivative propagated alongside the value: voilà.

Moreover, if you redefine the operations to support differentiation, you can work with more complicated models, too: as long as you know the derivative of the result of an operation in terms of the values and derivatives of its operands, you're fine. E.g., the product rule follows directly from [$]\varepsilon^2 = 0[$]: \[ (x + x'\varepsilon) \times (y + y'\varepsilon) = xy + (xy' + x'y)\varepsilon \]
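
To make this concrete, here is a minimal Python sketch of the rules above. Note that propagating derivatives with dual numbers like this is forward-mode differentiation. The class name `Dual`, its fields `value` and `deriv`, the helper `exp`, and the test function `f` are all made up for illustration; they are not taken from any particular AD library.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Dual:
    """Hypothetical dual number: value + deriv * eps, where eps**2 == 0."""
    value: float
    deriv: float = 0.0

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants, i.e. with zero derivative.
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual._lift(other)
        # (x + x'eps) + (y + y'eps) = (x + y) + (x' + y')eps
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # (x + x'eps) * (y + y'eps) = xy + (xy' + x'y)eps
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def exp(d):
    # Any other operation works once its derivative rule is known:
    # d/dx exp(u) = exp(u) * u'
    e = math.exp(d.value)
    return Dual(e, e * d.deriv)


def f(x):
    # Illustrative model: f(x) = 3x^2 + 2x + 1, so f'(x) = 6x + 2
    return 3 * x * x + 2 * x + 1


x = Dual(2.0, 1.0)   # seed the input with dx/dx = 1
y = f(x)
print(y.value, y.deriv)  # 17.0 14.0
```

The function `f` is written as ordinary arithmetic; only the number type changes, and the derivative comes out alongside the value in a single pass.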