Forward-Mode Automatic Differentiation with Dual Numbers

Recall the complex numbers written as \(a+bi\) where \(i^2 = -1\). It would be interesting to change this definition slightly, considering instead numbers of the form \(a+b\epsilon\) where \(\epsilon^2 = 0\). The \(\epsilon\) is suggestive an infinitesimal displacement. Such numbers are known as dual numbers. Sometimes we may write \((a,b)\) in place of \(a+b\epsilon\).

Notice that $$ (a+b\epsilon) + (c + d\epsilon) = (a+c) + (b+d)\epsilon $$ and $$ (a+b\epsilon)(c + d\epsilon) = ac + ad\epsilon + bc\epsilon + bd\epsilon^2 = ac + (ad + bc)\epsilon $$

It turns out that dual numbers can be used to compute derivatives. Consider \((a,b)(c,d) + (e,f)\). $$ (a,b)(c,d) + (e,f) = (ac + e, ad + bc + f) $$ Notice that \(\frac{\partial}{\partial a} ac + e = c\). Look what happens when we set \(b=1, d=0, f=0\): $$ (a,1)(c,0) + (e,0) = (ac + e, c) $$ Furthermore, we find that \(\frac{\partial}{\partial c} ac + e = a\). When we set \(b=0, d=1, f=0\) we get: $$ (a,0)(c,1) + (e,0) = (ac + e, a) $$ Finally, \(\frac{\partial}{\partial e} ac + e = 1\). Setting \(b=0, d=0, f=1\) gives: $$ (a,0)(c,0) + (e,1) = (ac + e, 1) $$

Dual numbers are particularly easy to implement in software. In Rust, for instance, we may write something like


struct DualNumber {
    a: f64,
    b: f64
}
                    
impl std::ops::Mul for DualNumber {
    type Output = DualNumber;

    fn mul(self, rhs: Self) -> Self::Output {
        DualNumber {
            a: self.a * rhs.a,
            b: (self.a * rhs.b) + (self.b * rhs.a)
        }
    }
                
It is straightforward to implement the remaining arithmetic operators.

Computing derivatives in this way is an example of forward-mode automatic differentiation. This is different than approximating derivatives using a finite difference as I did with the Laplacian in Gray-Scott. In this method, we have an exact expression of the derivative of a primitive operation and compute a numerical value. This eliminates round-off errors introduced in the discretization process.

I've uploaded the code for a simple dual numbers library here.