Forward-Mode Automatic Differentiation with Dual Numbers
Recall the complex numbers, written as \(a+bi\) where \(i^2 = -1\). It is interesting to change this definition slightly and consider instead numbers of the form \(a+b\epsilon\) where \(\epsilon^2 = 0\). The \(\epsilon\) is suggestive of an infinitesimal displacement. Such numbers are known as dual numbers. Sometimes we may write \((a,b)\) in place of \(a+b\epsilon\).
Notice that $$ (a+b\epsilon) + (c + d\epsilon) = (a+c) + (b+d)\epsilon $$ and $$ (a+b\epsilon)(c + d\epsilon) = ac + ad\epsilon + bc\epsilon + bd\epsilon^2 = ac + (ad + bc)\epsilon $$
It turns out that dual numbers can be used to compute derivatives. Consider \((a,b)(c,d) + (e,f)\). $$ (a,b)(c,d) + (e,f) = (ac + e, ad + bc + f) $$ Notice that \(\frac{\partial}{\partial a} ac + e = c\). Look what happens when we set \(b=1, d=0, f=0\): $$ (a,1)(c,0) + (e,0) = (ac + e, c) $$ Furthermore, we find that \(\frac{\partial}{\partial c} ac + e = a\). When we set \(b=0, d=1, f=0\) we get: $$ (a,0)(c,1) + (e,0) = (ac + e, a) $$ Finally, \(\frac{\partial}{\partial e} ac + e = 1\). Setting \(b=0, d=0, f=1\) gives: $$ (a,0)(c,0) + (e,1) = (ac + e, 1) $$
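The same pattern holds in a single variable: evaluating a polynomial \(f\) at the dual number \(x + \epsilon\) (that is, seeding the \(\epsilon\) coefficient with \(1\)) yields $$ f(x + \epsilon) = f(x) + f'(x)\epsilon $$ since every higher power of \(\epsilon\) vanishes. For example, $$ (x+\epsilon)^2 = x^2 + 2x\epsilon + \epsilon^2 = x^2 + 2x\epsilon $$ so the \(\epsilon\) coefficient of the result is exactly the derivative \(2x\).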
Dual numbers are particularly easy to implement in software. In Rust, for instance, we may write something like
struct DualNumber {
    a: f64, // real part
    b: f64, // epsilon coefficient
}

impl std::ops::Mul for DualNumber {
    type Output = DualNumber;

    fn mul(self, rhs: Self) -> Self::Output {
        DualNumber {
            a: self.a * rhs.a,
            // the epsilon^2 term vanishes, leaving the product rule
            b: (self.a * rhs.b) + (self.b * rhs.a),
        }
    }
}
It is straightforward to implement the remaining arithmetic operators.
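As a sketch of how such a library might be used, here is the struct above with an `Add` implementation filled in (the `Add` impl and the `Copy` derive are my assumptions, written in the same style as `Mul`), differentiating \(f(x) = x^2 + x\) at \(x = 3\):

```rust
#[derive(Clone, Copy, Debug)]
struct DualNumber {
    a: f64, // real part
    b: f64, // epsilon coefficient
}

impl std::ops::Mul for DualNumber {
    type Output = DualNumber;
    fn mul(self, rhs: Self) -> Self::Output {
        DualNumber {
            a: self.a * rhs.a,
            b: (self.a * rhs.b) + (self.b * rhs.a),
        }
    }
}

impl std::ops::Add for DualNumber {
    type Output = DualNumber;
    fn add(self, rhs: Self) -> Self::Output {
        DualNumber {
            a: self.a + rhs.a,
            b: self.b + rhs.b,
        }
    }
}

fn main() {
    // Seed b = 1 to request d/dx; f(x) = x^2 + x, so f'(x) = 2x + 1.
    let x = DualNumber { a: 3.0, b: 1.0 };
    let y = x * x + x;
    println!("f(3) = {}, f'(3) = {}", y.a, y.b); // f(3) = 12, f'(3) = 7
}
```

Evaluating the function once produces both the value and the derivative, with no second pass and no step size to choose.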
Computing derivatives in this way is an example of forward-mode automatic differentiation. This is different from approximating derivatives using a finite difference, as I did with the Laplacian in Gray-Scott. Here we have an exact expression for the derivative of each primitive operation and evaluate it numerically, which eliminates the truncation error introduced by the discretization process.
I've uploaded the code for a simple dual numbers library here.