While derivatives are at the heart of deep learning training (backprop and all that jazz), we rarely think about computing them: arguably, the capabilities of modern ML packages when it comes to “automatic differentiation” (a fancy term for “doing derivatives for us”) are one of the reasons behind the explosive growth of the field.
When asked to ELI5 what a derivative is, everybody resorts to a variation on “what happens to y when you add a little bit of x” (thanks GPT for the demonstration):
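To see how far the hand-wavy version gets you, here is a minimal numeric sketch (the function, the variable names and the step size are just illustrative, not from any particular library): add a little bit to x, watch what happens to y, and divide.

```python
def f(x):
    return x ** 2

x = 3.0
little_bit = 1e-6                  # the "little bit" we add to x
dy = f(x + little_bit) - f(x)      # what happens to y
print(dy / little_bit)             # ~6.0, the derivative of x**2 at x = 3
```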
Remarkably, there is no limit / delta / epsilon in this seductive explanation (as opposed to the standard textbook introduction): conditioned by years of training, you must surely be thinking that nothing this simple could ever be usable!
As it turns out, for 200 years following Newton and Leibniz, all calculus applications were based on the “little bit of x” idea. Since modern math had better PR, the original calculus is now branded as non-standard (analysis): we had to wait until the 1960s for a coherent view of that “little bit”.
This is a brief introduction to supernatural, infinitesimal and dual numbers, with an unexpected application to one of the core concepts of ML, automatic differentiation: Newton-and-Leibniz calculus, rebooted.
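As a preview of where this leads, here is a minimal, hand-rolled dual-number sketch (class and function names are my own, not from any package): by carrying the “little bit” as an extra component with the rule ε² = 0, evaluating a function once gives both its value and its exact derivative, no tiny step size required.

```python
class Dual:
    """A dual number a + b*eps, with eps**2 == 0."""
    def __init__(self, real, eps=0.0):
        self.real, self.eps = real, eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 == 0
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__


def f(x):
    return 3 * x * x + x  # f'(x) = 6x + 1


result = f(Dual(2.0, 1.0))        # seed the "little bit" with 1
print(result.real, result.eps)    # 14.0 and 13.0: f(2) and f'(2), exactly
```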