Pierre Blanchard, Desmond J Higham, Nicholas J Higham, Accurately computing the log-sum-exp and softmax functions, IMA Journal of Numerical Analysis, Volume 41, Issue 4, October 2021, Pages 2311–2330, https://doi.org/10.1093/imanum/draa038
Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low-precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically equivalent, these variants behave differently in floating-point arithmetic and shifting can introduce subtractive cancellation. We give rounding error analyses of different evaluation algorithms and interpret the error bounds using condition numbers for the functions. We conclude, based on the analysis and numerical experiments, that the shifted formulas are of similar accuracy to the unshifted ones, so can safely be used, but that a division-free variant of softmax can suffer from loss of accuracy.
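As an illustration of the shifted evaluation discussed in the abstract, the following NumPy sketch (ours; the function names are chosen for illustration only) applies the standard shift by the maximum element, so that every exponent is nonpositive and the exponentials cannot overflow.

```python
import numpy as np

def logsumexp_shifted(x):
    # Shifted formula: log(sum_i e^{x_i}) = a + log(sum_i e^{x_i - a}) with a = max_i x_i.
    # All exponents x_i - a are <= 0, so e^{x_i - a} cannot overflow.
    a = np.max(x)
    return a + np.log(np.sum(np.exp(x - a)))

def softmax_shifted(x):
    # Shifted softmax: e^{x_j - a} / sum_i e^{x_i - a}; mathematically equivalent
    # to the unshifted formula but immune to overflow in the exponentials.
    a = np.max(x)
    e = np.exp(x - a)
    return e / np.sum(e)
```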
The most obvious danger in evaluating (1.1) and (1.2) is overflow. We are interested in IEEE arithmetic in the precisions half (fp16), single (fp32) and double (fp64) (IEEE, 2019), as well as the bfloat16 half-precision format (Intel Corporation, 2018). Table 1 shows the key parameters of interest for these precisions: the unit roundoff $u$, the largest finite number $r_{\max}$ and the smallest positive normalized and subnormal floating-point numbers. If some $x_i$ exceeds the relevant $\log r_{\max}$ value in Table 2 then overflow will occur in evaluating $e^{x_i}$. Clearly, overflow is possible even for quite modestly sized $x$, especially for half and single precision.
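As a rough check of these thresholds, the following NumPy sketch (ours; bfloat16 is omitted because NumPy has no native bfloat16 type) prints $\log r_{\max}$ for half, single and double precision, the value beyond which $e^{x_i}$ overflows.

```python
import numpy as np

# Overflow thresholds log(r_max): exp(x) overflows once x exceeds this value.
for dtype in (np.float16, np.float32, np.float64):
    r_max = np.finfo(dtype).max
    print(dtype.__name__, np.log(float(r_max)))
# fp16: log(65504) ~ 11.09; fp32 ~ 88.72; fp64 ~ 709.78
```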