Jensen’s inequality is fundamental in many fields, including machine learning and statistics. For example, it is used in the diffusion models paper to derive the variational lower bound. In this post, I will give a simple geometric intuition for Jensen’s inequality.
A function is convex when the line segment joining any two points on its graph lies on or above the graph. In the simplest terms, a convex function is shaped like a cup \(\cup\) and a concave function is shaped like a cap \(\cap\). If \(f\) is convex, then \(-f\) is concave.
An interactive visualization of the convex function \(f(x)=0.15(x - 15)^2 + 15\). We will use the same parabola throughout this post unless stated otherwise. You can use the slider to try different values of \((\lambda_1, \lambda_2)\), where \(\lambda_2 = 1 - \lambda_1\).
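If you prefer code over sliders, here is a minimal sketch that checks the chord condition for this parabola numerically. The endpoints `x1`, `x2` and the tolerance are my own choices for illustration; the function itself is the one from the post.

```python
import numpy as np

# The parabola used throughout the post.
def f(x):
    return 0.15 * (x - 15) ** 2 + 15

# Chord condition for convexity:
# f(l1*x1 + l2*x2) <= l1*f(x1) + l2*f(x2), with l2 = 1 - l1.
x1, x2 = 5.0, 25.0  # arbitrary endpoints on the graph
for lam1 in np.linspace(0.0, 1.0, 11):
    lam2 = 1.0 - lam1
    lhs = f(lam1 * x1 + lam2 * x2)      # graph at the mixed point
    rhs = lam1 * f(x1) + lam2 * f(x2)   # point on the chord
    assert lhs <= rhs + 1e-12

print("chord lies on or above the graph for all sampled lambdas")
```

Every sampled \(\lambda_1\) satisfies the inequality, which is exactly what the slider demonstrates visually.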
We have a line segment connecting two points on the parabola, \((x_1, f(x_1))\) and \((x_2, f(x_2))\). Any point along the segment can be written as \((\lambda_1 x_1 + \lambda_2 x_2, \lambda_1 f(x_1) + \lambda_2 f(x_2))\), where \(\lambda_1, \lambda_2 \ge 0\) and \(\lambda_1 + \lambda_2 = 1\). For example: