Suppose we’ve had a recent error with a Kubernetes cluster. As often happens with a problem in our systems, we noticed it first in terms of the visible error, which we could state as “Builds did not complete.” Now we want to trace backwards to figure out what happened. A common technique is the “Five Whys” popularized by Lean thinking. So we ask “Why did builds not complete” and we find “Kubernetes could not start the pod, and the operation timed out after 1 hour.”
We could certainly debate whether that’s a single “why” or two of them in one step, but that’s not the key topic right now. The main thing is that this is a straightforward statement about causality. “Pod no start” leads directly to “build no done.”
The next step in this analysis reveals that the pod would not start because a volume was full with too many files. Again, direct causality.