Being an engine programmer usually means being a bit of a jack of all trades. There’s always something weird going on, and you have to be pretty familiar with a bunch of low-level details that come in handy in unexpected ways. Recently I went down a somewhat unexpected rabbit hole where those skills came in extremely handy. In an effort to blog more, and because it seems like I was the first to run into this issue, I figured I should sit down and write about it so future people can benefit from it.
We are currently working on porting the engine of X-Plane mobile from GLES to Vulkan. On desktop we already made the transition to Vulkan 4 years ago, but mobile was always a harder target because driver quality is so much worse and driver updates are almost non-existent. But time marches forward and so, at long last, we are bringing Vulkan to X-Plane mobile in a project dubbed Vandroid. One issue we ran into was that calling vkEndCommandBuffer() would segfault on Adreno devices. In particular, it would segfault on a Samsung A52 with Adreno 530 running the latest version of Android. The backtrace looks like this:
The validation layers were happy, and this code has been running in production on desktop environments for years. Being the resident Vulkanologist, I ended up with this issue on my plate. One of the first things I like to do when existing code suddenly falls apart is to reduce everything to the bare minimum. X-Plane's Vulkan backend makes use of various extensions and features for performance reasons: things like push descriptors, descriptor update templates (I know, very fancy), extended dynamic state, that sort of thing. So the obvious first step was to disable every single optional feature and extension and fall back to the most basic Vulkan code path. This changed nothing. However, bypassing every draw call allowed me to see the clear colour, and it would no longer crash. So I had narrowed it down to draw calls. I felt overconfident that this was going to be easy.
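To make that kind of bisection concrete, here is a minimal sketch of the idea. It is not X-Plane's actual code: `DebugToggles`, `FakeCommandBuffer` and `record_frame` are hypothetical names invented for illustration, and the "command buffer" only logs strings where a real backend would issue vkCmd* calls.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical debug switches; none of these names come from X-Plane.
struct DebugToggles {
    bool use_optional_extensions = true;  // push descriptors, update templates, ...
    bool skip_draw_calls = false;         // record only the clear, no draws
};

// Stand-in for a command buffer: it just logs what would be recorded.
struct FakeCommandBuffer {
    std::vector<std::string> commands;
};

// Records one frame. In a real backend these would be vkCmd* calls
// between vkBeginCommandBuffer() and vkEndCommandBuffer().
inline void record_frame(FakeCommandBuffer& cb,
                         const DebugToggles& toggles,
                         std::size_t draw_count) {
    cb.commands.push_back("begin");
    cb.commands.push_back("clear");

    if (!toggles.skip_draw_calls) {
        for (std::size_t i = 0; i < draw_count; ++i) {
            // Fall back to the plain path when optional features are off.
            cb.commands.push_back(toggles.use_optional_extensions
                                      ? "draw(push_descriptor)"
                                      : "draw(descriptor_set)");
        }
    }

    cb.commands.push_back("end");
}
```

Flipping `use_optional_extensions` off corresponds to the first experiment (basic code path only), and flipping `skip_draw_calls` on corresponds to the second (clear colour only). Keeping both switches in one place makes it cheap to rerun either experiment later.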