Panfrost, the open source driver for Arm Mali, now supports OpenGL ES 3.1 on both Midgard (Mali T760 and newer) and Bifrost (Mali G31, G52, G76) GPUs.

Open Source OpenGL ES 3.1 on Mali GPUs with Panfrost

submited by

Style Pass

2021-06-11 15:00:03

Panfrost, the open source driver for Arm Mali, now supports OpenGL ES 3.1 on both Midgard (Mali T760 and newer) and Bifrost (Mali G31, G52, G76) GPUs. OpenGL ES 3.1 adds a number of features on top of OpenGL ES 3.0, notably including compute shaders. While Panfrost has had limited support for compute shaders on Midgard for use in TensorFlow Lite, the latest work extends the support to more GPUs and adds complementary features required by the OpenGL ES 3.1 specification, like indirect draws and no-attachment framebuffers.

The new feature support represents the cumulative effort of multiple Collaborans -- Boris Brezillon, Italo Nicola, and myself -- in tandem with the wider Mesa community. The OpenGL driver has seen over 1000 commits since the beginning of 2021, including several hundred targeting OpenGL ES 3.1 features. Our focus is Mali G52, where we are passing essentially all drawElements Quality Program and Khronos conformance tests and are aiming to become formally conformant. Nevertheless, thanks to a unified driver, many new features on Bifrost trickle down to Midgard allowing the older architecture still in wide use to improve long after the vendor has dropped support. On Mali T860, we are passing about 99.5% of tests required for conformant OpenGL ES 3.1. That number can only grow thanks to Mesa's continuous integration running these tests for every merge request and preventing Panfrost regressions. With a Vulkan driver in the works, Panfrost's API support is looking good.

Since the last Panfrost update, we've added an instruction scheduler to the Bifrost compiler. To understand the motivation, recall the hardware design. The Bifrost instruction set pairs instructions into "tuples", one using the multipliers and the other using the adders. Up to 8 tuples are grouped into a "clause", a sequence of instructions with fixed latency that can execute back-to-back with no pipeline bubbles in the middle. The benefit to the hardware designers was that Bifrost's pipeline could be statically filled by the compiler, rather than adding logic in the hardware to dynamically dispatch instructions to different parts of the units like a superscalar chip. Unfortunately, that means the compiler becomes significantly more complicated, as it has to group instructions itself satisfying dozens of architectural invariants. If any condition fails to be met, the GPU will fault with an Invalid Instruction Encoding exception and abort execution. In Panfrost, we've approached this by formally modeling the constraints to produce a predicate (function returning a boolean) for whether a given instruction may be scheduled in a given position in the program. Then it's simple enough to schedule greedily by choosing instructions with this predicate according to some selection heuristic. Wait, selection heuristic?