Raspberry Pi 4 V3D driver gets OpenGL ES 3.1 conformance

So continuing with the news, here is a fairly recent one: as the title states, I am happy to announce that the Raspberry Pi 4 is now an OpenGL ES 3.1 conformant product! This means that the Mesa V3D driver has successfully passed a whole lot of tests designed to validate the OpenGL ES 3.1 feature set, which should be a good sign of driver quality and correctness.

It should be noted that the Raspberry Pi 4 shipped with a V3D driver exposing OpenGL ES 3.0, so this also means that on top of all the bugfixes that we implemented for conformance, the driver has also gained new functionality! In particular, we merged Eric’s previous work to enable Compute Shaders.
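If you want to check whether the driver installed on your system already exposes 3.1 (and therefore compute shaders), one option is to look at the version string the driver reports through glGetString(GL_VERSION), which for OpenGL ES starts with "OpenGL ES <major>.<minor>". Below is a minimal sketch of that check. The helper names and the sample strings are only illustrative assumptions, and obtaining the string from a live context is left out.

```python
import re


def parse_gles_version(version_string):
    """Return (major, minor) from a GL_VERSION string.

    Returns None when the string does not start with "OpenGL ES N.M",
    for example when it comes from a desktop OpenGL context.
    """
    match = re.match(r"OpenGL ES (\d+)\.(\d+)", version_string)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_compute_shaders(version_string):
    """Compute shaders are core functionality from OpenGL ES 3.1 onwards."""
    version = parse_gles_version(version_string)
    return version is not None and version >= (3, 1)


if __name__ == "__main__":
    # Illustrative strings only; the real value comes from the driver.
    for sample in ("OpenGL ES 3.0 Mesa 19.2.0", "OpenGL ES 3.1 Mesa 19.3.1"):
        print(sample, "->", supports_compute_shaders(sample))
```

This only tells you what the driver advertises, not whether a particular application will actually run well on it.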

All this work has been in Mesa master since December (I believe there is only one fix missing waiting for us to address review feedback), and will hopefully make it to Raspberry Pi 4 users soon.

11 comments

    • I think 19.3 should have all the CTS fixes (except for the one that is pending addressing review feedback). Geometry shaders should come with 20.0.

      • Thanks for the information. Great work, I was able to get the Tensorflow Lite GPU delegate working (https://www.tensorflow.org/lite/performance/gpu, requires OpenGL ES 3.1 ;-)). At least with Mesa 19.3.1, the TFLite GPU delegate on RPi4 is about 3-4 times slower than the CPU with 4 threads. With V3D_DEBUG=perf, I see a comment that currently just 1 WG (work group?) per SG (?) is enabled. Can you give some information on this?

        If you want to play around with TF Lite, check https://github.com/jsee23/tensorflow/tree/rpi4. For building, go to tensorflow/lite/tools/make and run ./download_dependencies.sh. Afterwards, switch to tensorflow/lite/tools/cmake and run ./build.sh -DCMAKE_TOOLCHAIN_FILE= -D TFLITE_DELEGATE_GL=1 -DTFLITE_DELEGATE_GL_GBM=1.
        This will build a “tflite-benchmark” app. Copy it to the target and run it with a tflite model, like “./tflite-benchmark --graph=.tflite --use_gpu=true --num_runs=10”.

        • The driver is hardcoding a setup where it only uses a single workgroup (WG) per supergroup (SG). It is likely that this number can be increased for more parallelism, and therefore better performance in some cases, and it is something we should look into at some point.

  1. Very nice, fantastic work!

    Is there a mailing list for development discussion?

    Is there a way to create an offscreen context on raspberry pi 4?