SuperTuxKart Vulkan vs OpenGL
The latest SuperTuxKart release comes with an experimental Vulkan renderer and I was eager to check it out on my Raspbery Pi 4 and see how well it worked.
The short story is that while I have only tested a few tracks it seems to perform really well overall. In my tests, even with a debug build of Mesa I saw the FPS ranging from 60 to 110 depending on the track. I think the game might be able to produce more than 110 fps actually, since various tracks were able to reach exactly 110 fps I think the limiting factor here was the display.
I was then naturally interested in comparing this to the GL renderer and I was a bit surprised to see that, with the same settings, the GL renderer would be somewhere in the 8-20 fps range for the same tracks. The game was clearly hitting a very bad path in the GL driver so I had to fix that before I could make a fair comparison between both.
A perf session quickly pointed me to the issue: Mesa has code to transparently translate vertex attribute formats that are not natively supported to a supported format. While this is great for compatibility it is obviously going to be very slow. In particular, SuperTuxKart uses rgba16f and rg16f with some vertex buffers and Mesa was silently translating these to 32-bit counterparts because the GL driver was not advertising support for the 16-bit variants. The hardware does support 16-bit floating point vertex attributes though, so this was very easy to fix.
The Vulkan driver was exposing support for this already, which explains the dramatic difference in performance between both drivers. Indeed, with that change SuperTuxKart now plays smooth on OpenGL too, with framerates always above 30 fps and up to 110 fps depending on the track. We should probably have an option in Mesa to make this kind of under-the-hood compatibility translations more obvious to users so we can catch silly issues like this more easily.
With that said, even if GL is now a lot better, Vulkan is still ahead by quite a lot, producing 35-50% better framerate than OpenGL depending on the track, at least for the tracks that don’t hit the 110 fps mark, which as I said above, looks like it is a display maximum, at least with my setup.
During my presentation at XDC last year I mentioned Zink wasn’t supported on Raspberry Pi 4 any more due to feature requirements we could not fulfill.
In the past, Zink used to abort when it detected unsupported features, but it seems this policy has been changed and now it simply drops a warning and points to the possibility of incorrect rendering as a result.
Also, I have been talking to zmike about one of the features we could not support natively: scalarBlockLayout. Particularly, the issue with this is that we can’t make it work with vectors in all cases and the only alternative for us would be to scalarize everything through a lowering, which would probably have a performance impact. However, zmike confirmed that Zink is already doing this, so in practice we would not see vector load/stores from Zink, in which case it should work fine .
So with all that in mind, I did give Zink a go and indeed, I get the warning that we don’t support scalar block layouts (and some other feature I don’t remember now) but otherwise it mostly works. It is not as stable as the native driver and some things that work with the native driver don’t work with Zink at present, some examples I saw include the WebGL Aquarium demo in Chromium or SuperTuxKart.
As far as performance goes, it has been a huge leap from when I tested it maybe 2 years ago. With VkQuake3‘s OpenGL renderer performance with Zink used to be ~40% of the native OpenGL driver, but is now on par with it, even if not a tiny bit better, so kudos to zmike and all the other contributors to Zink for all the work they put into this over the last 2 years, it really shows.
With all that said, I didn’t do too much testing with Zink myself so if anyone here decides to give it a more thorough go, please let me know how it went in the comments.