Video decoding in GStreamer with Vulkan

Warning: Vulkan video is still work in progress, from specification to available drivers and applications. Do not use it for production software just yet.

Introduction

Vulkan is a cross-platform Application Programming Interface (API), backed by the Khronos Group, aimed at graphics developers for a wide range of different tasks. The interface is described by a common specification, and it is implemented by different drivers, usually provided by GPU vendors and Mesa.

One way to visualize Vulkan, at first glance, is like a low-level OpenGL API, but better described and easier to extend. Even more, it is possible to implement OpenGL on top of Vulkan. And, as far as I am told by my peers in Igalia, Vulkan drivers are easier and cleaner to implement than OpenGL ones.

A couple years ago, a technical specification group (TSG), inside the Vulkan Working Group, proposed the integration of hardware accelerated video compression and decompression into the Vulkan API. In April 2021 the formed Vulkan Video TSG published an introduction to the
specification
. Please, do not hesitate to read it. It’s quite good.

Matthew Waters worked on a GStreamer plugin using Vulkan, mainly for uploading, composing and rendering frames. Later, he developed a library mapping Vulkan objects to GStreamer. This work was key for what I am presenting here. In 2019, during the last GStreamer Conference, Matthew delivered a talk about his work. Make sure to watch it, it’s worth it.

Other key components for this effort were the base classes for decoders and the bitstream parsing libraries in GStreamer, jointly developed by Intel, Centricular, Collabora and Igalia. Both libraries allow using APIs for stateless video decoding and encoding within the GStreamer framework, such as Vulkan Video, VAAPI, D3D11, and so on.

When the graphics team in Igalia told us about the Vulkan Video TSG, we decided to explore the specification. Therefore, Igalia decided to sponsor part of my time to craft a GStreamer element to decode H.264 streams using these new Vulkan extensions.

Assumptions

As stated at the beginning of this text, this development has to be considered unstable and the APIs may change without further notice.

Right now, the only Vulkan driver that offers these extensions is the beta NVIDIA driver. You would need, at least, version 455.50.12 for Linux, but it would be better to grab the latest one. And, of course, I only tested this on Linux. I would like to thank NVIDIA for their Vk Video samples. Their test application drove my work.

Finally, this work assumes the use of the main development branch of GStreamer, because the base classes for decoders are quite recent. Naturally, you can use gst-build for an efficient upstream workflow.

Work done

This work basically consists of two new objects inside the GstVulkan code:

  • GstVulkanDeviceDecoder: a GStreamer object in GstVulkan library, inherited from GstVulkanDevice, which enables VK_KHR_video_queue and VK_KHR_video_decode_queue extensions. Its purpose is to handle codec-agnostic operations.

  • vulkanh264dec: a GStreamer element, inherited from GstH264Decoder, which tries to instantiate a GstVulkanDeviceDecoder to composite it and is in charge of handling codec-specific operations later, such as matching the parsed structures. It outputs, in the source pad, memory:VulkanImage featured frames, with NV12 color format.

So far this pipeline works without errors:

$ gst-launch-1.0 filesrc location=big_buck_bunny_1080p_h264.mov ! parsebin ! vulkanh264dec ! fakesink

As you might see, the pipeline does not use vulkansink to render frames. This is because the Vulkan format output by the driver’s decoder device is VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, which is NV12 crammed in a single image, while for GstVulkan a NV12 frame is a buffer with two images, one per component. So the current color conversion in GstVulkan does not support this Vulkan format. That is future work, among other things.

You can find the merge request for this work in GStreamer’s Gitlab.

Future work

As was mentioned before, it is required to fully support VK_FORMAT_G8_B8R8_2PLANE_420_UNORM format in GstVulkan. That requires thinking about how to keep backwards compatibility. Later, an implementation of the sampler to convert this format to RGB will be needed, so that decoded frames can be rendered by vulkansink.

Also, before implementing any new feature, the code and its abstractions will need to be cleaned up, since currently the division between codec-specific and codec-agnostic code is not strict, and it must be fixed.

Another important cleanup task is to enhance the way the Vulkan headers are handled. Since the required headers files for video extensions are beta, they are not expected to be available in the system, so temporally I had to add the those headers as part of the GstVulkan library.

Then it will be possible to implement the H.265 decoder, since the NVIDIA driver also supports it.

Later on, it will be nice to start thinking about encoders. But this requires extending support for stateless encoders in GStreamer, something I want do to for the new VAAPI plugin too.

Thanks for bearing with me, and thanks to Igalia for sponsoring this work.

Leave a Reply

Your email address will not be published. Required fields are marked *