Video decoding in GStreamer with Vulkan
Warning: Vulkan video is still work in progress, from specification to available drivers and applications. Do not use it for production software just yet.

## Introduction
Vulkan is a cross-platform Application Programming Interface (API), backed by the Khronos Group, aimed at graphics developers for a wide range of different tasks. The interface is described by a common specification, and it is implemented by different drivers, usually provided by GPU vendors and Mesa.
At first glance, one way to picture Vulkan is as a lower-level OpenGL, but better described and easier to extend. It is even possible to implement OpenGL on top of Vulkan. And, as far as I am told by my peers in Igalia, Vulkan drivers are easier and cleaner to implement than OpenGL ones.
A couple of years ago, a technical specification group (TSG), inside the Vulkan Working Group, proposed the integration of hardware accelerated video compression and decompression into the Vulkan API. In April 2021 the newly formed Vulkan Video TSG published an introduction to the specification. Please, do not hesitate to read it. It’s quite good.
Matthew Waters worked on a GStreamer plugin using Vulkan, mainly for uploading, composing and rendering frames. Later, he developed a library mapping Vulkan objects to GStreamer. This work was key for what I am presenting here. In 2019, during the last GStreamer Conference, Matthew delivered a talk about his work. Make sure to watch it, it’s worth it.
Other key components for this effort were the base classes for decoders and the bitstream parsing libraries in GStreamer, jointly developed by Intel, Centricular, Collabora and Igalia. Both libraries allow using APIs for stateless video decoding and encoding within the GStreamer framework, such as Vulkan Video, VAAPI, D3D11, and so on.
When the graphics team in Igalia told us about the Vulkan Video TSG, we decided to explore the specification. Therefore, Igalia decided to sponsor part of my time to craft a GStreamer element to decode H.264 streams using these new Vulkan extensions.
## Assumptions
As stated at the beginning of this text, this development has to be considered unstable and the APIs may change without further notice.
Right now, the only Vulkan driver that offers these extensions is the beta NVIDIA driver. You would need, at least, version 455.50.12 for Linux, but it would be better to grab the latest one. And, of course, I only tested this on Linux. I would like to thank NVIDIA for their Vk Video samples. Their test application drove my work.
Finally, this work assumes the use of the main development branch of GStreamer, because the base classes for decoders are quite recent. Naturally, you can use gst-build for an efficient upstream workflow.
## Work done
This work basically consists of two new objects inside the GstVulkan code:
- `GstVulkanDeviceDecoder`: a GStreamer object in the `GstVulkan` library, inherited from `GstVulkanDevice`, which enables the `VK_KHR_video_queue` and `VK_KHR_video_decode_queue` extensions. Its purpose is to handle codec-agnostic operations.
- `vulkanh264dec`: a GStreamer element, inherited from `GstH264Decoder`, which tries to instantiate a `GstVulkanDeviceDecoder` to composite it and is in charge of handling codec-specific operations later, such as matching the parsed structures. It outputs, in the source pad, `memory:VulkanImage` featured frames, with NV12 color format.
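
Enabling a device extension presupposes that the driver exposes it. The check can be pictured as a simple membership test. The Python sketch below is not GstVulkan code: the function names and the sample extension lists are hypothetical, and in a real program the list of available names would come from `vkEnumerateDeviceExtensionProperties`.

```python
# Extension names taken from the text above.
REQUIRED_EXTENSIONS = ("VK_KHR_video_queue", "VK_KHR_video_decode_queue")


def missing_video_extensions(available):
    """Return the required extensions that are absent from `available`."""
    available = set(available)
    return [name for name in REQUIRED_EXTENSIONS if name not in available]


def can_decode_video(available):
    """True only when every required video extension is exposed."""
    return not missing_video_extensions(available)


# Hypothetical extension lists, for illustration only.
beta_driver = ["VK_KHR_swapchain", "VK_KHR_video_queue", "VK_KHR_video_decode_queue"]
other_driver = ["VK_KHR_swapchain"]

print(can_decode_video(beta_driver))
print(missing_video_extensions(other_driver))
```

A decoder element built on such a check would presumably refuse to start when the list of missing extensions is not empty, instead of failing later in the middle of a stream.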
So far this pipeline works without errors:
gst-launch-1.0 filesrc location=big_buck_bunny_1080p_h264.mov ! parsebin ! vulkanh264dec ! fakesink
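
The same pipeline can be driven from a script. What follows is a minimal sketch using the GStreamer Python bindings (PyGObject). It assumes those bindings are installed and that `vulkanh264dec` is available, which today means a GStreamer build with the merge request applied. The helper names `build_description` and `run_until_done` are mine, not part of any API.

```python
def build_description(location):
    """Return the same pipeline description that gst-launch-1.0 receives.

    Paths containing spaces would need quoting; this sketch does not handle that.
    """
    return f"filesrc location={location} ! parsebin ! vulkanh264dec ! fakesink"


def run_until_done(description):
    """Play the pipeline until EOS or an error; True means it reached EOS."""
    # Imported here so the rest of the module loads without GStreamer installed.
    import gi

    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(description)
    pipeline.set_state(Gst.State.PLAYING)

    # Block until the bus reports either an error or the end of the stream.
    bus = pipeline.get_bus()
    message = bus.timed_pop_filtered(
        Gst.CLOCK_TIME_NONE, Gst.MessageType.ERROR | Gst.MessageType.EOS
    )
    pipeline.set_state(Gst.State.NULL)
    return message is not None and message.type == Gst.MessageType.EOS
```

Calling `run_until_done(build_description("big_buck_bunny_1080p_h264.mov"))` should behave like the command line above, under the same driver and branch assumptions.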
As you might see, the pipeline does not use `vulkansink` to render frames. This is because the Vulkan format output by the driver’s decoder device is `VK_FORMAT_G8_B8R8_2PLANE_420_UNORM`, which is NV12 packed into a single image, while for `GstVulkan` an NV12 frame is a buffer with two images, one per plane. So the current color conversion in `GstVulkan` does not support this Vulkan format. That is future work, among other things.
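
To make the mismatch concrete, here is a small Python sketch that compares the two layouts for one frame. It assumes even dimensions and tightly packed rows, with no padding or alignment, which real drivers usually do add. The function names and the dictionary shape are only illustrative.

```python
def nv12_plane_bytes(width, height):
    """Byte sizes of the two NV12 planes for a tightly packed frame."""
    luma = width * height                      # one Y byte per pixel
    chroma = (width // 2) * (height // 2) * 2  # interleaved Cb/Cr, subsampled 2x2
    return [luma, chroma]


def single_image_layout(width, height):
    """One multi-planar image, as with VK_FORMAT_G8_B8R8_2PLANE_420_UNORM."""
    return {"images": 1, "planes_per_image": 2,
            "plane_bytes": nv12_plane_bytes(width, height)}


def two_image_layout(width, height):
    """Two single-plane images in one buffer, as described for GstVulkan."""
    return {"images": 2, "planes_per_image": 1,
            "plane_bytes": nv12_plane_bytes(width, height)}


decoder_output = single_image_layout(1920, 1080)
gstvulkan_frame = two_image_layout(1920, 1080)
print(decoder_output["plane_bytes"], sum(decoder_output["plane_bytes"]))
```

The pixel data is the same in both cases. What differs is how many image objects carry it, and that is exactly what the color conversion code has to learn to handle.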
You can find the merge request for this work in GStreamer’s GitLab.
## Future work
As was mentioned before, it is required to fully support the `VK_FORMAT_G8_B8R8_2PLANE_420_UNORM` format in `GstVulkan`. That requires thinking about how to keep backwards compatibility. Later, an implementation of the sampler to convert this format to RGB will be needed, so that decoded frames can be rendered by `vulkansink`.
Also, before implementing any new feature, the code and its abstractions will need to be cleaned up, since currently the division between codec-specific and codec-agnostic code is not strict, and it must be fixed.
Another important cleanup task is to enhance the way the Vulkan headers are handled. Since the required header files for the video extensions are beta, they are not expected to be available in the system, so temporarily I had to add those headers as part of the `GstVulkan` library.
Then it will be possible to implement the H.265 decoder, since the NVIDIA driver also supports it.
Later on, it will be nice to start thinking about encoders. But this requires extending support for stateless encoders in GStreamer, something I want to do for the new VAAPI plugin too.
Thanks for bearing with me, and thanks to Igalia for sponsoring this work.