Video decoding in GStreamer with Vulkan Video extension (part 2)

It has been a while since I reported on my tinkering with the Vulkan Video provisional extension. The specification will now have its final release soonish, and there has also been more engagement within the open source communities, such as the work-in-progress FFmpeg implementation by Lynne (please, please, read that post), and the also work-in-progress Mesa 3D drivers for both AMD and Intel by Dave Airlie! Along with the well-known NVIDIA beta drivers for Vulkan.

From our side, we have been trying to provide an open source alternative to the video parser used by the Conformance Test Suite and NVIDIA's vk_video_samples, using GStreamer: GstVkVideoParser, which intends to be a drop-in replacement for the current proprietary parser library.

Along the way, we have sketched the Vulkan Video support in gfxreconstruct, for getting traces of the API usage. Sadly, it's kind of bit-rotten right now, even more so because the specification has changed since then.

Regarding the H.264 decoder for GStreamer, we have just restarted hacking on it. The merge request was moved to the monorepo, but for the sake of the much-needed complete rewrite, we changed the branch to this one (vkh264dec). We needed to rewrite it because, besides the specification updates, we have learned many things along the journey, such as the out-of-band parameter updates, Vulkan's recommendation to pre-allocate memory as much as possible, the DPB/references handling, the debate about buffer vs. slice uploading, and other friction points that Lynne has spotted for future early adopters.

The way to compile it is to grab the branch and build it as GStreamer is usually built with Meson:

meson setup builddir -Dgst-plugins-bad:vulkan-video=enabled --buildtype=debug
ninja -C builddir

And run simple pipelines such as

gst-launch-1.0 filesrc location=INPUT ! parsebin ! vulkanh264dec ! fakesink -v

Our objective is to have a functional demo for the next Vulkanised, in February. We are very ambitious: we want it to work on Linux and Windows, and on as many GPUs as possible. Wish us luck. And happy December festivities!

Gamepad in WPEWebKit

This is the brief story of the Gamepad implementation in WPEWebKit.

It started with early development work done by Eugene Mutavchi (kudos!). Later, by the end of 2021, I picked up those patches and discussed them with my fellow Igalian Adrián, and we decided to come up with a slightly different approach.

Before going into the details, let’s quickly review the WPE architecture:

  1. cog library — it's a shell library that simplifies the task of writing a WPE browser from scratch, by providing common functionality and helper APIs.
  2. WebKit library — that's the web engine that, given a URI and other inputs, returns, among other outputs, graphic buffers with the page rendered.
  3. WPE library — it's the API that bridges cog (1) (or any other browser application) and WebKit (2).
  4. WPE backend — its main duty is to provide graphic buffers to WebKit: buffers supported by the hardware, the operating system, the windowing system, etc.

Eugene's implementation had code in WebKit (implementing the gamepad support for the WPE port), and code in the WPE library with an API to communicate WebKit's gamepad subsystem with the WPE backend, which provided a custom gamepad implementation that read events directly from the Linux devices. Almost everything was there, but there were some issues:

  • The WPE backend is mainly designed as a set of protocols, similar to Wayland, to deal with graphic buffers or audio buffers, but not with input events. The cog library is the place where input events, such as keyboard events, are handled and injected into WebKit.
  • The gamepad handling in the WPE backend was ad hoc and low level, reading events directly from the Linux devices. This approach is problematic since there are plenty of gamepads on the market, each with its own axes and buttons, so remapping them to the standard mapping is required. To overcome this issue and many others, there's a GNOME library: libmanette, which is already used by the WebKitGTK port.

Today's status of the gamepad support is that it works, but it's not yet fully upstreamed:

  • merged libwpe pull request.
  • cog pull request — there are two implementations: none and libmanette. none is just a dummy implementation which ignores any request for a gamepad provider; it's used if libmanette is not available or if the available libwpe lacks gamepad support.
  • WebKit pull request.

To prove to you all that it works, my exhibit A is this video, where I play Asteroids on a 64-bit Raspberry Pi 4:

The image was built with Buildroot, using its master branch (from a week ago) with a bunch of modifications, such as adding libmanette, a kernel patch for my gamepad device, kernel 5.15.55 and its corresponding firmware, etc.
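
For reference, building a similar image boils down to the standard Buildroot flow, with my local patches and package tweaks applied on top; a sketch, assuming the stock Raspberry Pi 4 64-bit defconfig as the starting point:

$ make raspberrypi4_64_defconfig
$ make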

GstVA H.264 encoder, compositor and JPEG decoder

There are, right now, three new GstVA elements merged in main: vah264enc, vacompositor and vajpegdec.

Just to recap, GstVA is a GStreamer plugin in gst-plugins-bad (yes, we agree it's not a great name anymore), to differentiate it from gstreamer-vaapi. Both plugins use libva to access stateless video processing operations; the main difference is, precisely, how the stream's state is handled: while GstVA uses GStreamer libraries shared with other hardware-accelerated plugins (such as d3d11 and v4l2codecs), gstreamer-vaapi uses an internal, tightly coupled and convoluted library.
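
To check which GstVA elements your installation and VA driver actually expose, inspecting the plugin (named va) is enough:

$ gst-inspect-1.0 va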

Also, note that right now (release 1.20) GstVA elements are ranked NONE, while gstreamer-vaapi ones are mostly PRIMARY+1.
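
In practice this means that auto-pluggers such as decodebin or playbin won't pick the GstVA elements by default; you either name the element explicitly in the pipeline, or bump its rank with the GST_PLUGIN_FEATURE_RANK environment variable. For instance, to prefer the VA H.264 decoder for a single run (the file name is just an example):

$ GST_PLUGIN_FEATURE_RANK=vah264dec:MAX gst-play-1.0 sample.mp4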

Back to the three new elements in GstVA, the most complex one is vah264enc, written almost completely by He Junyan, from Intel. For it, He had to write an H.264 bitwriter, which is, to a certain extent, the opposite of the H.264 parser: it constructs the bitstream buffer from H.264 structures such as PPS, SPS, slice headers, etc. This API is part of libgstcodecparsers, ready to be reused by other plugins or applications. Currently vah264enc is fairly complete and functional, dealing with profiles and rate controls, among other parameters. It still has rough spots, but we're working on them. But He Junyan is restless and he already has in the pipeline a common encoder base class along with HEVC and AV1 encoders.
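
A minimal encoding sketch, requesting the element explicitly (the caps and file name are arbitrary examples):

$ gst-launch-1.0 videotestsrc num-buffers=300 ! video/x-raw,format=NV12,width=1280,height=720 ! vah264enc ! h264parse ! mp4mux ! filesink location=test.mp4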

The second element is vacompositor, written by Artie Eoff. It's the replacement for vaapioverlay in gstreamer-vaapi. The compositor suffix is preferred to follow the name of the primary, software-based video mixing element: compositor, successor of videomixer. See this discussion for further details. The purpose of this element is to compose a single video stream from multiple video streams. It works with Intel's media-driver with alpha channel support, and also with AMD's Mesa Gallium driver, but without alpha channel (in other words, without a custom degree of transparency).
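
A compositing sketch with two test sources, assuming the pad properties mirror those of compositor (xpos, ypos, alpha); the alpha value will only have an effect where the driver supports it:

$ gst-launch-1.0 vacompositor name=mix sink_1::xpos=320 sink_1::ypos=240 sink_1::alpha=0.5 ! videoconvert ! autovideosink \
    videotestsrc ! video/x-raw,width=640,height=480 ! mix.sink_0 \
    videotestsrc pattern=ball ! video/x-raw,width=320,height=240 ! mix.sink_1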

The last, but not least, element is vajpegdec, which I worked on. The main issue was not the decoder itself, but jpegparse, which didn't signal the image caps required by the hardware-accelerated decoders. For instance, VA only decodes images with SOF marker 0 (Baseline DCT). It wasn't needed before because the main (and only) consumer of the parser was jpegdec, which deals with any type of JPEG image. Long story short, we revamped jpegparse and now it signals the SOF marker, the color space (YUV, RGB, etc.) and the chroma subsampling (if it's a YUV color space), along with comments and EXIF-like metadata as pipeline tags. Thus vajpegdec exposes in its caps template the color spaces and chroma subsamplings supported by the driver. For example, Intel supports (more or less) the RGB color space, while AMD's Mesa Gallium driver doesn't.
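
To see which color spaces and subsamplings the driver accepts, and to try a quick decode (the file name is just an example):

$ gst-inspect-1.0 vajpegdec
$ gst-launch-1.0 filesrc location=photo.jpg ! jpegparse ! vajpegdec ! fakesink -v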

And that’s all for now. Thanks.

From gst-build to local-projects

Two years ago I wrote a blog post about using gst-build inside the WebKit SDK flatpak. Well, all that has changed. That's the true upstream spirit.

There were two main reasons for the change:

  1. Since the switch to the GStreamer monorepo, gst-build has been deprecated. The mechanism in WebKit was added, basically, to allow working on GStreamer upstream, so keeping the gst-build directory just polluted the conceptual framework.
  2. By using gst-build one could override almost any other package in the WebKit SDK. For example, for developing the gamepad handling in WPE I added libmanette as a GStreamer subproject, to link a modified version of the library rather than the one in the flatpak. But that approach added an unneeded conceptual depth to the tree.

In order to simplify these operations, by taking advantage of Meson's subproject support directly, the gst-build handling was removed and a new mechanism was put in place: Local Dependencies. With local dependencies, you can add or override almost any dependency, while flattening the tree layout by placing GStreamer and any other library at the same level. Of course, in order to add dependencies, they must be built with Meson.

For example, to override libsoup and GStreamer, just clone both repositories under Tools/flatpak/local-projects/subprojects, and declare them in the WEBKIT_SDK_LOCAL_DEPS environment variable:


$ export WEBKIT_SDK_LOCAL_DEPS=libsoup,gstreamer-full
$ export WEBKIT_SDK_LOCAL_DEPS_OPTIONS="-Dgstreamer-full:introspection=disabled -Dgst-plugins-good:soup=disabled"
$ build-webkit --wpe
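
The clone step itself is just a regular checkout into that directory; a sketch, assuming the subproject directories are named after the dependencies declared above:

$ git clone https://gitlab.gnome.org/GNOME/libsoup.git Tools/flatpak/local-projects/subprojects/libsoup
$ git clone https://gitlab.freedesktop.org/gstreamer/gstreamer.git Tools/flatpak/local-projects/subprojects/gstreamer-full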

Digging further into Flatpak with NVIDIA

As you may know, the development environment used by WebKitGTK and WPE is based on Flatpak. Hacking software within Flatpak feels to me like teleoperating a rover on Mars, since I have to go through Flatpak commands to execute the commands I actually want to run. The learning curve is steeper, in exchange for a common development environment.

I started to work on another project which requires an NVIDIA GPU, without stopping my work on WebKitGTK/WPE. So I needed to use the card within Flatpak, and it's well known that, currently, that setup is not available out of the box. Furthermore, I have to use a very specific version of the graphics card driver for Vulkan.

This is the story of how I made it work.

My main reference is, of course, the blog post of my colleague TingPing: Using host Nvidia driver with Flatpak, besides flatpak’s NVIDIA GL runtime platform.

As TingPing explained, Flatpak does not use host libraries; that's why it might need runtimes and extensions for specific hardware setups, providing the user-space libraries, such as the NVIDIA GL platform runtime. And that runtime must have the same version as the driver running in the kernel.
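
A quick way to cross-check both sides is to compare the kernel driver version on the host with the NVIDIA runtime Flatpak has installed:

$ cat /proc/driver/nvidia/version
$ flatpak list --runtime | grep -i nvidia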

The NVIDIA GL platform extension is a small project which generates Flatpak runtimes for every public NVIDIA driver. The interesting part is that those runtimes are not created at build time, but at install time. When the user installs the runtime, a driver blob is downloaded from NVIDIA's servers (see --extra-data in flatpak build-finish for reference), and a small program is executed which extracts the tarball embedded in the blob and, from it, the required libraries. In a few words, initially the runtime is composed only of a definition of the file to download and the small program that populates the Flatpak filesystem at install time.
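
For reference, this is roughly the shape of that extra-data declaration when finishing a build; the extension actually generates it through its templated JSON manifests, so take the placeholders below just as an illustration of the format:

$ flatpak build-finish builddir --extra-data=NAME:SHA256SUM:DOWNLOAD_SIZE:INSTALL_SIZE:URL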

The trick here, which took me a while to realize, is that this small program has to be statically compiled, since it has to be executed regardless of the available runtime.

This little program uses libarchive to extract the libraries from NVIDIA's tarball, but libarchive is not available statically in any Flatpak SDK. Furthermore, our use of libarchive depends on libz and liblzma, both statically compiled as well. Gladly, there's one very old, obsolete version of the freedesktop SDK which offers static versions of libz and liblzma: 1.6. And that's why org.freedesktop.Platform.GL.nvidia demands that specific old version of the SDK. So the manifest of the extension basically contains the static compilation of libarchive and the static compilation of the soon-to-be apply_extra program.

Update: There's a merge request to use the current freedesktop SDK 21.08, which, basically, builds libz and liblzma statically, besides libarchive.

I needed to modify the org.freedesktop.Platform.GL.nvidia sources a bit, since, by default, they consist of a big loop of downloading, hashing, templating a JSON manifest, and building, for every supported driver. But, as my case is just one custom driver, I didn't want to waste time in that loop. The hack to achieve it is fairly simple:

diff --git a/versions.sh b/versions.sh
index 8b72664..86686c0 100755
--- a/versions.sh
+++ b/versions.sh
@@ -15,4 +15,5 @@ TESLA_VERSIONS="450.142.00 450.119.04 450.51.06 450.51.05 440.118.02 440.95.01 4
# Probably never: https://ahayzen.com/direct/flathub_downloads_only_nvidia_runtimes.txt
UNSUPPORTED_VERSIONS="390.147 390.144 390.143 390.141 390.138 390.132 390.129 390.116 390.87 390.77 390.67 390.59 390.48 390.42 390.25 390.12 387.34 387.22 387.12 384.130 384.111 384.98 384.90 384.69 384.59 384.47 381.22 381.09 378.13 375.82 375.66 375.39 375.26 370.28 367.57"

-DRIVER_VERSIONS="$BETA_VERSIONS $VULKAN_VERSIONS $NEW_FEATURE_VERSIONS $PRODUCTION_VERSIONS $LEGACY_VERSIONS $TESLA_VERSIONS $UNSUPPORTED_VERSIONS"
+#DRIVER_VERSIONS="$BETA_VERSIONS $VULKAN_VERSIONS $NEW_FEATURE_VERSIONS $PRODUCTION_VERSIONS $LEGACY_VERSIONS $TESLA_VERSIONS $UNSUPPORTED_VERSIONS"
+DRIVER_VERSIONS="470.XX.XX"

But in order to make it work, it needs a file in the data/ directory with the specification of the file to download, in the format NAME:SHA256:DOWNLOAD-SIZE:INSTALL-SIZE:URL.

--- /dev/null
+++ b/data/nvidia-470.XX.XX-x86_64.data
@@ -0,0 +1 @@
+:34...checksum-sha256...:123456789::http://compu.home.arpa/NVIDIA/NVIDIA-Linux-x86_64-470.XX.XX.run

The last parameter is the URL from which the driver will be downloaded. In my case it's a local server, to ease testing.
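
Any static file server does the job here; for instance, serving the directory that holds the driver blob with Python's built-in server (directory and port are arbitrary):

$ python3 -m http.server 8000 --directory ~/Downloads/nvidia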

Long story short, the commands to execute are:

To setup the building environment:

$ flatpak install org.freedesktop.Sdk//1.6 org.freedesktop.Platform//1.6

To build the flatpak repository and package:

$ make

The command will output a repo directory inside the current one. That's where the generated flatpak package is stored.

To install the local repository and the extension:

$ flatpak --user remote-add --no-gpg-verify nvidia-local repo
$ flatpak -v install nvidia-local org.freedesktop.Platform.GL.nvidia-470-XX-XX

To remove the obsolete SDK and platform once built:

$ flatpak uninstall org.freedesktop.Sdk//1.6 org.freedesktop.Platform//1.6

To remove the local repository and the extension if something went wrong:

$ flatpak -v uninstall org.freedesktop.Platform.GL.nvidia-470-62-15
$ flatpak --user remote-delete nvidia-local

One way to verify that the libraries are installed correctly and match the driver running in the host's kernel is to install and run GreenWithEnvy:

$ flatpak install com.leinardi.gwe
$ flatpak run com.leinardi.gwe

If you want to install the driver in your WebKit development environment, you just need to set the environment variable FLATPAK_USER_DIR:

$ FLATPAK_USER_DIR=~/WebKit/WebKitBuild/UserFlatpak flatpak --user remote-add --no-gpg-verify nvidia-local repo
$ FLATPAK_USER_DIR=~/WebKit/WebKitBuild/UserFlatpak flatpak -v install nvidia-local org.freedesktop.Platform.GL.nvidia-470-XX-XX