This year the GStreamer Conference happened in A Coruña, basically at home, along with the hackfest.
The conference was the first after a four-year pandemic hiatus. The community craved it and had long expected it. Some Igalians helped the GStreamer Foundation and our warm community with the organization and logistics. I’m very thankful to my peers and to the sponsors of the event. Personally, I’m happy with the outcome. Though, I ought to say, organizing a conference like this is quite a challenge and very demanding.
The conference was recorded and streamed by Ubicast, and you can watch any presentation of the conference in their GStreamer Archive.
This is the list of talks where fellow Igalians participated:
There were two days of conference. The following two were for the hackfest, at Igalia’s headquarters.
It took almost a year of design and implementation, but the DMABuf modifier negotiation in GStreamer is finally merged. Big kudos to all the people involved, but mostly to He Junyan, who wrote the vast majority of the code.
What’s a DMABuf modifier?
DMABuf is the Linux kernel mechanism to share buffers among different drivers or subsystems. A particular case of DMABuf are the DRM PRIME buffers, which are buffers shared by the Direct Rendering Manager (DRM) subsystem. They allow sharing video frames between devices with zero copies.
When we initially added support for DMABuf in GStreamer, we assumed that only color format and size mattered, just as for old video frames stored in system memory. But we were wrong. Besides color format and size, the memory layout also has to be considered when sharing DMABufs. By not considering it, the produced output had horrible tiling artifacts on screen. This memory layout is known as the modifier, and it’s uniquely described by a uint64 number.
How was it designed and implemented?
First, we wrote a design document for caps negotiation with DMABuf modifiers, where we added a new color format (DMA_DRM) and a new caps field (drm-format). This new caps field holds a string, or a list of strings, composed of the tuple DRM_color_format:DRM_modifier.
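For example, NV12 frames with an Intel Y-tiled layout would be described along these lines (a sketch; the exact modifier value depends on your hardware):

video/x-raw(memory:DMABuf), format=(string)DMA_DRM, drm-format=(string)NV12:0x0100000000000002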
Second, we extended the video info object to support DMABuf with helper functions that parse and construct the drm-format field.
Third, we added the DMABuf caps negotiation in glupload. This part was the most difficult one, since the capability of importing DMABufs into OpenGL (which is only available in EGL/GLES) is defined at run time, by querying the hardware. Also, there are two code paths to import frames: direct or RGB-emulated. Direct is the most efficient, but it depends on the presence of the GLES2 API in the driver, while RGB-emulated imports the frame as a set of RGB images, one per component. In the end, more than a thousand lines of code were added to the glupload element, besides the code added to the EGL context object.
Fourth, and unexpectedly, waylandsink also got DMABuf caps negotiation support.
And lastly, the decoders in the va plugin merged their DMABuf caps negotiation support.
How can I test it?
You need, of course, to use the current main branch of GStreamer, since this is all fresh and there’s no release with it yet. Then you need a box with VA support. If you inspect, for example, vah264dec, you should see the new DMA_DRM caps if your box is Intel (AMD through Mesa is also supported, though the negotiated memory is linear so far).
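A quick way to check for them (a sketch; the exact caps depend on your driver):

$ gst-inspect-1.0 vah264dec | grep -A1 'video/x-raw(memory:DMABuf)'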
Long story short, last year we started to support Media Capture and Streams in WebKitGTK and WPE using GStreamer, whether for input devices (camera and microphone), desktop sharing, WebAudio, or web canvas. But this is just the first step. We are currently working on RTCPeerConnection, also using GStreamer, to share all these captured streams with other web peers. Meanwhile, we’ll wait for the second episode of Phil’s series 🙂
MediaRecorder
We worked on an initial implementation of MediaRecorder with GStreamer (1.20 or newer). The specification is about allowing a web browser to record a selected stream. For example, a voice-memo or video application which could encode and upload a capture of your microphone or camera.
Gamepad
While WebKitGTK already had Gamepad support, WPE lacked it. We did the implementation last year, and there’s a blog post about it: Gamepad in WPEWebKit, with a video showing a demo of it.
Capture encoded video streams from webcams
Some webcams only provide high-resolution frames encoded in H.264 or similar. In order to support those resolutions with such webcams, we added support for negotiating those formats and decoding them internally to handle the streams. Though we are just at the beginning of more efficient support.
Flatpak SDK maintenance
A lot of effort went into maintaining the Flatpak SDK for WebKit. It is a set of runtimes that allows reproducible builds of WebKit, independently of the Linux distribution used. Nowadays the Flatpak SDK is used in WebKit’s EWS, and by many developers.
Among all the features added during the year, we can highlight Rust support, a full integrity check before upgrading, and a way to override dependencies as local projects.
MSE/EME enhancements
As every year, massive work was done in the WebKit ports using GStreamer for Media Source Extensions and Encrypted Media Extensions, improving the user experience with different streaming services on the Web, such as Odysee, Amazon, DAZN, etc.
In the case of encrypted media, GStreamer-based WebKit ports provide the stubs to communicate with an external Content Decryption Module (CDM). If you’re willing to support this on your platform, you can reach out to us.
Also, we worked on a video demo showing how MSE/EME works on a Raspberry Pi 3 using WPE:
WebAudio demo
We also spent time recording video demos, such as this one, showing WebAudio using WPE on a desktop computer.
GStreamer
We managed to merge a lot of bug fixes in GStreamer, which in many cases can be harder to solve than implementing new features, though the latter are more interesting to talk about, such as those related to making Rust the main development language for GStreamer besides C.
Rust bindings and GStreamer elements for Vonage Video API / OpenTok
OpenTok is the legacy name of the Vonage Video API, a PaaS (Platform as a Service) that eases the development and deployment of WebRTC services and applications.
In the beginning there was webrtcbin, an element that implements the majority of the W3C RTCPeerConnection API. It’s so flexible and powerful that it’s rather hard to use for the most common cases. Then webrtcsink appeared, a wrapper around webrtcbin, written in Rust, which receives GStreamer streams to be offered and streamed to web peers. Later on, we developed webrtcsrc, the webrtcsink counterpart: an element whose source pads push streams from web peers, such as another browser, forwarding those Web streams as GStreamer ones in a pipeline. Both webrtcsink and webrtcsrc are written in Rust.
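For instance, a minimal webrtcsink sketch that makes a test stream available to web peers (assuming the companion signalling server from gst-plugins-rs is running with its default settings):

$ gst-launch-1.0 videotestsrc ! webrtcsink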
Behavior-Driven Development test framework for GStreamer
Behavior-Driven Development is gaining relevance with tools like Cucumber for Java and its domain-specific language, Gherkin, to define software behaviors. Rustaceans have picked up these ideas and developed cucumber-rs. The logical consequence was obvious: why not GStreamer?
Last year we tinkered with GStreamer-Cucumber, a BDD framework to define behavior tests for GStreamer pipelines.
GstValidate Rust bindings
There has been some discussion about whether BDD is the best way to test GStreamer pipelines; there’s also GstValidate, and last year we added its Rust bindings.
GStreamer Editing Services
Though not everything was Rust; we also worked hard on GStreamer’s nuts and bolts.
Last year, we gathered the team to hack on GStreamer Editing Services, particularly to explore adding OpenGL and DMABuf support, such as downloading or uploading a texture before processing, and selecting a proper filter to avoid those transfers.
GstVA and GStreamer-VAAPI
We helped with the maintenance of GStreamer-VAAPI and the development of its near replacement, GstVA, adding new elements such as the H.264 encoder, the compositor and the JPEG decoder, along with participating in the debate and code review of negotiating DMABuf streams in the pipeline.
Vulkan decoder and parser library for CTS
You might have heard that Vulkan has now integrated video decoding into its API, while encoding is still work in progress. We devoted time to helping Khronos with the Vulkan Video Conformance Tests (CTS), particularly with a parser based on GStreamer, and to developing an H.264 decoder in GStreamer using the Vulkan Video API.
You can check the presentation we gave at the last Vulkanised.
WPE Android Experiment
In a joint adventure with Igalia’s WebKit team, we did some experiments porting WPE to Android. This is just an internal proof of concept so far, but we are looking forward to seeing how this will evolve in the future, and what new possibilities it might open up.
If you have any questions about WebKit, GStreamer, Linux video stack, compilers, etc., please contact us.
Suppose that you have to hack on a GStreamer element which requires a library that is not (yet) packaged by your distribution, nor wrapped as a Meson subproject. What do you do?
For these cases, GStreamer’s uninstalled development scripts can use a special directory: gstreamer/prefix. As the README.md says:
NOTE: In the development environment, a fully usable prefix is also configured in gstreamer/prefix where you can install any extra dependency/project.
This means that the gst-env.py script (responsible for setting up the uninstalled development environment) will add:
gstreamer/prefix/bin in PATH for executable files.
gstreamer/prefix/lib and gstreamer/prefix/share/gstreamer-1.0 in GST_PLUGIN_PATH, for out-of-tree elements.
gstreamer/prefix/lib in GI_TYPELIB_PATH for GObject Introspection metadata.
gstreamer/prefix/lib/pkgconfig in PKG_CONFIG_PATH for third-party dependencies (our case!).
gstreamer/prefix/etc/xdg in XDG_CONFIG_DIRS for XDG-compliant configuration files.
gstreamer/prefix/lib and gstreamer/prefix/lib64 in LD_LIBRARY_PATH for third party libraries.
Therefore, the general idea is to compile those third-party libraries with gstreamer/prefix as their installation prefix.
In our case, the Vulkan repositories are interrelated, so they need to be compiled in a certain order. Also, for self-containment, we decided to clone them in gstreamer/subprojects.
Vulkan-Headers
$ cd ~/gst/gstreamer/subprojects
$ git clone git@github.com:KhronosGroup/Vulkan-Headers.git
$ cd Vulkan-Headers
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=/home/vjaquez/gst/gstreamer/prefix ..
$ cmake --build . --target install
Vulkan-Loader
$ cd ~/gst/gstreamer/subprojects
$ git clone git@github.com:KhronosGroup/Vulkan-Loader.git
$ cd Vulkan-Loader
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -DVULKAN_HEADERS_INSTALL_DIR=/home/vjaquez/gst/gstreamer/prefix -DCMAKE_INSTALL_PREFIX=/home/vjaquez/gst/gstreamer/prefix ..
$ cmake --build . --target install
Vulkan-Tools
$ cd ~/gst/gstreamer/subprojects
$ git clone git@github.com:KhronosGroup/Vulkan-Tools.git
$ cd Vulkan-Tools
$ mkdir build
$ cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -DVULKAN_HEADERS_INSTALL_DIR=/home/vjaquez/gst/gstreamer/prefix -DCMAKE_INSTALL_PREFIX=/home/vjaquez/gst/gstreamer/prefix ..
$ cmake --build . --target install
Right now we have the Vulkan headers and the Vulkan loader pkg-config file in place. And we should be able to compile GStreamer. Right?
Not exactly, because gst-env.py only sets the environment variables for the development environment, not for GStreamer compilation. But the solution is simple, since we have everything set up in the proper order: just set PKG_CONFIG_PATH when executing meson setup:
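Something along these lines (adjust the prefix to your own path):

$ PKG_CONFIG_PATH=/home/vjaquez/gst/gstreamer/prefix/lib/pkgconfig meson setup builddir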
It has been a while since I reported my tinkering with the Vulkan Video provisional extension. Now the specification will have its final release soonish, and there has also been more engagement within the open source communities, such as the work-in-progress FFmpeg implementation by Lynne (please, please, read that post), and the also work-in-progress Mesa 3D drivers, both for AMD and Intel, by Dave Airlie! Along with the well-known NVIDIA beta drivers for Vulkan.
From our side, we have been trying to provide an open source alternative to the video parser used by the Conformance Test Suite and the NVIDIA vk_video_samples, using GStreamer: GstVkVideoParser, which intends to be a drop-in replacement for the current proprietary parser library.
Along the way, we have sketched out Vulkan Video support in gfxreconstruct, for getting traces of the API usage. Sadly, it’s kind of bit-rotten right now, even more so because the specification has changed since then.
Regarding the H.264 decoder for GStreamer, we have just restarted hacking on it. The merge request was moved to the monorepo, but for the sake of the much-needed complete rewrite, we changed the branch to this one (vkh264dec). We needed to rewrite it because, besides the specification updates, we have learned many things along the journey, such as the out-of-band parameter updates, Vulkan’s recommendation to pre-allocate memory as much as possible, the DPB/references handling, the debate about buffer vs. slice uploading, and other friction points that Lynne has spotted for future early adopters.
The way to compile it is to grab the branch and compile GStreamer with Meson as usual:
meson setup builddir -Dgst-plugins-bad:vulkan-video=enabled --buildtype=debug
ninja -C builddir
Our objective is to have a functional demo for the next Vulkanised in February. We are very ambitious: we want it to work on Linux and Windows, and on as many GPUs as possible. Wish us luck. And happy December festivities!
It started with an early development done by Eugene Mutavchi (kudos!). Later, by the end of 2021, I retook those patches and discussed them with my fellow Igalian Adrián, and we decided to take a slightly different approach.
Before going into the details, let’s quickly review the WPE architecture:
cog library — a shell library that simplifies the task of writing a WPE browser from scratch, by providing common functionality and helper APIs.
WebKit library — the web engine that, given a URI and other inputs, returns, among other outputs, graphic buffers with the rendered page.
WPE library — the API that bridges cog (1) (or any other browser application) and WebKit (2).
WPE backend — its main duty is to provide WebKit with graphic buffers supported by the hardware, the operating system, the windowing system, etc.
Eugene’s implementation had code in WebKit (implementing gamepad support for the WPE port) and code in the WPE library, with an API to communicate WebKit’s gamepad with the WPE backend, which provided a custom gamepad implementation that read the events directly from the Linux device. Almost everything was there, but there were some issues:
The WPE backend is mainly designed as a set of protocols, similar to Wayland, to deal with graphic or audio buffers, but not with input events. The cog library is the place where input events, such as keyboard events, are handled and injected into WebKit.
The gamepad handling in the WPE backend was ad hoc and low level, reading the events directly from the Linux devices. This approach is problematic since there are plenty of gamepads on the market and each has its own axes and buttons, so remapping them to the standard mapping is required. To overcome this issue and many others, there’s a GNOME library, libmanette, which is already used by the WebKitGTK port.
Today’s status of the gamepad support is that it works but it’s not yet fully upstreamed.
cog pull request — there are two implementations: none and libmanette. None is just a dummy implementation which ignores any request for a gamepad provider; it’s used if libmanette is not available or if the available libwpe has no gamepad support.
To prove to you all that it works, my exhibit A is this video, where I play Asteroids on a Raspberry Pi 4 (64 bits):
The image was built with Buildroot, using its master branch (from a week ago) with a bunch of modifications, such as adding libmanette, a kernel patch for my gamepad device, kernel 5.15.55 with its corresponding firmware, etc.
There are, right now, three new GstVA elements merged in main: vah264enc, vacompositor and vajpegdec.
Just to recap, GstVA is a GStreamer plugin in gst-plugins-bad (yes, we agree it’s not a great name anymore), so called to differentiate it from gstreamer-vaapi. Both plugins use libva to access stateless video processing operations; the main difference is, precisely, how the stream’s state is handled: while GstVA uses GStreamer libraries shared with other hardware-accelerated plugins (such as d3d11 and v4l2codecs), gstreamer-vaapi uses an internal, tightly coupled and convoluted library.
Also, note that right now (release 1.20) GstVA elements are ranked NONE, while gstreamer-vaapi ones are mostly PRIMARY+1.
Back to the three new elements in GstVA: the most complex one is vah264enc, written almost completely by He Junyan, from Intel. For it, he had to write an H.264 bitwriter, which is, to a certain extent, the opposite of the H.264 parser: it constructs the bitstream buffer from H.264 structures such as PPS, SPS, slice headers, etc. This API is part of libgstcodecparsers, ready to be reused by other plugins or applications. Currently vah264enc is fairly complete and functional, dealing with profiles and rate controls, among other parameters. It still has rough spots, but we’re working on them. And He Junyan is restless: he already has in the pipeline a common encoder base class, along with HEVC and AV1 encoders.
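For example, a minimal sketch that encodes a test stream into an MP4 file (assuming your driver exposes H.264 encoding):

$ gst-launch-1.0 videotestsrc num-buffers=300 ! video/x-raw,format=NV12 ! vah264enc ! h264parse ! mp4mux ! filesink location=test.mp4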
The second element is vacompositor, written by Artie Eoff. It’s the replacement for vaapioverlay in gstreamer-vaapi. The compositor suffix is preferred, following the name of the primary (software-based) video mixing element: compositor, successor of videomixer. See this discussion for further details. The purpose of this element is to compose a single video stream from multiple video streams. It works with Intel’s media-driver, supporting the alpha channel, and also with AMD’s Mesa Gallium driver, but without the alpha channel (in other words, without a custom degree of transparency).
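A sketch of its usage, assuming vacompositor exposes the same request-pad properties as compositor (xpos, ypos, alpha):

$ gst-launch-1.0 vacompositor name=comp sink_1::xpos=320 ! videoconvert ! autovideosink \
    videotestsrc ! video/x-raw,width=320,height=240 ! comp. \
    videotestsrc pattern=ball ! video/x-raw,width=320,height=240 ! comp.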
The last, but not least, element is vajpegdec, which I worked on. The main issue was not the decoder itself, but jpegparse, which didn’t signal the image caps required by the hardware-accelerated decoders. For instance, VA only decodes images with SOF marker 0 (Baseline DCT). It wasn’t needed before because the main and only consumer of the parser was jpegdec, which deals with any type of JPEG image. Long story short, we revamped jpegparse and now it signals the SOF marker, the color space (YUV, RGB, etc.) and the chroma subsampling (if it has a YUV color space), along with comments and EXIF-like metadata as pipeline tags. Thus vajpegdec exposes in its caps template the color space and chroma subsampling supported by the driver. For example, Intel supports (more or less) the RGB color space, while AMD’s Mesa Gallium doesn’t.
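For instance, to display a Baseline JPEG with the decoder (photo.jpg is a hypothetical file name):

$ gst-launch-1.0 filesrc location=photo.jpg ! jpegparse ! vajpegdec ! videoconvert ! imagefreeze ! autovideosink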
Since the switch to the GStreamer mono repository, gst-build has been deprecated. The mechanism in WebKit was added, basically, to allow GStreamer upstream development, so keeping the gst-build directory just polluted the conceptual framework.
By using gst-build one could override almost any other package in the WebKit SDK. For example, while developing the gamepad handling in WPE, I added libmanette as a GStreamer subproject, to link against a modified version of the library rather than the one in Flatpak. But that approach added an unneeded conceptual depth in the tree.
In order to simplify these operations by taking advantage of Meson’s subproject support directly, the gst-build handling was removed and a new mechanism was put in place: local dependencies. With local dependencies, you can add or override almost any dependency while flattening the tree layout, by placing GStreamer and any other library at the same level. Of course, in order to add dependencies, they must be built with Meson.
For example, to override libsoup and GStreamer, just clone both repositories below Tools/flatpak/local-projects/subprojects, and declare them in the WEBKIT_LOCAL_DEPS environment variable:
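A sketch, run from the WebKit checkout root and assuming WEBKIT_LOCAL_DEPS takes a comma-separated list of the cloned subproject names:

$ git clone https://gitlab.gnome.org/GNOME/libsoup.git Tools/flatpak/local-projects/subprojects/libsoup
$ git clone https://gitlab.freedesktop.org/gstreamer/gstreamer.git Tools/flatpak/local-projects/subprojects/gstreamer
$ export WEBKIT_LOCAL_DEPS=libsoup,gstreamer
$ ./Tools/Scripts/build-webkit --gtk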
As you may know, the development environment used by WebKitGTK and WPE is based on Flatpak. Hacking software within Flatpak feels to me like teleoperating a rover on Mars, since I have to go through Flatpak commands to execute the commands I actually want to execute. The learning curve is steeper, in exchange for a common development environment.
I started to work on another project which requires an NVIDIA GPU, without stopping my work on WebKitGTK/WPE. So I needed to use the card within Flatpak, and it’s well known that, currently, that setup is not available out of the box. Furthermore, I have to use a very specific version of the graphics card driver for Vulkan.
As TingPing explained, Flatpak does not use host libraries; that’s why it might need runtimes and extensions for specific hardware setups, with the user-space libraries, such as the NVIDIA GL platform runtime. And it must have the same version as the driver running in the kernel.
The NVIDIA GL platform extension is a small project which generates Flatpak runtimes for every public NVIDIA driver. The interesting part is that those runtimes are not created at build time, but at install time. When the user installs the runtime, a driver blob is downloaded from NVIDIA’s servers (see --extra-data in flatpak build-finish for reference), and a small program is executed which extracts the embedded tarball from the blob and, from it, extracts the required libraries. In a few words, initially the runtime is composed only of a definition of the file to download, and the small program that populates the Flatpak filesystem at install time.
The trick here, which took me a while to realize, is that this small program has to be statically compiled, since it has to be executed regardless of the available runtime.
This little program uses libarchive to extract the libraries from NVIDIA’s tarball, but libarchive is not available statically in any Flatpak SDK. Furthermore, our use of libarchive depends on libz and liblzma, both statically compiled as well. Gladly, there’s only one very old, obsolete version of the freedesktop SDK which offers static versions of libz and liblzma: 1.6. And that’s why org.freedesktop.Platform.GL.nvidia demands that specific old version of the SDK. Thus, the manifest of the extension basically contains the static compilation of libarchive and the static compilation of the apply_extra program described above.
I needed to modify the org.freedesktop.Platform.GL.nvidia sources a bit, since, by default, they consist of a big loop of downloading, hashing, templating a JSON manifest, and building, for every supported driver. But, as my case is just one custom driver, I didn’t want to waste time in that loop. The hack to achieve it is fairly simple:
But in order to make it work, it needs a file in the data/ directory with the specification of the file to download, in the format: NAME:SHA256:DOWNLOAD-SIZE:INSTALL-SIZE:URL.
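A hypothetical entry would look like this (with placeholders instead of the real hash and sizes):

nvidia-510-47-03:&lt;sha256-of-the-blob&gt;:&lt;download-size&gt;:&lt;install-size&gt;:https://us.download.nvidia.com/XFree86/Linux-x86_64/510.47.03/NVIDIA-Linux-x86_64-510.47.03.run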
One way to verify that the libraries are installed correctly and match the driver running in the host’s kernel is to install and run GreenWithEnvy:
$ flatpak install com.leinardi.gwe
$ flatpak run com.leinardi.gwe
If you want to install the driver in your WebKit development environment, you just need to set the environment variable FLATPAK_USER_DIR:
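A sketch with placeholders, assuming the default layout where WebKit’s SDK lives in WebKitBuild/UserFlatpak inside your checkout:

$ export FLATPAK_USER_DIR=~/WebKit/WebKitBuild/UserFlatpak
$ flatpak install &lt;your-local-repo&gt; org.freedesktop.Platform.GL.nvidia-&lt;version&gt;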
It was a year and a half ago when I announced a new VA-API H.264 decoder element in gst-plugins-bad, and it was bundled in the GStreamer 1.18 release a couple of months later. Since then, we have been working on adding more decoders and filters, fixing bugs, and enhancing its design. I wanted to publish this blog post as soon as release 1.20 was announced, but, since the development window is already closed, which means no more new features will be included, I’ll publish it now, to create buzz around the next GStreamer release.
Here’s the list of new GstVA decoders (of course, they are only available if your driver supports them):
vah265dec
vavp8dec
vavp9dec
vaav1dec
vampeg2dec
Also, there are a couple of new features in vah264dec (common to all gstcodecs-based H.264 decoders):
Support for interlaced streams (in vah265dec and vampeg2dec too).
A new compliance property to tweak the specification conformance, to lower the latency, for example, or to enable non-standard features.
But it’s not only decoders; there are also two new elements for post-processing:
vapostproc
vadeinterlace
vapostproc is similar to vaapipostproc but without the deinterlacing operation, since that was moved to another element. The reason is that there are deinterlacing methods which require holding a list of reference frames; these methods are thus broken in vaapipostproc, and adding them would have increased the complexity of the element needlessly. To keep things simple, it’s better to handle deinterlacing in a different element.
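For example, a sketch of hardware scaling with vapostproc:

$ gst-launch-1.0 videotestsrc ! vapostproc ! video/x-raw,width=1920,height=1080 ! autovideosink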
This is the list of filters and features supported by vapostproc:
vadeinterlace, in turn, only does deinterlacing. But it supports all the methods currently available in the VA-API specification, using the new way to select the field to extract, since the old one (used by GStreamer-VAAPI and FFmpeg) is a bit more expensive.
Finally, both video filters are configured in passthrough mode if they cannot handle the incoming format.
But there are not only new elements, there’s also a new library!
Since many other elements need to share a common VADisplay in the GStreamer pipeline, the new library exposes only the GstVaDisplay object for now. The new library must be thin and lean, exposing only what is requested by other elements, such as gst-msdk. We still have to merge, after 1.20, the addition of GstContext helpers, for example, and the plan is to expose the allocators and buffer pools later.
As I said in the previous blog post, all these elements are ranked NONE, so they won’t be autoplugged, for example by playbin. To use them, you need to export the environment variable GST_PLUGIN_FEATURE_RANK, as documented.
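For example, to force the VA decoders with gst-play-1.0 (video.mkv is a hypothetical media file):

$ GST_PLUGIN_FEATURE_RANK=vah264dec:MAX,vah265dec:MAX gst-play-1.0 video.mkv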
Thanks a bunch to He Junyan, Seungha Yang and Nicolas Dufresne, for all the effort and care.
Still, the to-do list is large enough. Just to share what I have in my notes:
Add a new upload method in glupload to interoperate with VA surfaces — though this will hardly be merged, since it creates a circular dependency between -base and -bad.
vavc1dec — it might need a rewrite of vc1parse.
vajpegdec — it needs a rewrite of jpegparse.
vaalphacombine — to decode the alpha channel with VA within vp9alphadecodebin and vp8alphadecodebin.
vamixer — similar to compositor, glmixer or vaapioverlay; to compose a single frame from different video streams.
And encoders (mainly H.264 and H.265).
As a final note, GStreamer-VAAPI has entered maintenance mode. The general plan, without any promises or dates, is to deprecate it when most of its use cases are covered by GstVA.