Review of Igalia Multimedia activities (2022)

We, Igalia’s multimedia team, would like to share with you our list of achievements along the past 2022.

WebKit Multimedia

WebRTC

Phil already wrote a first blog post, of a series, on this regard: WebRTC in WebKitGTK and WPE, status updates, part I. Please, be sure to give it a glance, it has nice videos.

Long story short, last year we started to support Media Capture and Streams in WebKitGTK and WPE using GStreamer, either for input devices (camera and microphone), desktop sharing, webaudio, and web canvas. But this is just the first step. We are currently working on RTCPeerConnection, also using GStreamer, to share all these captured streams with other web peers. Meanwhile, we’ll wait for the second episode of Phil’s series 🙂

MediaRecorder

We worked in an initial implementation of MediaRecorder with GStreamer (1.20 or superior). The specification goes about allowing a web browser to record a selected stream. For example, a voice-memo or video application which could encode and upload a capture of your microphone / camera.

Gamepad

While WebKitGTK already has Gamepad support, WPE lacked it. We did the implementation last year, and there’s a blog post about it: Gamepad in WPEWebkit, with video showing a demo of it.

Capture encoded video streams from webcams

Some webcams only provide high resolution frames encoded in H.264 or so. In order to support these resolutions with those webcams we added the support for negotiate of those formats and decode them internally to handle the streams. Though we are just at the beginning of more efficient support.

Flatpak SDK maintenance

A lot of effort went to maintain the Flatpak SDK for WebKit. It is a set of runtimes that allows to have a reproducible build of WebKit, independently of the used Linux distribution. Nowadays the Flatpak SDK is used in Webkit’s EWS, and by many developers.

Among all the features added during the year we can highlight added Rust support, a full integrity check before upgrading, and offer a way to override dependencies as local projects.

MSE/EME enhancements

As every year, massive work was done in WebKit ports using GStreamer for Media Source Extensions and Encrypted Media Extensions, improving user experience with different streaming services in the Web, such as Odysee, Amazon, DAZN, etc.

In the case of encrypted media, GStreamer-based WebKit ports provide the stubs to communicate with an external Content Decryption Module (CDM). If you’re willing to support this in your platform, you can reach us.

Also we worked in a video demo showing how MSE/EME works in a Raspberry Pi 3 using WPE:

WebAudio demo

We also spent time recording video demos, such as this one, showing WebAudio using WPE on a desktop computer.

GStreamer

We managed to merge a lot of bug fixes in GStreamer, which in many cases can be harder to solve rather than implementing new features, though former are more interesting to tell, such as those related with making Rust the main developing language for GStreamer besides C.

Rust bindings and GStreamer elements for Vonage Video API / OpenTok

OpenTok is the legacy name of Vonage Video API, and is a PaaS (Platform As a Service) to ease the development and deployment of WebRTC services and applications.

We published our work in Github of Rust bindings both for the Client SDK for Linux and the Server SDK using REST API, along with a GStreamer plugin to publish and subscribe to video and audio streams.

GstWebRTCSrc

In the beginning there was webrtcbin, an element that implements the majority of W3C RTCPeerConnection API. It’s so flexible and powerful that it’s rather hard to use for the most common cases. Then appeared webrtcsink, a wrapper of webrtcbin, written in Rust, which receives GStreamer streams which will be offered and streamed to web peers. Later on, we developed webrtcsrc, the webrtcsink counterpart: an element which source pads push streams from web peers, such as another browser, and forward those Web streams as GStreamer ones in a pipeline. Both webrtcsink and webrtcsrc are written in Rust.

Behavior-Driven Development test framework for GStreamer

Behavior-Driven Development is gaining relevance with tools like Cucumber for Java and its domain specific language, Gherkin to define software behaviors. Rustaceans have picked up these ideas and developed cucumber-rs. The logical consequence was obvious: Why not GStreamer?

Last year we tinkered with GStreamer-Cucumber, a BDD to define behavior tests for GStreamer pipelines.

GstValidate Rust bindings

There have been some discussion if BDD is the best way to test GStreamer pipelines, and there’s GstValidate, and also, last year, we added its Rust bindings.

GStreamer Editing Services

Though not everything was Rust. We work hard on GStreamer’s nuts and bolts.

Last year, we gathered the team to hack GStreamer Editing Services, particularly to explore adding OpenGL and DMABuf support, such as downloading or uploading a texture before processing, and selecting a proper filter to avoid those transfers.

GstVA and GStreamer-VAAPI

We helped in the maintenance of GStreamer-VAAPI and the development of its near replacement: GstVA, adding new elements such as the H.264 encoder, the compositor and the JPEG decoder. Along with participation on the debate and code reviewing of negotiating DMABuf streams in the pipeline.

Vulkan decoder and parser library for CTS

You might have heard about Vulkan has now integrated in its API video decoding, while encoding is currently work-in-progress. We devoted time on helping Khronos with the Vulkan Video Conformance Tests (CTS), particularly with a parser based on GStreamer and developing a H.264 decoder in GStreamer using Vulkan Video API.

You can check the presentation we did last Vulkanised.

WPE Android Experiment

In a joint adventure with Igalia’s Webkit team we did some experiments to port WPE to Android. This is just an internal proof of concept so far, but we are looking forward to see how this will evolve in the future, and what new possibilities this might open up.

If you have any questions about WebKit, GStreamer, Linux video stack, compilers, etc., please contact us.

Video decoding in GStreamer with Vulkan Video extension (part 2)

Its has been a while since I reported my tinkering with the Vulkan Video provisional extension. Now the specification will have its final release soonish, and also there has been more engagement within the open source communities, such as the work-in-progress FFMpeg implementation by Lynne (please, please, read that post), and the also work-in-progress Mesa 3D drivers both for AMD and Intel by Dave Airlie! Along with the well known NVIDIA beta drivers for Vulkan.

From our side, we have been trying to provide an open source alternative to the video parser used by the Conformance Test Suite and the NVIDIA
vk_video_samples, using GStreamer: GstVkVideoParser, which intends to be a drop-in replacement of the current proprietary parser library.

Along the way, we have sketched the Vulkan Video support in
gfxreconstruct
, for getting traces of the API usage. Sadly, its kind of bit-rotten right now, even more because the specification has changed since then.

Regarding the H.264 decoder for GStreamer, we just restarted its hacking. The merge request was moved to monorepo, but for the sake of the well needed complete re-write, we changed the branch to this one (vkh264dec). We needed to re-write it because, besides the specification updates, we have learned many things along the journey, such as the out-of-band parameters update, Vulkan’s recommendation for memory pre-allocation as much as possible, the DPB/references handling, the debate about buffer vs. slice uploading, and other friction points that Lynne has spotted for future early adopters.

The way to compile it is grab the branch and compile as usually GStreamer is compiled with meson:

meson setup builddir -Dgst-plugins-bad:vulkan-video=enabled --buildtype=debug
ninja C builddir

And run simple pipelines such as

gst-launch-1.0 filesrc location=INPUT ! parsebin ! vulkanh264dec ! fakesink -v

Our objective is to have a functional demo for the next Vulkanised in
February
. We are very ambitious, we want it to work in Linux, Windows and in many GPU as possible. Wish us luck. And happy December festivities!

Video decoding in GStreamer with Vulkan

Warning: Vulkan video is still work in progress, from specification to available drivers and applications. Do not use it for production software just yet.

Introduction

Vulkan is a cross-platform Application Programming Interface (API), backed by the Khronos Group, aimed at graphics developers for a wide range of different tasks. The interface is described by a common specification, and it is implemented by different drivers, usually provided by GPU vendors and Mesa.

One way to visualize Vulkan, at first glance, is like a low-level OpenGL API, but better described and easier to extend. Even more, it is possible to implement OpenGL on top of Vulkan. And, as far as I am told by my peers in Igalia, Vulkan drivers are easier and cleaner to implement than OpenGL ones.

A couple years ago, a technical specification group (TSG), inside the Vulkan Working Group, proposed the integration of hardware accelerated video compression and decompression into the Vulkan API. In April 2021 the formed Vulkan Video TSG published an introduction to the
specification
. Please, do not hesitate to read it. It’s quite good.

Matthew Waters worked on a GStreamer plugin using Vulkan, mainly for uploading, composing and rendering frames. Later, he developed a library mapping Vulkan objects to GStreamer. This work was key for what I am presenting here. In 2019, during the last GStreamer Conference, Matthew delivered a talk about his work. Make sure to watch it, it’s worth it.

Other key components for this effort were the base classes for decoders and the bitstream parsing libraries in GStreamer, jointly developed by Intel, Centricular, Collabora and Igalia. Both libraries allow using APIs for stateless video decoding and encoding within the GStreamer framework, such as Vulkan Video, VAAPI, D3D11, and so on.

When the graphics team in Igalia told us about the Vulkan Video TSG, we decided to explore the specification. Therefore, Igalia decided to sponsor part of my time to craft a GStreamer element to decode H.264 streams using these new Vulkan extensions.

Assumptions

As stated at the beginning of this text, this development has to be considered unstable and the APIs may change without further notice.

Right now, the only Vulkan driver that offers these extensions is the beta NVIDIA driver. You would need, at least, version 455.50.12 for Linux, but it would be better to grab the latest one. And, of course, I only tested this on Linux. I would like to thank NVIDIA for their Vk Video samples. Their test application drove my work.

Finally, this work assumes the use of the main development branch of GStreamer, because the base classes for decoders are quite recent. Naturally, you can use gst-build for an efficient upstream workflow.

Work done

This work basically consists of two new objects inside the GstVulkan code:

  • GstVulkanDeviceDecoder: a GStreamer object in GstVulkan library, inherited from GstVulkanDevice, which enables VK_KHR_video_queue and VK_KHR_video_decode_queue extensions. Its purpose is to handle codec-agnostic operations.

  • vulkanh264dec: a GStreamer element, inherited from GstH264Decoder, which tries to instantiate a GstVulkanDeviceDecoder to composite it and is in charge of handling codec-specific operations later, such as matching the parsed structures. It outputs, in the source pad, memory:VulkanImage featured frames, with NV12 color format.

So far this pipeline works without errors:

$ gst-launch-1.0 filesrc location=big_buck_bunny_1080p_h264.mov ! parsebin ! vulkanh264dec ! fakesink

As you might see, the pipeline does not use vulkansink to render frames. This is because the Vulkan format output by the driver’s decoder device is VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, which is NV12 crammed in a single image, while for GstVulkan a NV12 frame is a buffer with two images, one per component. So the current color conversion in GstVulkan does not support this Vulkan format. That is future work, among other things.

You can find the merge request for this work in GStreamer’s Gitlab.

Future work

As was mentioned before, it is required to fully support VK_FORMAT_G8_B8R8_2PLANE_420_UNORM format in GstVulkan. That requires thinking about how to keep backwards compatibility. Later, an implementation of the sampler to convert this format to RGB will be needed, so that decoded frames can be rendered by vulkansink.

Also, before implementing any new feature, the code and its abstractions will need to be cleaned up, since currently the division between codec-specific and codec-agnostic code is not strict, and it must be fixed.

Another important cleanup task is to enhance the way the Vulkan headers are handled. Since the required headers files for video extensions are beta, they are not expected to be available in the system, so temporally I had to add the those headers as part of the GstVulkan library.

Then it will be possible to implement the H.265 decoder, since the NVIDIA driver also supports it.

Later on, it will be nice to start thinking about encoders. But this requires extending support for stateless encoders in GStreamer, something I want do to for the new VAAPI plugin too.

Thanks for bearing with me, and thanks to Igalia for sponsoring this work.