Video decoding in GStreamer with Vulkan

Warning: Vulkan video is still work in progress, from specification to available drivers and applications. Do not use it for production software just yet.

Introduction

Vulkan is a cross-platform Application Programming Interface (API), backed by the Khronos Group, aimed at graphics developers for a wide range of different tasks. The interface is described by a common specification, and it is implemented by different drivers, usually provided by GPU vendors and Mesa.

One way to visualize Vulkan, at first glance, is like a low-level OpenGL API, but better described and easier to extend. Even more, it is possible to implement OpenGL on top of Vulkan. And, as far as I am told by my peers in Igalia, Vulkan drivers are easier and cleaner to implement than OpenGL ones.

A couple of years ago, a technical specification group (TSG) inside the Vulkan Working Group proposed the integration of hardware-accelerated video compression and decompression into the Vulkan API. In April 2021 the newly formed Vulkan Video TSG published an introduction to the specification. Please, do not hesitate to read it. It’s quite good.

Matthew Waters worked on a GStreamer plugin using Vulkan, mainly for uploading, composing and rendering frames. Later, he developed a library mapping Vulkan objects to GStreamer. This work was key for what I am presenting here. In 2019, during the last GStreamer Conference, Matthew delivered a talk about his work. Make sure to watch it, it’s worth it.

Other key components for this effort were the base classes for decoders and the bitstream parsing libraries in GStreamer, jointly developed by Intel, Centricular, Collabora and Igalia. Both libraries allow using APIs for stateless video decoding and encoding within the GStreamer framework, such as Vulkan Video, VAAPI, D3D11, and so on.

When the graphics team in Igalia told us about the Vulkan Video TSG, we decided to explore the specification. Therefore, Igalia decided to sponsor part of my time to craft a GStreamer element to decode H.264 streams using these new Vulkan extensions.

Assumptions

As stated at the beginning of this text, this development has to be considered unstable and the APIs may change without further notice.

Right now, the only Vulkan driver that offers these extensions is the beta NVIDIA driver. You would need, at least, version 455.50.12 for Linux, but it would be better to grab the latest one. And, of course, I only tested this on Linux. I would like to thank NVIDIA for their Vk Video samples. Their test application drove my work.

Finally, this work assumes the use of the main development branch of GStreamer, because the base classes for decoders are quite recent. Naturally, you can use gst-build for an efficient upstream workflow.
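For reference, a typical gst-build setup looks roughly like this (just a sketch; check gst-build’s own documentation for the details):

$ git clone https://gitlab.freedesktop.org/gstreamer/gst-build.git
$ cd gst-build
$ meson build
$ ninja -C build
$ ninja -C build uninstalled   # spawns a shell with the freshly built GStreamer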

Work done

This work basically consists of two new objects inside the GstVulkan code:

  • GstVulkanDeviceDecoder: a GStreamer object in the GstVulkan library, inherited from GstVulkanDevice, which enables the VK_KHR_video_queue and VK_KHR_video_decode_queue extensions. Its purpose is to handle codec-agnostic operations.

  • vulkanh264dec: a GStreamer element, inherited from GstH264Decoder, which tries to instantiate a GstVulkanDeviceDecoder to compose with it, and is in charge of the codec-specific operations later, such as matching the parsed structures. It outputs, on its source pad, memory:VulkanImage featured frames, in NV12 color format.

So far this pipeline works without errors:

$ gst-launch-1.0 filesrc location=big_buck_bunny_1080p_h264.mov ! parsebin ! vulkanh264dec ! fakesink

As you might see, the pipeline does not use vulkansink to render frames. This is because the Vulkan format output by the driver’s decoder device is VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, which is NV12 crammed in a single image, while for GstVulkan a NV12 frame is a buffer with two images, one per component. So the current color conversion in GstVulkan does not support this Vulkan format. That is future work, among other things.

You can find the merge request for this work in GStreamer’s Gitlab.

Future work

As was mentioned before, it is required to fully support VK_FORMAT_G8_B8R8_2PLANE_420_UNORM format in GstVulkan. That requires thinking about how to keep backwards compatibility. Later, an implementation of the sampler to convert this format to RGB will be needed, so that decoded frames can be rendered by vulkansink.

Also, before implementing any new feature, the code and its abstractions will need to be cleaned up, since currently the division between codec-specific and codec-agnostic code is not strict, and it must be fixed.

Another important cleanup task is to enhance the way the Vulkan headers are handled. Since the required header files for the video extensions are beta, they are not expected to be available in the system, so temporarily I had to add those headers as part of the GstVulkan library.

Then it will be possible to implement the H.265 decoder, since the NVIDIA driver also supports it.

Later on, it will be nice to start thinking about encoders. But this requires extending support for stateless encoders in GStreamer, something I want to do for the new VAAPI plugin too.

Thanks for bearing with me, and thanks to Igalia for sponsoring this work.

Review of Igalia Multimedia activities (2020/H2)

As the first quarter of 2021 has already come to a close, we reckon it’s time to recap our achievements from the second half of 2020, and update you on the improvements we have been making to the multimedia experience on the Web and Linux in general.

Our previous reports:

WPE / WebKitGTK

We have closed ~100 issues related to multimedia in WebKitGTK/WPE, such as fixing seek issues during playback, plugging memory leaks, gardening tests, improving the Flatpak-based development workflow, enabling new codecs, etc. Overall, we improved the multimedia user experience on these WebKit engine ports a bit.

To highlight a couple of tasks: we did some maintenance work on the WebAudio backends, and we upstreamed an internal audio mixer, which combines all streams into a single connection to the audio server, such as PulseAudio, instead of multiple connections, one for every audio resource.

Adaptive media streaming for the Web (MSE)

We have been working on a new MSE backend for a while, and along the way many related bugs have appeared and were squashed. Also, many code cleanups have been carried out. Though it has been like yak shaving, we are confident that we will reach the end of this long and winding road soonish.

DRM media playback for the Web (EME)

Regarding digitally protected media playback, we worked on upstreaming OpenCDM support with Widevine, through RDK’s Thunder framework, while continuing the usual maintenance of the other key systems, such as Clear Key, Widevine and PlayReady.

For more details we published a blog post: Serious Encrypted Media Extensions on GStreamer based WebKit ports.

Realtime communications for the Web (WebRTC)

Just as with EME, WebRTC is not currently enabled by default in browsers such as Epiphany because of licensing problems, but it is available for custom adopters, and we are maintaining it. For example, we collaborated to upgrade LibWebRTC to M87 and fixed the expected regressions, plus the usual gardening.

Along the way we experimented a bit with the new GPUProcess for capture devices, but we decided to stop the experimentation while waiting for a broader adoption of the process, for example in graphics rendering, in WPE/WebKitGTK.

GPUProcess work will be resumed at some point, but it’s not currently a hard requirement, since we have already moved capture devices handling from the UIProcess to the WebProcess, isolating all GStreamer operations in the latter.

GStreamer

GStreamer is one of our core multimedia technologies, and we contribute to it on a daily basis. We pushed ~400 commits, with a similar number of code reviews, along the second half of 2020. Among those contributions, let us highlight the following list:

  • A lot of bug fixing aiming for release 1.18.
  • Reworked and enhanced decodebin3, the GstTranscoder API and encodebin.
  • Merged av1parse in video parsers plugin.
  • Merged qroverlay plugin.
  • Iterated on the mono-repo proposal, which requires consensus and coordination among the whole community.
  • The gstwpe element has been greatly improved, driven by new user requests.
  • Contributed to the new libgstcodecs library, which enables stateless video decoders across different platforms (for example, v4l2, d3d11, va, etc.).
  • Developed a new plugin for VA-API using this library, exposing H.264, H.265, VP9, VP8 and MPEG2 decoders and a full featured postprocessor, with better performance, according to our measurements, than GStreamer-VAAPI.

Conferences

Although 2020 was not a year for conferences, many of them went virtual. We attended one, the Mile High Video conference, and participated in the Slack workspace.

Thank you for reading this report, and stay tuned for more of our work.

Notes on using Emacs (LSP/ccls) for WebKit

I used to regard myself as an austere programmer in terms of tooling: Emacs —with a plain configuration— and grep. This approach forces you to understand all the elements involved in a project.

Some time ago I had to code in Rust, so I needed to learn the language as fast as possible. I looked for packages in MELPA that could help me to be productive quickly. Obviously, I installed rust-mode, but I also found racer for auto-completion. I tried it out. It was messy to set up and unstable, but it helped me to code while learning. When I felt comfortable with the code base, I uninstalled it.

This year I returned to work on WebKit. The last time I contributed to it was around five years ago, but now in a different area (still in the multimedia stack). WebKit is huge, and because of C++, I found gtags rather limited. Out of curiosity I looked for something similar to racer but for C++. And I spent a while digging into it.

The solution consists of the integration of three MELPA packages:

  • lsp-mode: a client for Language Server Protocol for Emacs.
  • company-mode: a text completion framework.
  • ccls: a C/C++ language server. Besides, emacs-ccls adds more functionality to lsp-mode.

(I know, there’s a simpler alternative to lsp-mode, but I haven’t tried it yet).

First we should explain what LSP is. It stands for Language Server Protocol, defined with JSON-RPC messages exchanged between the editor and the language server. It was originally developed by Microsoft for Visual Studio Code, and its purpose is to support auto-completion, finding symbols’ definitions, showing early error markers, etc., inside the editor. Therefore, lsp-mode is an Emacs mode that communicates with different language servers in LSP and operates in Emacs accordingly.

In order to support the auto-completion use-case, lsp-mode uses company-mode. This Emacs mode is capable of creating a floating context menu where the editing cursor is placed.

The third part of the puzzle is, of course, the language server. There are language servers for different programming languages. For C & C++ there are two: clangd and ccls. The former uses the Clang compiler, while the latter can use either Clang, GCC or MSVC. Throughout this text ccls will be used, for reasons exposed later. In between, emacs-ccls leverages and extends the support for ccls in lsp-mode, though it’s not mandatory.

In short, the basic .emacs configuration, using use-package, would have these lines:


(use-package company
  :diminish
  :config (global-company-mode 1))

(use-package lsp-mode
  :diminish "L"
  :init (setq lsp-keymap-prefix "C-l"
              lsp-enable-file-watchers nil
              lsp-enable-on-type-formatting nil
              lsp-enable-snippet nil)
  :hook (c-mode-common . lsp-deferred)
  :commands (lsp lsp-deferred))

(use-package ccls
  :init (setq ccls-sem-highlight-method 'font-lock)
  :hook ((c-mode c++-mode objc-mode) . (lambda () (require 'ccls) (lsp-deferred))))

The snippet first configures company-mode. It is enabled globally because, normally, it is a nice feature to have, even in non-coding buffers, such as this very one, for writing a blog post in markdown format. Diminish mode hides or abbreviates the mode description in the Emacs’ mode line.

Later comes lsp-mode. It’s big and aims to do a lot of things, so basically we have to tell it to disable certain features: the file watcher, something not viable in massive projects such as WebKit; snippets (generic text templates), since I don’t use them; and on-type formatting, since lsp-mode tries to format the code while typing, and I don’t know how the code style is figured out, but in my experience it’s always detected wrong, so I disabled it too. Then, lsp-mode is launched when a buffer uses c-mode-common, shared by c++-mode too. It is launched deferred, meaning it won’t start up until the buffer is visible; this is important since we want to delay the ccls session creation until the buffer’s .dir-locals.el file is processed, where ccls is configured for the specific project.

And lastly comes the ccls configuration, hooked so that it loads when c-mode, c++-mode or objc-mode start, also in a deferred fashion (already explained).

It’s important to understand how ccls works in order to integrate it in the workflow of a specific project, since it might need to be configured using Emacs’ per-directory local variables.

We are living in a post-Makefile world (almost), and proof of that is ccls, which, instead of a makefile, uses a compilation database: a record of the compile options used to build the files in a project. It’s commonly described in JSON, generated automatically by build systems such as meson or cmake, and later consumed by ninja or ccls to execute the compilation. Bear in mind that ccls uses a cache, which can eat a couple of gigabytes of disk.
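For example, both meson and cmake can produce that compilation database (a sketch; paths are illustrative):

$ meson setup build       # meson writes build/compile_commands.json automatically
$ cmake -S . -B build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON   # cmake has to be asked explicitly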

Now, let’s review the concrete details of using these features with WebKit. Let me assume that WebKit local repository is cloned in ~/WebKit.

As you may know, the cool way to compile WebKit is with Flatpak. Flatpak adds an indirection in the compilation process, since it’s done in an isolated environment, on top of the native system. As a consequence, ccls has to be the one inside the Flatpak environment. In ~/.local/bin/webkit-ccls:

#!/bin/sh
set -eu
cd $HOME/WebKit/
exec Tools/Scripts/webkit-flatpak -c ccls "$@"

Basically the script calls ccls inside Flatpak, where it is available in the SDK. And this is why we use ccls instead of clangd, since the latter is not provided.
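A quick sanity check for the wrapper script (assuming your ccls build supports the --version flag):

$ chmod +x ~/.local/bin/webkit-ccls
$ ~/.local/bin/webkit-ccls --version   # should print the ccls version shipped in the SDK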

By default ccls assumes the compilation database is in the project’s root directory, but in our case, it’s not, thus it is required to configure the database directory for our WebKit setup. For it, as we already said, a .dir-locals.el file is used.


((c-mode
  (indent-tabs-mode . nil)
  (c-basic-offset . 4))
 (c++-mode
  (indent-tabs-mode . nil)
  (c-basic-offset . 4))
 (java-mode
  (indent-tabs-mode . nil)
  (c-basic-offset . 4))
 (change-log-mode
  (indent-tabs-mode . nil))
 (nil
  (fill-column . 100)
  (ccls-executable . "/home/vjaquez/.local/bin/webkit-ccls")
  (ccls-initialization-options . (:compilationDatabaseDirectory "/app/webkit/WebKitBuild/Release"
                                  :cache (:directory ".ccls-cache")))
  (compile-command . "build-webkit --gtk --debug")))

As you can notice, ccls-executable is defined here, though it’s not a safe local variable. Also the ccls-initialization-options, which is a safe local variable. It is important to notice that the compilation database directory is a path inside Flatpak, and that the Release path is always used. I don’t understand why, but the Debug path didn’t work for me. This means that WebKit should be compiled as Release frequently, even if we only use the Debug type for coding (as you may see in my compile-command).
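So, from time to time, I refresh the Release build that feeds the ccls database (a sketch, using the standard WebKit build script):

$ cd ~/WebKit
$ Tools/Scripts/build-webkit --gtk --release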

Update: Now we can explain why it’s important to configure lsp-mode as deferred: to avoid connections to ccls before processing the .dir-locals.el file.

And that’s all. Now I have early programming errors detection, auto-completion, and so on. I hope you find these notes helpful.

Update: Sadly, because of the Flatpak indirection, symbol definition finding won’t work, because the file paths stored in the ccls cache are relative to Flatpak’s file system. For that I still rely on global and its Emacs mode.
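For reference, generating the indexes used by global is a one-liner (assuming GNU Global is installed):

$ cd ~/WebKit
$ gtags   # creates the GTAGS, GRTAGS and GPATH index files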

Review of Igalia Multimedia activities (2020/H1)

This blog post is a review of the various activities the Igalia Multimedia team was involved in during the first half of 2020.

Our previous reports are:

Just before a new virus turned into a pandemic, we could enjoy our traditional FOSDEM. There, our colleague Phil gave a talk about many of the topics covered in this report.

GstWPE

GstWPE’s wpesrc element produces a video texture representing a web page rendered off-screen by WPE.

We have worked on a new iteration of the GstWPE demo, focusing on one-to-many, web-augmented overlays, broadcasting with WebRTC and Janus.

Also, since the merge of the gstwpe plugin into gst-plugins-bad (the staging area for new elements), new users have come along, spotting rough areas and improving the element along the way.

Video Editing

GStreamer Editing Services (GES) is a library that simplifies the creation of multimedia editing applications. It is based on the GStreamer multimedia framework and is heavily used by the Pitivi video editor.

Implemented frame accuracy in the GStreamer Editing Services (GES)

As required by the industry, it is now possible to reference all times in frame numbers, providing a precise mapping between frame number and play time. Many issues were fixed in GStreamer to reach enough precision to make this work. Also, intensive regression tests were added.

Implemented time effects support in GES

Important refactoring inside GStreamer Editing Services has happened to allow cleanly and safely changing the playback speed of individual clips.

Implemented reverse playback in GES

Several issues have been fixed inside GStreamer core elements and base classes in order to support reverse playback. This allows us to implement reliable and frame accurate reverse playback for individual clips.

Implemented ImageSequence support in GStreamer and GES

Since OpenTimelineIO implemented ImageSequence support, many users in the community had said it was really needed. We reviewed and finished up the imagesequencesrc element, which had been awaiting review for years.

This feature is now also supported in the OpenTimelineIO GES adapter.

Optimized nested timelines preroll time by an order of magnitude

Caps negotiation, done while the pipeline transitions from paused to playing state, exercising the whole pipeline functionality, was the bottleneck for nested timelines, so pipelines were reworked to avoid useless negotiations. At the same time, other members of the GStreamer community have improved caps negotiation performance in general.

Last but not least, our colleague Thibault gave a talk in The Pipeline Conference about The Motion Picture Industry and Open Source Software: GStreamer as an Alternative, explaining how and why GStreamer could be leveraged in the motion picture industry to allow faster innovation, and solve issues by reusing all the multi-platform infrastructure the community has to offer.

WebKit multimedia

There has been a lot of work on WebKit multimedia, particularly for WebKitGTK and WPE ports which use GStreamer framework as backend.

WebKit Flatpak SDK

But first of all we would like to draw readers’ attention to the new WebKit Flatpak SDK. It was not a contribution only from the multimedia team, but rather a joint effort among different teams in Igalia.

Before the WebKit Flatpak SDK, JHBuild was used for setting up a WebKitGTK/WPE environment for testing and development. Its purpose is to provide a common set of well defined dependencies instead of relying on the ones available in the different Linux distributions, which might produce different outputs. Nonetheless, Flatpak offers a much more coherent environment for testing and development, isolated from the rest of the build host, approaching reproducible outputs.

Another great advantage of the WebKit Flatpak SDK, at least for the multimedia team, is the possibility of using gst-build to set up a custom GStreamer environment, with the latest master, for example.

Now, for the sake of brevity, let us sketch a non-exhaustive list of activities and achievements related to WebKit multimedia.

General multimedia

Media Source Extensions (MSE)

Encrypted Media Extension (EME)

One of the major results of this first half is the upstreaming of ThunderCDM, which is an implementation of a Content Decryption Module, providing Widevine decryption support. Recently, our colleague Xabier published a blog post in this regard.

It has also enabled client-side video rendering support, which ensures video frames remain protected in GPU memory so they can’t be reached by third parties. This is a requirement for DRM/EME.

WebRTC

GStreamer

Though we normally contribute to GStreamer through the activities listed above, there are other tasks not related to WebKit. Among these we can enumerate the following:

GStreamer VAAPI

  • Reviewed a lot of patches.
  • Support for media-driver (iHD), the new VAAPI driver for Intel, mostly for Gen9 onwards. This driver comes with a lot of features.
  • A new vaapioverlay element.
  • Deep code cleanups. Among these we would like to mention:
    • Added quirk mechanism for different backends.
    • Changed the base classes of most classes and buffer types to GstObject and GstMiniObject.
  • Enhanced caps negotiation given the current driver’s constraints.

Conclusions

The multimedia team in Igalia has kept working, along the first half of this strange year, in our three main areas: browsers (mainly on WebKitGTK and WPE), video editing and the GStreamer framework.

We worked adding and enhancing WebKitGTK and WPE multimedia features in order to offer a solid platform for media providers.

We have enhanced the Video Editing support in GStreamer.

And, along these tasks, we have contributed as much to the GStreamer framework, particularly in hardware accelerated decoding and encoding and VA-API.

New VA-API H.264 decoder in gst-plugins-bad

Recently, a new H.264 decoder, using VA-API, was merged in gst-plugins-bad.

Why another VA-based H.264 decoder if there is already gstreamer-vaapi?

As usual, a historical perspective may give some clues.

It started when Seungha Yang implemented the GStreamer decoders for Windows using DXVA2 and D3D11 APIs.

Perhaps we need to take one step back and explain what stateless decoders are.

Video decoders are magic and opaque boxes where we push encoded frames, and later we’ll pop full decoded frames in raw format. This is how OpenMAX and V4L2 decoders work, for example.

Internally we can imagine that those magic and opaque boxes have two main operations:

  • Codec state handling
  • Signal processing like Fourier-related transformations (such as DCT), entropy coding, etc. (DSP, in general)

The codec state handling basically extracts, from the stream, the frame’s parameters and its compressed data, so the DSP algorithms can decode the frames. Codec state handling can be done with generic CPUs, while DSP algorithms are massively improved through special-purpose processors.

These video decoders are known as stateful decoders, and usually they are distributed through binary and closed blobs.

Soon, silicon vendors realized they could offload the burden of state handling to third-party user-space libraries, releasing what is known as stateless decoders. With them, your code not only has to push frames into the opaque box, but it also has to handle the codec specifics to provide all the parameters and references for each frame. VAAPI and DXVA2 are examples of those stateless decoders.

Returning to Seungha’s implementation: in order to get their DXVA2/D3D11 decoders, they also needed a state handling library for each codec. And Seungha wrote that library!

Initially they wanted to reuse the state handling in gstreamer-vaapi, which works pretty well, but its internal library, from the GStreamer perspective, is over-engineered: it is impossible to rip out only the state handling without importing all its data types. Which is kind of sad.

Later, Nicolas Dufresne realized that this library could be reused by other GStreamer plugins, because more stateless decoders are now available, particularly V4L2 stateless, in which he is interested. Nicolas moved Seungha’s code into a library in gst-plugins-bad.

Currently, libgstcodecs provides state handling of H.264, H.265, VP8 and VP9.

Let’s return to our original question: Why another VA-based H.264 decoder if there is already one in gstreamer-vaapi?

The quick answer is «to pay my technical debt».

As we already mentioned, gstreamer-vaapi is big and over-engineered, though we have been simplifying the internal libraries; in particular, He Junyan has done a lot of work replacing the internal base class, GstVaapiObject, with GstObject or GstMiniObject. Also, this kind of project, where there’s a lot of untouched code, carries a lot of cargo-cult decisions.

So I took the libgstcodecs opportunity to write a simple, thin and lean H.264 decoder, using new VA API calls (vaExportSurfaceHandle(), for example) and learning from other implementations, such as FFmpeg and ChromeOS. This exercise allowed me to identify where the dusty spots in gstreamer-vaapi are and how they should be fixed (and we have been doing so since then!).

Also, this opportunity led me to learn a bit more about the H.264 specification, since I implemented the reference picture list handling, and fixed a small bug in Chromium.

Now, let me be crystal clear: GStreamer VA-API is not going anywhere. It is, right now, one of the most feature-complete implementations using VA-API, even with its integration issues, and we are working on them, particularly, Intel folks are working hard on a new AV1 decoder, enhancing encoders and adding new video post-processing features.

But this new vah264dec is an experimental VA-API decoder, which aims towards a tight integration with GStreamer, oriented to provide a good experience in most of the common use cases and to enhance the common libgstcodecs library shared with other stateless decoders, looking to avoid Intel-specific nuances.

These are the main characteristics and plans of this new decoder:

  • It uses, by default, a DRM connection to the VA display, avoiding the trouble of choosing between X11 and Wayland.
    • It uses the first DRM device found as the VA display.
    • In the future, users will be able to provide their custom VA display through the pipeline’s context.
  • It requires libva >= 1.6
  • No multiview/stereo profiles, nor interlaced streams, because libgstcodecs doesn’t handle them yet.
  • It is incompatible with gstreamer-vaapi: mixing elements might lead to problems.
  • Even if memory:VAMemory is exposed, it is not handled by any other element yet.
    • Users will get VASurfaces via mapping, as GstGL does with textures.
  • Caps templates are dynamically generated by querying VAAPI.
  • YV12 and I420 are added to the system memory caps because they seem to be supported by all the drivers when downloading frames onto main memory, and they are used by xvimagesink and others, avoiding color conversion.
  • Decoding surfaces aren’t bound to the context, so they can grow beyond the DPB size, allowing smooth reverse playback.
  • There is no error handling and recovery yet.
  • The element is supposed to spawn if different renderD nodes with VA-API driver support are found (like gstv4l2 does), but this hasn’t been tested yet.

Now you may be asking: how do I use vah264dec?

Currently vah264dec has NONE rank, which means that it will never be autoplugged, but you can use the trick of the environment variable GST_PLUGIN_FEATURE_RANK:

$ GST_PLUGIN_FEATURE_RANK=vah264dec:259 gst-play-1.0 ~/video.mp4

And that’s it!

Thanks.

WebKit Flatpak SDK and gst-build

This post is an annex of Phil’s Introducing the WebKit Flatpak SDK. Please make sure to read it, if you haven’t already.

Recapitulating: nowadays WebKitGtk/WPE developers, and their CI infrastructure, are moving towards a Flatpak-based environment for their workflow. This Flatpak-based environment, or Flatpak SDK for short, can be visualized as a sandboxed software container, which bundles all the dependencies required to compile, run and debug WebKitGtk/WPE.

In day-to-day work, this approach avoids the potential compilation of the world in order to obtain reproducible builds, improving the development and testing workflow.

But what if you are also involved in the development of one dependency?

This is the case for Igalia’s multimedia team, where, besides developing the multimedia features for WebKitGtk and WPE, we also participate in the development of GStreamer, the framework used for multimedia.

Because of this, in our workflow we usually need to build WebKit with a fix, hack or new feature in GStreamer. Is it possible to add our custom GStreamer build in Flatpak without messing up its own GStreamer setup? Yes, it’s possible.

gst-build is a set of Python scripts which clone the GStreamer repositories, compile them and set up an uninstalled environment. This uninstalled environment allows a transient usage of the compiled framework from its build tree, avoiding installation and further messing up our system.

The WebKit scripts that wrap Flatpak operations are also capable of handling the scripts of gst-build to build GStreamer inside the container, and, when running WebKit’s artifacts, the scripts enable the mentioned uninstalled environment, overriding Flatpak’s GStreamer.

How do we unveil all this magic?

First of all, set up a gst-build installation as it is documented. This installation is where the GStreamer plumbing is done.

Later, gst-build operations through WebKit compilation scripts are enabled when the environment variable GST_BUILD_PATH is exported. This variable should point to the directory where the gst-build tree is placed.

And that’s all!

But let’s put these words in actual commands. The following workflow assumes that WebKit repository is cloned in ~/WebKit and the gst-build tree is in ~/gst-build (please, excuse my bashisms).

Compiling WebKitGtk with symbols, using LLVM as toolchain (this command will also compile GStreamer):

$ cd ~/WebKit
$ CC=clang CXX=clang++ GST_BUILD_PATH=/home/vjaquez/gst-build Tools/Scripts/build-webkit --gtk --debug
...

Running the generated minibrowser (remember, GST_BUILD_PATH is required again for correct linking):

$ GST_BUILD_PATH=/home/vjaquez/gst-build Tools/Scripts/run-minibrowser --gtk --debug
...

Running media layout tests:

$ GST_BUILD_PATH=/home/vjaquez/gst-build ./Tools/Scripts/run-webkit-tests --gtk --debug media

But wait! There’s more...

What if I want to parametrize the GStreamer compilation? Say, I would like to enable a GStreamer module or disable the build of a specific element.

gst-build, as the rest of the GStreamer modules, uses the meson build system, so it’s possible to pass arguments to meson through the environment variable GST_BUILD_ARGS.

For example, I would like to enable gstreamer-vaapi 😇

$ cd ~/WebKit
$ CC=clang CXX=clang++ GST_BUILD_PATH=/home/vjaquez/gst-build GST_BUILD_ARGS="-Dvaapi=enabled" Tools/Scripts/build-webkit --gtk --debug
...

Review of the Igalia Multimedia team Activities (2019/H2)

This blog post is a review of the various activities the Igalia Multimedia team was involved in during the second half of 2019.

Here are the previous 2018/H2 and 2019/H1 reports.

GstWPE

Succinctly, GstWPE is a GStreamer plugin which allows rendering web pages as a video stream whose frames are GL textures.

Phil, its main author, wrote a blog post explaining in detail what GstWPE is and its possible use-cases. He wrote a demo too, which grabs and previews a live stream from a webcam session and blends it with an overlay from wpesrc, which displays HTML content. This composited live stream can be broadcast through YouTube or Twitch.

These concepts are better explained by Phil himself in the following lightning talk, presented at the last GStreamer Conference in Lyon:

Video Editing

After implementing a deep integration of the GStreamer Editing Services (a.k.a GES) into Pixar’s OpenTimelineIO during the first half of 2019, we decided to implement an important missing feature for the professional video editing industry: nested timelines.

Toward that goal, Thibault worked with the GSoC student Swayamjeet Swain to implement a flexible API to support nested timelines in GES. This means that users of GES can now decouple each scene into different projects when editing long videos. This work is going to be released in the upcoming GStreamer 1.18 version.

Henry Wilkes also implemented the support for nested timelines in OpenTimelineIO, making the GES integration one of the most advanced ones, as you can see in this table:

[Table: OpenTimelineIO adapter feature matrix, comparing the EDL, FCP7 XML, FCP X, AAF, RV, ALE and GES adapters across features: single track of clips, multiple video tracks, audio tracks & clips, gap/filler, markers, nesting, transitions, audio/video effects, linear speed effects, fancy speed effects and color decision list.]

Along these lines, Thibault delivered a 15 minutes talk, also in the GStreamer Conference 2019:

After detecting a few regressions and issues in GStreamer related to frame accuracy, we decided to make sure that we can seek in a perfectly frame accurate way using GStreamer and the GStreamer Editing Services. In order to ensure that, an extensive integration test suite has been developed, mostly targeting the most important container formats and codecs (namely MXF, QuickTime, H.264, H.265, ProRes and JPEG), and issues have been fixed in different places. On top of that, new APIs are being added to GES to allow expressing times in frame numbers instead of nanoseconds. This work is still ongoing but should be merged in time for GStreamer 1.18.

GStreamer Validate Flow

GstValidate has been turning into one of the most important GStreamer testing tools to check that elements behave as they are supposed to in the framework.

Along with our MSE work, we found that another way to specify tests, based on the buffers and events produced through specific pads, was needed. Thus, Alicia developed a new plugin for GstValidate: Validate Flow.

Alicia gave an informative 30 minutes talk about GstValidate and the new plugin in the last GStreamer Conference too:

GStreamer VAAPI

Most of the work during the second half of 2019 was maintenance tasks and code reviews.

We worked mainly on memory restrictions per backend driver, and we reviewed a big refactor: internal encoders now use GstObject instead of the custom GstVaapiObject. We also reviewed patches for new features such as video rotation and cropping in vaapipostproc.

Servo multimedia

Last year we worked on integrating media playback in Servo. We finally delivered hardware accelerated video playback on Linux and Android. We worked also on the Windows and Mac ports, but they were not finished. Naturally, most of the work was in the servo/media crate, pushing code and reviewing contributions. The major tasks were to rewrite the media player example and the internal source element, looking to handle playbin‘s download flag properly.

We also added WebGL integration support with <video> elements, thus webpages can use video frames as WebGL textures.

Finally we explored how to isolate the multimedia processing in a dedicated thread or process, but that task remains pending.

WebKit Media Source Extension

We did a lot of downstream and upstream bug fixing and patch review, both in WebKit and GStreamer, for our MSE GStreamer-based backend.

Along this line, we improved WebKitMediaSource to use playbin3, and compatibility with older GStreamer versions was also added.

WebKit WebRTC

Most of the work in this area was maintenance and fixing regressions uncovered by the layout tests. Besides, the support for the Raspberry Pi was improved by handling encoded streams from v4l2 video sources, with some explorations with the Minnowboard on top of that.

Conferences

GStreamer Conference

Igalia was a Gold sponsor of this last GStreamer Conference, held in Lyon, France.

The whole team attended, and five talks were delivered. Thibault, besides the video editing talk we already referred to, presented another two: one about the GstTranscoder API and the other about the new documentation infrastructure based on Hotdoc:

We also had a productive hackfest, after the conference, where we worked on AV1 Rust decoder, HLS Rust demuxer, hardware decoder flag in playbin, and other stuff.

Linaro Connect

Phil attended the Linaro Connect conference in San Diego, USA. He delivered a talk about WPE/Multimedia which you can enjoy here:

Demuxed

Charlie attended Demuxed, in San Francisco. The conference is heavily focused on streaming and codec engineering and validation. Sadly there is not much interest in GStreamer there, as the main focus is on FFmpeg.

RustFest

Phil and I attended the last RustFest in Barcelona. Basically we went to meet the Rust community, and we attended the “WebRTC with GStreamer-rs” workshop presented by Sebastian Dröge.

GStreamer-VAAPI 1.16 and libva 2.6 in Debian

Debian has migrated libva 2.6 into testing. This release includes a pull request that changes how the drivers are selected to be loaded and used. As the pull request mentions:

libva will try to load iHD firstly, if it failed. then it will load i965.

Also, Debian testing has imported that iHD driver in two flavors: intel-media-driver and intel-media-driver-non-free. So basically the iHD driver is now the main VAAPI driver for Intel platforms, though it only supports the newer chips; the old ones still require i965-va-driver.

Sadly, for the current GStreamer-VAAPI stable release, the iHD driver is not included in its driver white list. And this poses a problem for users who have installed either of the intel-media-driver packages, because, by default, such a driver is ignored and the VAAPI GStreamer elements won’t be registered.
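To check which driver libva loads and whether the VAAPI elements got registered, these commands can help (assuming vainfo and GStreamer’s command-line tools are installed):

$ vainfo | grep -i driver    # shows the VA-API driver in use
$ gst-inspect-1.0 vaapi      # lists the registered GStreamer-VAAPI elements, if any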

There are three temporary workarounds (mutually exclusive) for those users (updated):

  1. Uninstall intel-media-driver* and install (or keep) the old i965-va-driver-shaders/i965-va-driver.
  2. Export, by default in your session, LIBVA_DRIVER_NAME=i965. Normally this is done by adding the variable export in your $HOME/.profile file (see the snippet after this list). This environment variable will force libva to load the i965 driver.
  3. And finally, export, by default in your sessions, GST_VAAPI_ALL_DRIVERS=1. This is not advised, since many applications, such as Epiphany, might fail.
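For reference, workarounds 2 and 3 boil down to these exports, e.g. appended to your $HOME/.profile (a sketch; pick only one of them):

export LIBVA_DRIVER_NAME=i965    # workaround 2: force the i965 driver
export GST_VAAPI_ALL_DRIVERS=1   # workaround 3: ignore the white list (not advised)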

We prefer to not include iHD in the stable white list because most of the work done for that driver has occurred after release 1.16.

In the case of the GStreamer-VAAPI master branch (actively in development), we have merged iHD into the white list, since the Intel team has been working a lot to make it work. It will be released with GStreamer version 1.18.

Generating a GStreamer-1.14 bundle for TravisCI with Ubuntu/Trusty

For having continuous integration in your multimedia project hosted on GitHub with TravisCI, you may want to compile and run tests with a recent version of GStreamer. Nonetheless, TravisCI mainly offers Ubuntu Trusty as one of the possible distributions to deploy in their CI, and that distribution packages GStreamer 1.2, which might be a bit old for your project’s requirements.

A solution for this problem is to provide TravisCI your own GStreamer bundle with the version you want to compile and test in your project, in this case 1.14. This post is the recipe I followed to generate that GStreamer bundle with GstGL support.

There are three main issues:

  1. The packaged libglib version is too old, so we hope we will not find an ABI breakage while running the CI.
  2. The packaged ffmpeg version is too old
  3. As we want to compile GStreamer using gst-build, we need a recent version of meson, which requires python3.5, not available in Trusty.

schroot

Old habits die hard, and I have used schroot for handling chroot environments without complaints; it handles the bind mounting of /proc, /sys and all that repetitive stuff that seals the isolation of the chrooted environment.

The debootstrap variant I use is buildd, because it installs the build-essential package.

$ sudo mkdir /srv/chroot/gst-trusty64
$ sudo debootstrap --arch=amd64 --variant=buildd trusty ./gst-trusty64/ http://archive.ubuntu.com/ubuntu
$ sudo vim /etc/schroot/chroot.d/gst

This is the schroot configuration I will use. Please adapt it to your needs.

[gst]
description=Ubuntu Trusty 64-bit for GStreamer
directory=/srv/chroot/gst-trusty64
type=directory
users=vjaquez
root-users=vjaquez
profile=default
setup.fstab=default/vjaquez-home.fstab

I am overriding the default fstab file with a custom one where the home directory of the vjaquez user points to a clean directory.

$ mkdir -p ~/home-chroot/gst
$ sudo vim /etc/schroot/default/vjaquez-home.fstab
# fstab: static file system information for chroots.
# Note that the mount point will be prefixed by the chroot path
# (CHROOT_PATH)
#
# <file system>  <mount point>  <type>  <options>       <dump>  <pass>
/proc           /proc           none    rw,bind         0       0
/sys            /sys            none    rw,bind         0       0
/dev            /dev            none    rw,bind         0       0
/dev/pts        /dev/pts        none    rw,bind         0       0
/home           /home           none    rw,bind         0       0
/home/vjaquez/home-chroot/gst   /home/vjaquez   none    rw,bind 0       0
/tmp            /tmp            none    rw,bind         0       0

configure chroot environment

We will get into the chroot environment as the super user in order to add the required packages. For that purpose we add the universe repository to apt.

  • libglib requires: autotools-dev gnome-pkg-tools libtool libffi-dev libelf-dev libpcre3-dev desktop-file-utils libselinux1-dev libgamin-dev dbus dbus-x11 shared-mime-info libxml2-utils
  • Python requires: libssl-dev libreadline-dev libsqlite3-dev
  • GStreamer requires: bison flex yasm python3-pip libasound2-dev libbz2-dev libcap-dev libdrm-dev libegl1-mesa-dev libfaad-dev libgl1-mesa-dev libgles2-mesa-dev libgmp-dev libgsl0-dev libjpeg-dev libmms-dev libmpg123-dev libogg-dev libopus-dev liborc-0.4-dev libpango1.0-dev libpng-dev libpulse-dev librtmp-dev libtheora-dev libtwolame-dev libvorbis-dev libvpx-dev libwebp-dev pkg-config unzip zlib1g-dev
  • And for general setup: language-pack-en ccache git curl
$ schroot --user root --chroot gst
(gst)# sed -i "s/main$/main universe/g" /etc/apt/sources.list
(gst)# apt update
(gst)# apt upgrade
(gst)# apt --no-install-recommends --no-install-suggests install \
autotools-dev gnome-pkg-tools libtool libffi-dev libelf-dev \
libpcre3-dev desktop-file-utils libselinux1-dev libgamin-dev dbus \
dbus-x11 shared-mime-info libxml2-utils \
libssl-dev libreadline-dev libsqlite3-dev \
language-pack-en ccache git curl bison flex yasm python3-pip \
libasound2-dev libbz2-dev libcap-dev libdrm-dev libegl1-mesa-dev \
libfaad-dev libgl1-mesa-dev libgles2-mesa-dev libgmp-dev libgsl0-dev \
libjpeg-dev libmms-dev libmpg123-dev libogg-dev libopus-dev \
liborc-0.4-dev libpango1.0-dev libpng-dev libpulse-dev librtmp-dev \
libtheora-dev libtwolame-dev libvorbis-dev libvpx-dev libwebp-dev \
pkg-config unzip zlib1g-dev

Finally we create our installation prefix, in this case /opt/gst, to avoid contaminating /usr/local, and log out as root.

(gst)# mkdir -p /opt/gst
(gst)# chown vjaquez /opt/gst
(gst)# exit

compile ffmpeg 3.2

Now, let’s log in again, but as the unprivileged user, to build the bundle, starting with ffmpeg. Notice that we are using ccache and building out of source.

$ schroot --chroot gst
(gst)$ git clone https://git.ffmpeg.org/ffmpeg.git ffmpeg
(gst)$ cd ffmpeg
(gst)$ git checkout -b work n3.2.12
(gst)$ mkdir build
(gst)$ cd build
(gst)$ ../configure --disable-static --enable-shared \
--disable-programs --enable-pic --disable-doc --prefix=/opt/gst 
(gst)$ PATH=/usr/lib/ccache/:${PATH} make -j8 install

compile glib 2.48

(gst)$ cd ~
(gst)$ git clone https://gitlab.gnome.org/GNOME/glib.git
(gst)$ cd glib
(gst)$ git checkout -b work origin/glib-2-48
(gst)$ mkdir mybuild
(gst)$ cd mybuild
(gst)$ ../autogen.sh --prefix=/opt/gst
(gst)$ PATH=/usr/lib/ccache/:${PATH} make -j8 install

install Python 3.5

Pyenv is a project that allows the automation of installing and executing, in the user home directory, multiple versions of Python.

(gst)$ curl -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer | bash
(gst)$ ~/.pyenv/bin/pyenv install 3.5.0

Install meson 0.50

We will install the latest available version of meson in the user home directory; that is why PATH is extended and exported.

(gst)$ cd ~
(gst)$ ~/.pyenv/versions/3.5.0/bin/pip3 install --user meson
(gst)$ export PATH=${HOME}/.local/bin:${PATH}

build GStreamer 1.14

PKG_CONFIG_PATH is exported to expose the compiled versions of ffmpeg and glib. Notice that --libdir=lib is passed, so the libraries are installed in /opt/gst/lib, in order to avoid the dispersion of pkg-config files.

(gst)$ cd ~/
(gst)$ export PKG_CONFIG_PATH=/opt/gst/lib/pkgconfig/
(gst)$ git clone https://gitlab.freedesktop.org/gstreamer/gst-build.git
(gst)$ cd gst-build
(gst)$ git checkout -b work origin/1.14
(gst)$ meson -Denable_python=false \
-Ddisable_gst_libav=false -Ddisable_gst_plugins_ugly=true \
-Ddisable_gst_plugins_bad=false -Ddisable_gst_devtools=true \
-Ddisable_gst_editing_services=true -Ddisable_rtsp_server=true \
-Ddisable_gst_omx=true -Ddisable_gstreamer_vaapi=true \
-Ddisable_gstreamer_sharp=true -Ddisable_introspection=true \
--prefix=/opt/gst build --libdir=lib
(gst)$ ninja -C build install

test!

(gst)$ cd ~/
(gst)$ LD_LIBRARY_PATH=/opt/gst/lib \
GST_PLUGIN_SYSTEM_PATH=/opt/gst/lib/gstreamer-1.0/ \
/opt/gst/bin/gst-inspect-1.0

And the list of available elements shall be shown.

archive the bundle

(gst)$ cd ~/
(gst)$ tar zpcvf gstreamer-1.14-x86_64-linux-gnu.tar.gz -C /opt ./gst

update your .travis.yml

These are the packages you shall add to run this generated GStreamer bundle:

  • libasound2-plugins
  • libfaad2
  • libfftw3-single3
  • libjack-jackd2-0
  • libmms0
  • libmpg123-0
  • libopus0
  • liborc-0.4-0
  • libpulsedsp
  • libsamplerate0
  • libspeexdsp1
  • libtdb1
  • libtheora0
  • libtwolame0
  • libwayland-egl1-mesa
  • libwebp5
  • libwebrtc-audio-processing-0
  • liborc-0.4-dev
  • pulseaudio
  • pulseaudio-utils

And these are the before_install and before_script targets:

      before_install:
        - curl -L http://server.example/gstreamer-1.14-x86_64-linux-gnu.tar.gz | tar xz
        - sed -i "s;prefix=/opt/gst;prefix=$PWD/gst;g" $PWD/gst/lib/pkgconfig/*.pc
        - export PKG_CONFIG_PATH=$PWD/gst/lib/pkgconfig
        - export GST_PLUGIN_SYSTEM_PATH=$PWD/gst/lib/gstreamer-1.0
        - export GST_PLUGIN_SCANNER=$PWD/gst/libexec/gstreamer-1.0/gst-plugin-scanner
        - export PATH=$PATH:$PWD/gst/bin
        - export LD_LIBRARY_PATH=$PWD/gst/lib:$LD_LIBRARY_PATH

      before_script:
        - pulseaudio --start
        - gst-inspect-1.0 | grep Total

Review of Igalia’s Multimedia Activities (2018/H2)

This is the first semi-yearly report about Igalia’s activities around multimedia, covering the second half of 2018.

A great part of this report was exposed in Phil’s talk surveying multimedia development in WebKitGTK and WPE:

WebKit Media Source Extensions (MSE)

MSE is a specification that allows JS to generate media streams for playback in Web browsers that support HTML 5 video and audio.

Last semester we upstreamed support for the WebM format in WebKitGTK, with the related patches in GStreamer, particularly in the qtdemux and matroskademux elements.

WebKit Encrypted Media Extensions (EME)

EME is a specification for enabling playback of encrypted content in Web browsers that support HTML 5 video.

In a downstream project for WPE WebKit we managed to have almost full test coverage in the YoutubeTV 2018 test suite.

We merged our contributions upstream, in WebKit and GStreamer, most of what is legal to publish; for example, making demuxers aware of encrypted content and making them send protection events with the initialization data and the encrypted caps, in order to select the decryption key later.

We started to coordinate the upstreaming process of a new implementation of the CDM (Content Decryption Module) abstraction, and there will be even more changes in that abstraction.

Lightning talk about the EME implementation in WPE/WebKitGTK at the GStreamer Conference 2018.

WebKit WebRTC

WebRTC consists of several interrelated APIs and real time protocols to enable Web applications and sites to capture audio, or A/V streams, and exchange them between browsers without requiring an intermediary.

We added GStreamer interfaces to LibWebRTC, to use it for the network part, while using GStreamer for the media capture and processing. All that was upstreamed in 2018 H2.

Thibault described thoroughly the tasks done for this achievement.

Talk about WebRTC implementation in WPE/WebKitGTK in WebEngines hackfest 2018.

Servo/media

Servo is a browser engine written in Rust designed for high parallelization and high GPU usage.

We added basic support for <video> and <audio> media elements in Servo. Later on, we added the GstreamerGL bindings for Rust in gstreamer-rs to render GL textures from the GStreamer pipeline in Servo.

Lightning talk at the GStreamer Conference 2018.

GstWPE

Taking an idea from the GStreamer Conference, we developed a GStreamer source element that wraps WPE. With this source element, it is possible to blend a web page and video in a single video stream; that is, the output of a Web browser (say, a rendered web page) is used as a video source of a GStreamer pipeline: GstWPE. The element is already merged in the gst-plugins-bad repository.

Talk about GstWPE in FOSDEM 2019

Demo #1

Demo #2

GStreamer VA-API and gst-MSDK

Last, but not least, we continued helping with the maintenance of GStreamer-VAAPI and gst-msdk, with code reviews and the ongoing migration of the internal library to GObject.

Other activities

The second half of 2018 was also intense in terms of conferences and hackfests for the team:


Thanks for bearing with us through this blog post, and for keeping our work under your radar.